Database Index Fundamentals and Mathematical Theory
The Nature of Indexes
The official MySQL definition describes an index as a data structure that enables efficient data retrieval. In essence, an index is simply a carefully organized data structure.
Database querying represents one of the most critical operations in any database system. The goal is to make data retrieval as fast as possible, which has led database designers to explore various lookup algorithms. While linear search with O(n) complexity works, it performs poorly with large datasets. Computer science offers superior alternatives like binary search and binary tree search, each requiring specific data organization.
However, these algorithms impose strict requirements on data structure. Binary search demands sorted data, while binary tree search works only on binary search trees. Since real-world data cannot perfectly conform to these constraints, database systems maintain additional data structures that reference the actual data, enabling advanced lookup algorithms.
Consider a practical example: a table with two columns and seven rows. To optimize lookups on the second column, a binary search tree can be maintained where each node contains an index key value and a pointer to the physical data record location. This structure enables O(log₂n) complexity for retrieval operations.
Interestingly, production database systems rarely use binary search trees or their evolved forms like red-black trees. The subsequent sections explain why.
B-Tree and B+Tree Structures
Modern database systems and file systems predominantly use B-Tree or its variant B+Tree for index implementation. The following analysis explains this design choice from both data structure and computer architecture perspectives.
B-Tree Definition
A B-Tree organizes data as key-value pairs where each record contains a unique key and associated data. A B-Tree with degree d (d > 1) satisfies these conditions:
- Each internal node contains n-1 keys and n pointers, where d ≤ n ≤ 2d
- Leaf nodes contain atleast one key and two pointers, with a maximum of 2d-1 keys and 2d pointers
- All leaf nodes share the same depth, equal to tree height h
- Keys and pointers alternate, with pointers at both ends of each node
- Keys within a node are stored in non-decreasing order
- Pointers either reference other nodes or remain null
The leftmost pointer references keys smaller than the first key, while the rightmost pointer references keys greater than the last key. Intermediate pointers reference keys between adjacent key values.
B-Tree Search Algorithm
Search operations in B-Trees follow a straightforward pattern: begin at the root, perform binary search on keys, and recursively traverse the appropriate subtree until reaching the target node or encountering a null pointer.
function btreeSearch(node, searchKey):
if node is null:
return null
for i from 0 to node.keyCount:
if node.keys[i] equals searchKey:
return node.values[i]
if node.keys[i] greater than searchKey:
return btreeSearch(node.pointers[i].child)
return btreeSearch(node.pointers[node.keyCount].child)
result = btreeSearch(rootNode, targetKey)
B-Trees demonstrate remarkable efficiency: with a degree of d indexing N keys, the maximum tree height equals log_d((N+1)/2), and search complexity reaches O(log_dN). This mathematical property makes B-Trees highly efficient for indexing purposes.
B+Tree Characteristics
B+Tree represents the most common B-Tree variant, serving as the foundation for MySQL's index implementation. Key differences from B-Tree include:
- Internal node pointer capacity reaches 2d instead of 2d+1
- Internal nodes store only keys without data
- Leaf nodes contain all data and optional forward pointers for sequential access
This design maximizes the fan-out ratio for internal nodes while enabling efficient range queries through linked leaf nodes.
Optimized B+Tree with Sequential Access
Database systems typically extend B+Tree with sequential access pointers connecting adjacent leaf nodes. This optimization dramatically improves range query performance by enabling forward traversal without returning to parent nodes.
Theoretical Foundation: Why B+Tree Dominates Index Implementation
While red-black trees and other balanced tree structures could theoretically serve as indexes, production systems overwhelmingly favor B+Tree. This preference stems from fundamental differences in disk I/O characteristics.
Memory versus Disk Access
Random access memory (RAM) provides fast, essentially latency-equal access regardless of data location. However, indexes typically exceed memory capacity and must persist on disk.
Disk access involves mechanical operations including:
- Seek time: moving the read-write head to the correct track
- Rotational latency: waiting for the target sector to rotate under the head
- Transfer time: actually reading or writing the data
These mechanical operations create access times measured in milliseconds, compared to nanoseconds for memory access—a difference of several orders of magnitude.
Locality Principle and Read-Ahead
Modern storage systems leverage the locality principle: when data at a particular location is accessed, nearby data will likely be accessed soon. This enables effective prefetching strategies.
Storage systems read data in page-sized blocks (commonly 4KB in operating systems), meaning a single disk I/O retrieves a complete page regardless of whether only one byte is needed.
B+Tree Performance Analysis
B+Tree achieves optimal performance by matching the tree node size to disk page size, ensuring each node requires exactly one disk I/O to load. With degree d typically exceeding 100 in practice, tree height rarely exceeds 3-4 levels, meaning any lookup requires only 3-4 disk I/O operations.
In contrast, balanced trees like red-black trees suffer from poor locality. Physically nearby nodes in the tree structure may be scattered across different disk locations, eliminating the benefits of prefetching and caching.
The fan-out ratio in B+Tree directly impacts performance:
maxFanout = floor(pageSize / (keySize + pointerSize))
By eliminating data storage from internal nodes, B+Tree maximizes fan-out, allowing each internal node to reference more child nodes and reducing overall tree height.
MySQL Index Implementation Across Storage Engines
MySQL supports multiple storage engines, each implementing indexes differently. The two primary engines, MyISAM and InnoDB, demonstrate distinct approaches.
MyISAM Index Architecture
MyISAM implements indexes using B+Tree structures where leaf nodes store record addresses rather than actual data. The index file maintains only pointers to data locations in the separate data file.
Primary and secondary indexes in MyISAM share identical structural characteristics—the distinction lies only in uniqueness constraints. Primary indexes require unique keys, while secondary indexes may contain duplicate values.
The lookup process follows two steps: first, search the index B+Tree to obtain the record address; second, use that address to fetch the actual data record.
This approach earns the designation "non-clustered" index because the physical data arrangement does not correspond to index ordering.
InnoDB Index Architecture
InnoDB fundamentally differs by making the data file itself an index structure. The primary key B+Tree leaf nodes contain complete data records rather than pointers.
This design creates a "clustered" index where data physically organizes according to primary key sequence. Consequently, InnoDB mandates primary key existence—if no explicit primary key is defined, MySQL automatically generates a hidden six-byte long integer primary key.
Secondary indexes in InnoDB differ critically: they store primary key values in their leaf nodes rather than data file addresses. This design means secondary index lookups require two phases: first retrieving the primary key from the secondary index, then using that primary key to access the complete record through the primary index.
This architectural choice has significant implications:
- Longer primary keys inflate all secondary index size
- Non-sequential primary keys cause frequent page splits during inserts
- Auto-incrementing primary keys provide optimal insertion performance
Index Usage Strategies and Optimization
Understanding index behavior enables strategic optimization decisions.
Composite Index and Leftmost Prefix Principle
MySQL supports composite indexes spanning multiple columns, stored as ordered tuples. The leftmost prefix principle governs which queries can utilize these indexes effectively.
Consider a table with composite index (emp_no, title, from_date). Query patterns demonstrate varying index utilization:
Scenario One: Full Column Match
EXPLAIN SELECT * FROM employees.titles
WHERE emp_no='10001'
AND title='Senior Engineer'
AND from_date='1986-06-26';
This query uses all three index columns with exact matches, achieving optimal key_len of 59 bytes.
Scenario Two: Leftmost Prefix Match
EXPLAIN SELECT * FROM employees.titles
WHERE emp_no='10001';
This query uses only the first index column, demonstrating key_len of 4 bytes. The query satisfies the leftmost prefix requirement but cannot utilize subsequent columns.
Scenario Three: Missing Intermediate Column
EXPLAIN SELECT * FROM employees.titles
WHERE emp_no='10001'
AND from_date='1986-06-26';
While this query appears to use the index partially, the missing title column breaks the prefix continuity, forcing additional filtering. The key_len indicates only the emp_no portion is utilized efficiently.
Workarounds include creating a covering index (emp_no, from_date) or using IN clauses to enumerate possible title values when the set remains small.
Scenario Four: Non-Prefix Column
EXPLAIN SELECT * FROM employees.titles
WHERE from_date='1986-06-26';
This query cannot use the index because it violates the leftmost prefix principle—no index column other than the first appears in the WHERE clause.
Scenario Five: Prefix String Matching
EXPLAIN SELECT * FROM employees.titles
WHERE emp_no='10001'
AND title LIKE 'Senior%';
Leading wildcards (%Senior) prevent index usage, while trailing wildcards (Senior%) enable range scans using the index.
Scenario Six: Range Queries
EXPLAIN SELECT * FROM employees.titles
WHERE emp_no < '10010'
AND title='Senior Engineer';
Range conditions on the leftmost column permit index usage, but subsequent columns cannot leverage the index. MySQL restricts index usage to a single range column per query.
Scenario Seven: Functions and Expressions
EXPLAIN SELECT * FROM employees.titles
WHERE emp_no='10001'
AND LEFT(title, 6)='Senior';
Applying functions to indexed columns prevents index utilization entirely, even when mathematically equivalent conditions exist. The query optimizer cannot transform expressions to use indexes.
Index Selectivity and Prefix Indexes
Index selectivity measures the ratio of distinct index values to total table rows, ranginng from 0 to 1. Higher selectivity indicates more valuable indexes.
SELECT COUNT(DISTINCT column_name)) / COUNT(*) AS selectivity
FROM table_name;
Very low selectivity (below 0.01) typically indicates poor index candidates. Prefix indexing offers a solution by using only initial column characters:
ALTER TABLE employees.employees
ADD INDEX idx_name_prefix (first_name, last_name(4));
This approach balances selectivity against storage efficiency, trading some selectivity for reduced index size. The technique proves particularly valuable for long string columns where full-length indexing would consume excessive space.
Prefix indexes carry limitations: they cannot support ORDER BY or GROUP BY operations efficiently, and cannot serve as covering indexes since the complete column value remains unavailable.
Primary Key Selection and Insert Performance
InnoDB storage fundamentally benefits from sequential primary keys. When records insert with monotonically increasing values, new rows append to existing pages until reaching the fill factor threshold (15/16 by default), then create new pages. This behavior maintains compact, contiguous storage.
Conversely, random primary keys (such as UUIDs or natural keys from distributed data) force insertions into arbitrary positions within existing pages, triggering page splits, increased I/O operations, and storage fragmentation. Periodic OPTIMIZE TABLE becomes necessary to reclaim efficiency.
Best practice dictates using auto-increment integer primary keys for InnoDB tables whenever possible.