Database Indexing and Query Execution Plans

Purpose of Indexes

Indexes serve a similar function to a book's table of contents, designed to enhance query performance.

Types of Index Algorithms

B-tree indexes
Hash indexes
R-tree indexes
Full-text indexes
GIS indexes

B-tree Index Classification

How Secondary Indexes Build B-tree Structure

-- Process steps:
-- 1. Extract column values from the table to generate B-tree structure
-- 2. Sort all values automatically
-- 3. Distribute sorted values evenly to leaf nodes (16K)
-- 4. Generate pointers to corresponding data pages
-- 5. Create branch and root nodes based on data volume and key length

-- Example table structure
id  name  age  gender
SELECT * FROM t1 WHERE id=10;

-- Note: ID column queries use sequential I/O, while other columns may use random I/O

Clustered Indexes

Prerequisites

-- 1. Primary key column automatically becomes clustered index
-- 2. If no primary key, unique key becomes clustered index
-- 3. Clustered index should be defined during table creation, typically using irrelevant column (ID)

How Secondary Indexes Build B-tree Structure

-- 1. Set primary key column (ID) during table creation
-- 2. Data will be stored on disk ordered by ID column (clustered index organized table)
-- 3. Sorted complete rows become leaf nodes (data pages = leaf nodes)

Differences Between Clustered and Secondary Indexes

Any column can have secondary indexes when needed, as long as names differ
Only one clustered index per table, usually the primary key
Secondary index leaf nodes store only sorted index column values + clustered index column values
Clustered index leaf nodes store complete sorted row data
MySQL stores table data as clustered index organized tables

Secondary Index Subtypes

Regular single-column secondary index
Composite indexes: multiple columns as index conditions, theoretically well-designed to reduce table lookups
Unique indexes: all index column values are unique

Factors Affecting Index Tree Height

-- 1. Data volume - Solution: partitioning, sharding, distributed databases
-- 2. Long index column values - Solution: prefix indexes
-- 3. Data types:
-- Variable-length strings using CHAR - Solution: use VARCHAR for variable strings
-- ENUM type usage: enum('Shandong','Hebei','Heilongjiang','Jilin','Liaoning','Shaanxi'...)

Basic Index Management

Querying Indexes

DESC city;
-- OR
SHOW INDEX FROM city\G;

Single-column Regular Secondary Index

ALTER TABLE city ADD INDEX idx_name(name);

Multi-column Composite Index

ALTER TABLE city ADD INDEX idx_c_p(countrycode,population);

Unique Index

ALTER TABLE city ADD UNIQUE INDEX uidx_dis(district);

Prefix Index

ALTER TABLE city ADD INDEX idx_dis(district(5));

Dropping Indexes

ALTER TABLE city DROP INDEX idx_name;
ALTER TABLE city DROP INDEX idx_c_p;
ALTER TABLE city DROP INDEX idx_dis;

Index Optimization Testing

Before Optimization - 755s

mysqlslap --defaults-file=/etc/my.cnf \
--concurrency=100 --iterations=1 --create-schema='test' \
--query="select * from test.t100w where k2='MN89'" engine=innodb \
--number-of-queries=2000 -uroot -p123 -verbose

# Result: Average 755.861 seconds

After Optimization - 1.67s

mysqlslap --defaults-file=/etc/my.cnf --concurrency=100 --iterations=1 --create-schema='test' --query="select * from test.t100w where k2='MN89'" engine=innodb \
--number-of-queries=2000 -uroot -p123 -verbose

# Result: Average 1.678 seconds

Execution Plan Analysis

Purpose

Extract the optimizer-selected execution plan for database administrator evaluation of statement efficiency.

Obtaining Execution Plans

DESC SQL_statement
EXPLAIN SQL_statement

DESC SELECT * FROM school.student WHERE sname='zhang3';

Key Information to Focus On

-- table: city                              --> queried table    **
-- possible_keys: CountryCode,idx_co_po      --> potential indexes  **
-- key: CountryCode                          --> actual index used    ***
-- type: all                                --> index type        *****
-- Extra: Using index condition              --> additional info        *****

Query Types (Type)

Full table scan: ALL
Index scans: index, range, ref, eq_ref, const(system), NULL

-- index: full index scan
SELECT countrycode FROM city;

-- range: index range scan (> < >= <=, BETWEEN AND, OR, IN, LIKE)
SELECT * FROM city WHERE id>2000;
SELECT * FROM city WHERE countrycode LIKE 'CH%';

-- For secondary indexes, != and NOT IN don't use indexes
-- For primary key indexes, != and NOT IN use range

-- Examples with IN/OR
SELECT * FROM city WHERE countrycode='CHN' OR countrycode='USA';
SELECT * FROM city WHERE countrycode IN ('CHN','USA');

-- Better rewritten as UNION ALL
SELECT * FROM city WHERE countrycode='CHN' 
UNION ALL 
SELECT * FROM city WHERE countrycode='USA';

-- ref: secondary index equality lookup
-- eq_ref: multi-table joins where child table uses primary/unique key
SELECT b.name,a.name,a.population  
FROM city AS a 
JOIN country AS b ON a.countrycode=b.code 
WHERE a.population<100;

-- const(system): primary/unique key equality lookup
SELECT * FROM city WHERE id=100;

Extra Field Explanations

filesort in Extra indicates file sorting occurred
Check if ORDER BY, GROUP BY, DISTICNT conditions have indexes
Create composite indexes based on clause execution order

EXPLAIN/DESC Usage Scenarios

-- Interview question: Company's business is slow, analyze database causes
-- Two scenarios:

-- Emergency slowness (sudden hang):
-- 1. SHOW PROCESSLIST; get statements causing database hang
-- 2. EXPLAIN to analyze execution plan and index usage
-- 3. Create indexes, modify statements

-- Persistent slowness:
-- 1. Log slow queries to slowlog, analyze slowlog
-- 2. EXPLAIN to analyze execution plan and index usage
-- 3. Create indexes, modify statements

Key Length Coverage

For utf8mb4:

VARCHAR(10): nullable(1) + 4*10 + 2 = 43 bytes
CHAR(10): nullable(1) + 4*10 = 41 bytes
INT: nullable(1) + 4 = 5 bytes

Purpose: Determine composite index coverage length - generally longer is better.

VARCHAR(4) utf8mb4: 44 + 3 bytes CHAR(4): 44 + 1 byte

Characteristics:

Can store 4 arbitrary characters
Regardless of character, numeric, or Chinese, maximum reserved length is 4 bytes per character
Each Chinese character occupies 4 bytes
Numbers and letters actually occupy 1 byte each

SELECT LENGTH(column_name) FROM test_table;

Key Length Application Focus

For composite index sequence (k1,k2,k3,k4):

When all index columns use equality condtiions, order doesn't matter due to optimizer rearrangement:

-- Both have same key_length = 56
SELECT * FROM test WHERE k1='aa' AND k2='China' AND k3='aaaa' AND k4='Hello China';
SELECT * FROM test WHERE k2='China' AND k3='aaaa' AND k4='Hello China' AND k1='aa';

For non-contiguous conditions missing k2:

-- Only k1 condition met
SELECT * FROM test WHERE k1='aa' AND k3='aaaa' AND k4='Hello China';

Conditions That Don't Use Indexes

-- <> and NOT IN don't use secondary indexes
EXPLAIN SELECT * FROM telephone_table WHERE tel_number <> '110';
EXPLAIN SELECT * FROM telephone_table WHERE tel_number NOT IN ('110','119');

-- >, <, >=, <=, NOT IN don't use secondary indexes
-- Individual >, <, IN may or may not use indexes depending on result set size

-- OR/IN should be rewritten as UNION
EXPLAIN SELECT * FROM telephone_table WHERE tel_number IN ('110','119');
-- Rewrite as:
EXPLAIN SELECT * FROM telephone_table WHERE tel_number='110'
UNION ALL
SELECT * FROM telephone_table WHERE tel_number='119';

-- LIKE with leading % doesn't use indexes
EXPLAIN SELECT * FROM telephone_table WHERE tel_number LIKE '31%';  -- Uses range scan
EXPLAIN SELECT * FROM telephone_table WHERE tel_number LIKE '%110'; -- No index used

-- For %search% patterns, consider Elasticsearch + MongoDB for dedicated search services

Index Usage Guidelines

Index Creation Principles

Tables must have primary keys, typically irrelevant auto-incrementing columns
Columns frequently used in WHERE conditions, ORDER BY, GROUP BY, JOIN ON, DISTINCT
Use columns with many unique values as leading columns in composite indexes
For long index columns, use prefix indexes
Reduce index entries: remove unused indexes, clean infrequently used ones
Perform index maintenance during off-peak hours
Don't create indexes for small tables

Conditions Where Indexes Aren't Used

No query conditions or conditions lack indexes
Query results represent more than 25% of original table data
Index failure due to outdated statistics
Functions on indexed columns or arithmetic operations
Implicit conversions causing index failure
<> and NOT IN don't use secondary indexes
LIKE with leading % doesn't use indexes
Composite index issues

Tags: database indexing B-Tree Query-Optimization execution-plan

Posted on Sat, 30 May 2026 17:45:23 +0000 by jamz310

Freaks City

Database Indexing and Query Execution Plans

Purpose of Indexes

Types of Index Algorithms

B-tree Index Classification

How Secondary Indexes Build B-tree Structure

Clustered Indexes

Prerequisites

How Secondary Indexes Build B-tree Structure

Differences Between Clustered and Secondary Indexes

Secondary Index Subtypes

Factors Affecting Index Tree Height

Basic Index Management

Querying Indexes

Single-column Regular Secondary Index

Multi-column Composite Index

Unique Index

Prefix Index

Dropping Indexes

Index Optimization Testing

Before Optimization - 755s

After Optimization - 1.67s

Execution Plan Analysis

Purpose

Obtaining Execution Plans

Key Information to Focus On

Query Types (Type)

Extra Field Explanations

EXPLAIN/DESC Usage Scenarios

Key Length Coverage

Key Length Application Focus

Conditions That Don't Use Indexes

Index Usage Guidelines

Index Creation Principles

Conditions Where Indexes Aren't Used

Hot Tags