Understanding Slow SQL Queries
Slow SQL queries refer to MySQL statements that exceed the long_query_time threshold. While MySQL maintains various log types including binary logs, relay logs, redo logs, and undo logs, the slow query log specifically records statements with response times surpassing the configured threshold. It's important to note that slow queries aren't limited to SELECT statements - INSERT, UPDATE, and other DML operations can also be classified as slow if they exceed the time limit.
# Check if slow query logging is enabled
SHOW VARIABLES LIKE "slow_query_log%";
# View the slow query threshold (in seconds)
SHOW VARIABLES LIKE "long_query_time";
For AliSQL-X-Cluster (XDB) environments, slow query logging is enabled by default with a 1-second threshold.
The Impact of Slow Queries
Actual slow queries often involve extensive row scans, temporary file sorting, or frequent disk flush operations, leading to increased disk I/O. This can cause normal queries to become slow, resulting in widespread timeouts. During the post-Double 11 technical review, our team was tasked with addressing slow SQL issues across various applications.
Measuring Slow Query Severity
Micro Average
The micro average calculates the ratio of slow queries to total queries:
sum(application_slow_queries) / sum(application_total_queries)
This metric ranges from 0 (no slow queries) to 1 (all queries are slow). However, it has limitations, especially for high-QPS applications where occasional slow queries may be overlooked.
Macro Average
The macro average provides a more nuanced view:
(sum(slow_query_1) / sum(total_query_1) + sum(slow_query_2) / sum(total_query_2) + ... + sum(slow_query_n) / sum(total_query_n)) / n
This approach reduces the impact of occasional slow queries but can be affected by low-QPS applications.
Daily Average
To address statistical anomalies, we track the average daily slow query count over a week:
sum(application_slow_queries) / 7
Unique Slow Query Templates
We also monitor the number of distinct slow query templates to track historical patterns:
count(distinct(application_slow_query_templates))
Optimization Goals
- Core applications: Eliminate all slow queries
- Regular applications: Reduce micro average by 50%
Performance Reporting
We generate weekly reports ranking CTO departments based on multi-dimensional metrics, highlighting top and bottom performers.
MySQL Best Practices
Indexing Guidelines
- Join Operations: Limit joins to three tables maximum. Ensure joined fields have identical data types and proper indexing.
- String Indexes: Specify index length for VARCHAR fields. A length of 20 typically provides 90%+ distinction rate.
- Search Optimization: Avoid left or full text searches in application queries; use search engines instead.
- Type Consistency: Prevent implicit conversions that can invalidate indexes.
Database Change Management
DDL operations require controlled execution with proper gray release strategies and must be scheduled within approved maintenance windows.
Optimization Examples
Case 1: Data Distribution Issues
An application distributed data across 8 databases with 16 tables each, but data was heavily concentrated in just 2 tables per database. The sharding strategy needed revision.
Case 2: Index Optimization
The following query was performing poorly despite having a store_code index:
SELECT
COUNT(0) AS `tmp_count`
FROM (
SELECT
`inventory_table`.`user_id`,
`inventory_table`.`item_id`,
SUM(
CASE
`inventory_table`.`type`
WHEN 1 THEN `inventory_table`.`quantity`
ELSE 0
END
) AS `available_quantity`,
SUM(
CASE
`inventory_table`.`type`
WHEN 1 THEN `inventory_table`.`locked_quantity`
ELSE 0
END
) AS `locked_quantity`,
SUM(
CASE
`inventory_table`.`type`
WHEN 401 THEN `inventory_table`.`quantity`
ELSE 0
END
) AS `in_transit_quantity`,
`inventory_table`.`warehouse_code`,
MAX(`inventory_table`.`last_modified`) AS `last_modified`
FROM
`inventory_table`
WHERE(`inventory_table`.`is_deleted` = 0)
AND(`inventory_table`.`quantity` > 0)
AND `inventory_table`.`user_id` IN(3405569954)
AND `inventory_table`.`warehouse_code` IN('ZJJHBHYTJJ0001', '...1000+ codes')
GROUP BY
`inventory_table`.`user_id`,
`inventory_table`.`item_id`
ORDER BY
`inventory_table`.`user_id` ASC,
`inventory_table`.`item_id` ASC
) `a`;
Analysis revealed that MySQL was performing full table scans when the IN clause contained more than 200 values. The solution involved creating a composite index on (is_deleted, quantity).
Index Length Calculation
MySQL index length calculation (for versions ≥5.6.4):
-- Character field with NULL allowance
char(10)允许NULL = 10 * (character set: utf8mb4=4,utf8=3,gbk=2,latin1=1) + 1(NULL)
-- Character field without NULL allowance
char(10)不允许NULL = 10 * (character set: utf8mb4=4,utf8=3,gbk=2,latin1=1)
-- Variable character field with NULL allowance
varchar(10)允许NULL = 10 * (character set: utf8mb4=4,utf8=3,gbk=2,latin1=1) + 1(NULL) + 2(variable length)
-- Variable character field without NULL allowance
varchar(10)不允许NULL = 10 * (character set: utf8mb4=4,utf8=3,gbk=2,latin1=1) + 2(variable length)
-- Integer field with NULL allowance
int允许NULL = 4 + 1(NULL)
-- Integer field without NULL allowance
int不允许NULL = 4
-- Timestamp field with NULL allowance
timestamp允许NULL = 4 + 1(NULL)
-- Timestamp field without NULL allowance
timestamp不允许NULL = 4
-- Datetime field with NULL allowance
datetime允许NULL = 5 + 1(NULL)
-- Datetime field without NULL allowance
datetime不允许NULL = 5
Case 3: Resource Contention
Even with proper indexing, queries can become slow due to resource contention. We observed cases where high disk I/O from other MySQL instances on the same physical machine affected query performance, despite Docker-based resource isolation.
Case 4: Complex Query Scenarios
Some scenarios, particularly those involving multiple optional search fields, may not be optimizable through indexing alone. In such cases, consider:
- Making high-distinction fields mandatory in search forms
- Implementing search-oriented storage solutions or dedicated search engines
Establishing Routine Maintenance
Through systematic optimization efforts, we've achieved significant improvements in query performance. Core applications now report zero slow queries, and our team's ranking improved from bottom to top performers. We've established weekly slow query ticket processing, with owners assigned to address and resolve issues, creating a culture of continuous optimization.