Optimizing MySQL Query Performance for Hundreds of Millions of Rows

Optimizing MySQL query efficiency when dealing with hundreds of millions of rows requires a comprehensive approach that includes indexing, query rewriting, partitioning, and hardware configuration. Below are best practices and examples to improve query performance on large datasets.


1. Introduction

Processing large-scale data demands efficient queries. Optimized queries reduce response times and resource consumption, boosting overall system performance. This article covers multiple strategies with code examples.

2. Index Optimization

Index Types

MySQL supports several index types:

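B-Tree: The default index type; suitable for most queries, including equality, range, and prefix lookups.
Hash: Best for exact matches; available as an explicit index type only in the MEMORY and NDB storage engines, not in InnoDB.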
Full-text: For text search.
Spatial: For geospatial data.

Index Design Best Practices

Create indexes on columns used in WHERE clauses.
Use composite indexes for multi-column conditions.
Avoid indexing columns with low selectivity (many duplicates).
Regularly maintain indexes (rebuild, drop unused).
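
As a sketch of what routine index maintenance can look like, the statements below assume MySQL 5.7 or later with the sys schema installed and the Performance Schema enabled. The index name idx_users_legacy is hypothetical, and the usage data only covers activity since the last server restart, so treat the first query's result as a list of candidates rather than a verdict.

```sql
-- List indexes in the current schema with no recorded use since the server started.
SELECT object_schema, object_name, index_name
FROM sys.schema_unused_indexes
WHERE object_schema = DATABASE();

-- Refresh the index statistics the optimizer relies on.
ANALYZE TABLE users;

-- Rebuild the table and its indexes (for InnoDB this recreates the table).
OPTIMIZE TABLE users;

-- Drop an index once you are confident nothing depends on it.
-- idx_users_legacy is a hypothetical name used for illustration.
DROP INDEX idx_users_legacy ON users;
```

On tables with hundreds of millions of rows, a rebuild can take a long time and needs extra disk space, so it is usually scheduled for a quiet period.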

Example:

-- Create single-column indexes
CREATE INDEX idx_user_id ON users(user_id);
CREATE INDEX idx_user_age ON users(age);

-- Create a composite index
CREATE INDEX idx_user_name_age ON users(name, age);
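
A composite index is most useful when the query constrains its leading column. The following sketch, which assumes the indexes created above and an illustrative name value, uses EXPLAIN to check which index the optimizer picks. The actual choice depends on the MySQL version and table statistics, so compare the possible_keys and key columns on your own data.

```sql
-- Can seek on idx_user_name_age: the leading column (name) is constrained.
EXPLAIN SELECT user_id FROM users WHERE name = 'Alice' AND age = 30;

-- Also eligible: name alone is a leftmost prefix of (name, age).
EXPLAIN SELECT user_id FROM users WHERE name = 'Alice';

-- age alone is not a leftmost prefix of (name, age), so a direct seek on
-- idx_user_name_age is not possible; idx_user_age is the likelier choice here.
EXPLAIN SELECT user_id FROM users WHERE age = 30;
```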

3. Query Optimization

Query Rewriting

Rewrite queries to improve performance:

Avoid SELECT *; fetch only required columns.
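Prefer JOIN over subqueries when the plan for the subquery is poor; recent MySQL versions already convert many IN subqueries to semi-joins, so compare both forms with EXPLAIN.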
Use EXISTS instead of IN when checking for existence.

Before:

SELECT * FROM users 
WHERE user_id IN (SELECT user_id FROM orders WHERE order_date > '2023-01-01');

After:

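-- DISTINCT keeps one row per user; a plain JOIN would repeat a user once per matching order
SELECT DISTINCT u.* FROM users u
JOIN orders o ON u.user_id = o.user_id
WHERE o.order_date > '2023-01-01';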
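
The EXISTS form mentioned in the list above can be sketched as follows, using the same users and orders tables. It returns each user at most once no matter how many orders match, and it lets the server stop probing orders as soon as one qualifying row is found. The selected columns are an illustrative choice in place of SELECT *.

```sql
-- Users with at least one order after 2023-01-01.
SELECT u.user_id, u.name
FROM users u
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.user_id = u.user_id
      AND o.order_date > '2023-01-01'
);
```

Whether this beats the JOIN or IN form depends on the optimizer version and the available indexes, so it is worth checking each variant with EXPLAIN.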

Efficient Pagination

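Avoid large OFFSET values: MySQL still reads and discards every skipped row before returning the page. Use keyset pagination instead, which seeks directly to the starting position through an index.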

Traditional (slow for large offsets):

SELECT * FROM users ORDER BY user_id LIMIT 1000 OFFSET 5000;

Keyset pagination (faster), where 5000 is the last user_id returned by the previous page:

SELECT * FROM users WHERE user_id > 5000 ORDER BY user_id LIMIT 1000;
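
When the sort column is not unique, the keyset needs a tie-breaker so that rows sharing the same value are neither skipped nor repeated. The sketch below assumes pages sorted by age and then user_id. The session variables @last_age and @last_user_id are illustrative placeholders for the values taken from the last row of the previous page.

```sql
-- Values from the last row of the previous page (placeholders for illustration).
SET @last_age = 30, @last_user_id = 5000;

-- Resume strictly after (age, user_id) = (@last_age, @last_user_id).
SELECT user_id, name, age
FROM users
WHERE age > @last_age
   OR (age = @last_age AND user_id > @last_user_id)
ORDER BY age, user_id
LIMIT 1000;
```

An index that starts with age and ends with user_id is what lets this avoid a sort; in InnoDB a secondary index on age already carries the primary key, but confirm the plan with EXPLAIN.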

4. Partitioning

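Partition large tables into smaller, manageable pieces so that queries filtering on the partitioning column scan only the relevant partitions. In MySQL, every primary and unique key on a partitioned table must include the partitioning column, which is why the range example below uses a composite primary key.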

Horizontal Partitioning (by range)

CREATE TABLE users (
    user_id INT,
    name VARCHAR(50),
    age INT,
    PRIMARY KEY (user_id, age)
) PARTITION BY RANGE (age) (
    PARTITION p0 VALUES LESS THAN (20),
    PARTITION p1 VALUES LESS THAN (30),
    PARTITION p2 VALUES LESS THAN (40),
    PARTITION p3 VALUES LESS THAN (50),
    PARTITION p4 VALUES LESS THAN MAXVALUE
);
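
Partitioning only pays off when queries let the server skip partitions. As a rough check, assuming the partitioned users table above and MySQL 5.7 or later (where EXPLAIN includes a partitions column by default), the following statements show which partitions a query touches.

```sql
-- Ages 20-29 fall entirely inside p1 (values >= 20 and < 30),
-- so the partitions column is expected to list only p1.
EXPLAIN SELECT user_id, name FROM users WHERE age BETWEEN 20 AND 29;

-- No filter on age: every partition has to be considered.
EXPLAIN SELECT user_id, name FROM users WHERE name = 'Alice';

-- Approximate row counts per partition (TABLE_ROWS is an estimate for InnoDB).
SELECT PARTITION_NAME, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'users';
```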

Vertical Partitioning (split columns into separate tables)

-- Base table with core columns
CREATE TABLE users_basic (
    user_id INT PRIMARY KEY,
    name VARCHAR(50)
);

-- Extension table with additional columns
CREATE TABLE users_extra (
    user_id INT PRIMARY KEY,
    age INT,
    FOREIGN KEY (user_id) REFERENCES users_basic(user_id)
);
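
After a vertical split, queries that need only the core columns read the narrower users_basic table, and the extra columns are joined in on demand. A minimal sketch, using the two tables above and an illustrative user_id value:

```sql
-- Hot path: touches only the narrow base table.
SELECT user_id, name
FROM users_basic
WHERE user_id = 42;

-- Full profile: LEFT JOIN keeps the user even if no extension row exists yet.
SELECT b.user_id, b.name, e.age
FROM users_basic b
LEFT JOIN users_extra e ON e.user_id = b.user_id
WHERE b.user_id = 42;
```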

5. Hardware and Configuration

Hardware Considerations

CPU: Multi-core processors improve concurrency; higher clock speeds mainly reduce the latency of individual queries.
Memory: Large RAM allows caching and reduces I/O.
Storage: SSDs significantly enhance read/write speeds over HDDs.

MySQL Configuration

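Adjust key parameters in my.cnf. The values below are starting points, not recommendations for every workload:

[mysqld]
# Commonly sized at 50-75% of RAM on a dedicated database server
innodb_buffer_pool_size = 4G
max_connections = 500
# MySQL 5.7 and earlier only: the query cache was removed in MySQL 8.0, where this line prevents the server from starting
query_cache_size = 0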
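
To judge whether these settings fit the workload, the running server can be inspected from a SQL session. The statements below are a sketch using standard system and status variables; what counts as a healthy ratio is workload-dependent and not something this example decides.

```sql
-- Configured buffer pool size in GiB.
SELECT @@innodb_buffer_pool_size / 1024 / 1024 / 1024 AS buffer_pool_gib;

-- Innodb_buffer_pool_read_requests counts logical read requests;
-- Innodb_buffer_pool_reads counts reads that had to go to disk.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';

-- Peak simultaneous connections since startup, to compare with max_connections.
SHOW GLOBAL STATUS LIKE 'Max_used_connections';
SELECT @@max_connections;
```

A high share of disk reads relative to read requests suggests the buffer pool is too small for the working set.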

6. Complete Example

Combine indexing, query rewriting, and partitioning:

-- Create users table
CREATE TABLE users (
    user_id INT PRIMARY KEY,
    name VARCHAR(50),
    age INT
);

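-- Create partitioned orders table by year
-- The primary key must include the partitioning column (order_date)
CREATE TABLE orders (
    order_id INT,
    user_id INT,
    order_date DATE NOT NULL,
    amount DECIMAL(10, 2),
    PRIMARY KEY (order_id, order_date)
) PARTITION BY RANGE (YEAR(order_date)) (
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025),
    -- Catch-all so rows dated 2025 or later are not rejected
    PARTITION pmax VALUES LESS THAN MAXVALUE
);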

-- Create indexes
CREATE INDEX idx_user_name ON users(name);
CREATE INDEX idx_order_date ON orders(order_date);

-- Optimized query (DISTINCT avoids repeating a user once per matching order)
SELECT DISTINCT u.* FROM users u
JOIN orders o ON u.user_id = o.user_id
WHERE o.order_date > '2023-01-01';

7. Conclusion

Optimizing MySQL for hundreds of millions of rows involves a holistic approach: proper indexing, query rewriting, partitioning, and hardware/configuration tuning. The techniques and examples provided help reduce latency and improve throughput. Apply them to achieve better performance in your large-scale databases.


Posted on Fri, 15 May 2026 01:33:21 +0000 by Clinger

Freaks City