Even though MySQL 5.6 introduced materialization for query optimization, this improvement only applies to read-only SELECT statements. For UPDATE and DELETE operations, you must manually rewrite dependent nested subqueries to use JOIN patterns to get good performance.
Use EXPLAIN execution plans to identify scenarios where indexes are rendered ineffective. Pay close attention if the select_type field in the EXPLAIN output shows DEPENDENT SUBQUERY.
The following example query looks for students that have not yet had a course enrollment record created:
EXPLAIN SELECT s.student_id FROM student s
WHERE s.is_deleted = 0
AND s.error_count < 10
AND (SELECT COUNT(*) FROM course_enrollments ce WHERE ce.student_id = s.student_id) < 1
ORDER BY s.student_id DESC;
The DEPENDENT SUBQUERY select type indicates that the nested subquery included in the WHERE clause relies on row values from the outer query. Offficial definitions for these types are:
SUBQUERY: The first SELECT in a nested subquery that is independent of the outer queryDEPENDENT SUBQUERY: The first SELECT in a nested subquery that depends on values fetched by the outer query
The default execution flow for this query works like this: MySQL first retrieves all outer query rows matching the is_deleted and error_count filters, then runs the subquery once for every outer row returned. The subquery's execution count is directly tied to the number of rows from the outer query, leading to poor scaling for large datasets.
This query can be optimized by rewriting it to use a LEFT JOIN instead:
EXPLAIN SELECT s.student_id FROM student s
LEFT JOIN course_enrollments ce
ON s.student_id = ce.student_id
WHERE s.is_deleted = 0
AND s.error_count < 10
AND IFNULL(ce.enrollment_id, 0) = 0;
Checking the new execution plan shows both query nodes now have a select_type of SIMPLE, eliminating the dependent subquery overhead.
A common point of confusion around MySQL subquery performance comes from how MySQL rewrites IN subqueries. Developers generally expect subqueries to execute from the inside out: the inner query runs first to produce a result set, which is then used to filter the outer query. For example, take the following IN subquery:
SELECT * FROM main_table WHERE row_id IN (
SELECT foreign_id FROM related_table WHERE group_id = 10
);
Most developers expect this to execute by first fetching all matching foreign_id values from related_table, then filtering main_table against that list of values.
But MySQL actually rewrites this query into a correlated EXISTS structure, as shown below:
SELECT * FROM main_table WHERE EXISTS (
SELECT * FROM related_table
WHERE group_id = 10 AND related_table.foreign_id = main_table.row_id
);
With this rewrite, MySQL scans every row from the outer main_table and runs an inner lookup for each individual row. The inner subquery never executes as a single standalone step. If the outer table is large, this behavior creates severe performance issues.