Understanding Phantom Reads in MySQL InnoDB

Phantom Reads in MySQL

Understanding the Problem

Consider a scenario using InnoDB with the default REPEATABLE READ isolation level:

CREATE TABLE `user_data` (
  `id` int(11) NOT NULL,
  `score` int(11) DEFAULT NULL,
  `grade` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `score` (`score`)
) ENGINE=InnoDB;

INSERT INTO user_data VALUES 
(0,0,0), (5,5,5), (10,10,10), 
(15,15,15), (20,20,20), (25,25,25);
BEGIN;
SELECT * FROM user_data WHERE grade = 5 FOR UPDATE;
COMMIT;

The query targets rows where grade equals 5, which corresponds to id=5. After execution, a write lock is placed on id=5. Due to the two-phase locking protocol, this lock is released when COMMIT executes. However, since grade lacks an index, the query performs a full table scan. The critical question: are locks applied to scanned rows that don't match the condition?

What is a Phantom Read?

A phantom read occurs when a transaction executes the same query twice and sees different rows in the second execution—rows that didn't exist in the first result set. Specifically, phantom reads refer to newly inserted rows that appear in subsequent queries.

In REPEATABLE READ isolation, regular SELECT statements use snapshot reads and won't see uncommitted inserts from other transactions. Phantom reads only manifest under current reads (SELECT ... FOR UPDATE).

Why Phantom Reads Matter

Semantic Issues

Session A declares an intent to lock all rows where score=5, preventing any read or write operations. However, this semantic guarantee breaks:

  • Session B updates id=0, changing its score to 5
  • Session A only locked id=5 initially, not id=0
  • Session B's update succeeds, violating Session A's locking intention

Data Consistency Problems

Consider the following execution sequence:

  1. T1: Session A executes SELECT * FROM user_data WHERE score=5 FOR UPDATE
  2. T1: Session A adds UPDATE user_data SET grade=100 WHERE score=5
  3. T2: Session B updates id=0 (score becomes 5)
  4. T4: Session C inserts new row (1,1,5)

The binlog output:

UPDATE user_data SET grade=5 WHERE id=0;
UPDATE user_data SET score=5 WHERE id=0;
INSERT INTO user_data VALUES(1,1,5);
UPDATE user_data SET score=5 WHERE id=1;
UPDATE user_data SET grade=100 WHERE score=5;

When replayed on a replica or using binlog for recovery, rows become (0,5,100), (1,5,100), (5,5,100)—data inconsistency results.

The Gap Lock Solution

Row locks only lock existing rows. Insert operations modify the gaps between records. InnoDB introduces gap locks to address this limitation.

For table user_data with 6 rows, 7 gaps exist:

(-∞,0] (0,5] (5,10] (10,15] (15,20] (20,25] (25,+∞)

When executing SELECT * FROM user_data WHERE score=5 FOR UPDATE, InnoDB locks both the matched rows and surrounding gaps, preventing new inserts.

Gap Lock Behavior

Gap locks conflict with insert operations, not with other gap locks:

-- Session A
SELECT * FROM user_data WHERE id = 9 FOR UPDATE;

-- Session B (non-blocking, targets same gap)
SELECT * FROM user_data WHERE id = 7 FOR UPDATE;

-- Session C (blocked by gap lock)
INSERT INTO user_data VALUES(9,9,9);

Next-Key Locks

Gap locks combined with row locks form next-key locks, using a half-open interval notation: (a, b] represents greater than a but less than or equal to b.

Deadlock Scenario

Gap locks can cause unexpected deadlocks:

  1. Session A: SELECT WHERE id=9 FOR UPDATE (non-existent row, acquires gap lock)
  2. Session B: SELECT WHERE id=7 FOR UPDATE (also acquires same gap lock—allowed)
  3. Session B: INSERT (9,9,9) → blocked by Session A's gap lock
  4. Session A: INSERT (9,9,9) → blocked by Sesssion B's gap lock

Both sessions wait indefinitely until InnoDB's deadlock detection aborts one transaction.

Isolation Level Considerations

Gap locks only exist in REPEATABLE READ. Switching to READ COMMITTED eliminates gap locks:

SET transaction_isolation = 'READ-COMMITTED';

When using READ COMMITTED, configure binlog format to ROW to prevent data inconsistency:

SET binlog_format = 'ROW';

READ COMMITTED offers better concurrency since locks affect only matched rows, not gaps. Choose based on whether your application truly requires REPEATABLE READ guarantees.

Tags: MySQL InnoDB database Phantom Read Gap Lock

Posted on Sun, 10 May 2026 07:15:14 +0000 by DangerousDave86