Undo Log Version Chain Rollback Under MySQL High Concurrency: Deep Dive into Row Data Recovery

In the scenario of MySQL InnoDB high-concurrency writes to the same row, the Undo Log version chain is the core mechanism ensuring transaction atomicity and implementing MVCC. When a transaction in the version chain rolls back, InnoDB doesn't simply "delete" that transaction's version record. Instead, it traverses the version chain in reverse using the roll pointer, restoring the data to the "base version" before that transaction executed. At the same time, the rollback operation affects the structure of the version chain but does not break the version visibility for other transactions. This article dissects the complete details of "transaction rollback → version chain backtracking → data recovery" in the context of high-concurrency writes to the same row.

1. Foundation: Structure of the Undo Log Version Chain (High-Concurrency Write Scenario)

1.1 Core Components of the Version Chain

InnoDB maintains 3 hidden columns for each row of data, which form the basis of the version chain:

Hidden Column Purpose
trx_id Transaction ID that last modified the row (incremental, uniquely identifies the transaction)
roll_pointer Rollback pointer, points to the Undo Log record corresponding to the current version (forming the version chain)
db_row_id Row ID (auto-generated when there is no primary key/unique index; not the focus of this article)

1.2 Example: Version Chain Generation with High-Concurrency Writes to the Same Row

Assume an initial row: id=1, stock=100 (initial version trx_id=0, roll_pointer=null). Under high concurrency, three transactions modify this row sequentially, generating the version chain:

-- Transaction 1 (T1, trx_id=1001): stock=100 → 99
BEGIN; UPDATE goods SET stock=99 WHERE id=1; -- Uncommitted
-- Transaction 2 (T2, trx_id=1002): waits for T1's lock to be released, then stock=99 → 98
BEGIN; UPDATE goods SET stock=98 WHERE id=1; -- Uncommitted
-- Transaction 3 (T3, trx_id=1003): waits for T2's lock to be released, then stock=98 → 97
BEGIN; UPDATE goods SET stock=97 WHERE id=1; -- Uncommitted

Now the version chain structure (from newest to oldest):

T3 version (stock=97, trx_id=1003, roll_pointer → T2 Undo Log)
↑
T2 version (stock=98, trx_id=1002, roll_pointer → T1 Undo Log)
↑
T1 version (stock=99, trx_id=1001, roll_pointer → Initial Undo Log)
↑
Initial version (stock=100, trx_id=0, roll_pointer=null)

Key point: Each transaction, when modifying data, first writes the Undo Log (recording the "reverse operation"), then updates the row's trx_id and roll_pointer. The version chain always points from newest to oldest historical versions.

2. Core Scenario: Operational Details When a Transaction in the Middle of the Version Chain Rolls Back

Using the scenario above, assume Transaction 2 (T2) performs a ROLLBACK before committing (at this time, T1 and T3 are uncommitted). Let's break down the complete steps:

Scenario Prerequisites

  • T1: holds lock → modifies stock=99 → uncommitted → holds lock
  • T2: waits for T1's lock to be released → acquires lock → modifies stock=98 → uncommitted → holds lock
  • T3: waits for T2's lock to be released → hasn't acquired lock → waiting
  • At this point: T2 → ROLLBACK;

Step 1: Trigger Rollback, Locate the Current Transaction's Undo Log Record

  1. After T2 executes ROLLBACK, InnoDB first uses T2's trx_id=1002 to find the Undo Log record corresponding to T2's version for this row.
  2. T2's Undo Log is a logical log, containing:
Table=goods, Row=id=1, Operation type=UPDATE, Original value=stock=99, New value=stock=98, trx_id=1002, roll_pointer→T1 Undo Log

Note: UPDATE operations generate UPDATE_UNDO logs (INSERT generates INSERT_UNDO, DELETE generates DELETE_UNDO).

Step 2: Backtrack the Version Chain in Reverse, Restore Data to the "Rollback Base Version"

T2's rollback goal is "to undo its own modification, restoring the data to the state before T2 executed", i.e., T1's version (stock=99). Core operations:

  1. Read the original value from the Undo Log: Extract the pre-modification original value stock=99 from T2's Undo Log.
  2. Restore the in-memory data page: Change the stock value in the Buffer Pool data page for id=1 from 98 back to 99.
  3. Reset the row's hidden columns:
    • Change the row's trx_id from 1002 back to 1001 (restore to T1's transaction ID).
    • Change the row's roll_pointer from "pointing to T2's Undo Log" back to "pointing to T1's Undo Log".
  4. Mark T2's Undo Log as "purgeable": T2's Undo Log is no longer linked to the row's version chain. It will be asynchronously cleaned up by the purge thread latter (after no reading transactions reference it).

Step 3: Release Locks, Restore Version Chain Visibility

  1. After T2's rollback completes, release the held row lock (primary key index lock for id=1).
  2. T3, which was waiting for the lock, is awakened. Upon acquiring the lock, it continues (at this point, T3 reads T1's version stock=99, not T2's 98).
  3. The version chain structure is updated (T2's version is "detached"):
T1 version (stock=99, trx_id=1001, roll_pointer → Initial Undo Log)
↑
Initial version (stock=100, trx_id=0, roll_pointer=null)

Step 4: Special Case: If T2 Has Already Committed and Then Needs to Roll Back (via Flashback / Manual Recovery)

If T2 has already committed (T2 is a "committed version" in the chain), it cannot be rolled back using ROLLBACK (committed transactions are not rollbackable). Recovery must be done manually using the Undo Log version chain. Steps:

  1. Find the Undo Log record corresponding to T2's trx_id=1002, extract the original value stock=99.
  2. Execute UPDATE goods SET stock=99 WHERE id=1 (generates a new transaction T4 with trx_id=1004).
  3. New version chain structure:
T4 version (stock=99, trx_id=1004, roll_pointer → T2 Undo Log)
↑
T2 version (stock=98, trx_id=1002, roll_pointer → T1 Undo Log)
↑
T1 version (stock=99, trx_id=1001, roll_pointer → Initial Undo Log)
↑
Initial version (stock=100, trx_id=0, roll_pointer=null)

3. Detailed Version Chain Operations for Different Rollback Scenarios (Full Coverage)

Scenario 1: Newest Transaction Rolls Back (T3 rolls back, no subsequent transactions)

  • Scenario: T1→T2→T3 all uncommitted, T3 executes rollback.
  • Rollback base version: T2's version (stock=98).
  • Operational details:
    1. Extract original value 98 from T3's Undo Log, restore row stock=98.
    2. Change row's trx_id back to 1002, set roll_pointer to point to T2's Undo Log.
    3. Mark T3's Undo Log as purgeable; version chain reverts to T2's version.

Scenario 2: Initial Transaction Rolls Back (T1 rolls back, no subsequent transactions executed)

  • Scenario: T1 modified but uncommitted, T2/T3 waiting for locks, T1 executes rollback.
  • Rollback base version: Initial version (stock=100).
  • Operational details:
    1. Extract original value 100 from T1's Undo Log, restore row stock=100.
    2. Change row's trx_id back to 0 (initial version), set roll_pointer to null.
    3. Mark T1's Undo Log as purgeable; version chain reverts to initial state.
    4. T2/T3 are awakened and perform modifications based on the initial version (100).

Scenario 3: The Rolled-Back Transaction Is Still Referenced by Another Transaction (MVCC Visibility)

  • Scenario: T2 has committed, T4 (a reading transaction) reads T2's version (stock=98) via MVCC. T2 cannot be rolled back using ROLLBACK (since it's committed), but its Undo Log won't be purged.
  • Core rules:
    1. As long as a reading transaction (e.g., T4) has a Read View that includes T2's trx_id=1002, T2's Undo Log will not be cleaned by the purge thread.
    2. To recover the data, a new transaction must be generated to overwrite it (e.g., the UPDATE operation in T4 above). The original version chain retains T2's record.

4. Core Rules and Underlying Guarantees of Rollback

4.1 The "Base Version" Rule for Rollback (Core)

Transaction Type Rollback Base Version (Which Undo Log Version to Restore To) Example
Uncommitted transaction Previous version before its execution (original value version recorded in its Undo Log) T2 rolls back → restore to T1 version
Committed transaction No direct rollback; requires a new transaction to overwrite (based on the original value from Undo Log) T2 committed → T4 modifies back to T1 version
Initial modifying transaction Table's initial version (version with roll_pointer=null) T1 rolls back → restore to stock=100

4.2 Rollback Guarentee Mechanisms Under High Concurrency

  • Lock Mutual Exclusivity: During rollback execution, the transaction still holds the row lock, preventing other transactions from concurrently modifying the row, thus ensuring rollback atomicity.
  • Undo Log Persistence: Undo Logs are stored in the InnoDB tablespace (shared/dedicated undo tablespaces). Even if the database crashes during rollback, the Undo Logs can be recovered via Redo Log after restart, allowing the rollback to complete.
  • Version Chain Integrity: Rollback only "detaches" the current transaction's version; it does not delete historical Undo Logs (unless no reading transaction references them), ensuring that MVCC reads of other transactions are unaffected.

4.3 Undo Log Cleanup Rules

  • Uncommitted transaction rollback: The Undo Log is marked as "purgeable". The purge thread cleans it up as soon as no reading transactions reference it.
  • Committed transaction: The Undo Log is retained until all Read Views that include that trx_id become invalid (i.e., reading transactions end), after which the purge thread cleans it up asynchronously.
  • Crash Recovery: If a crash occurs during rollback, InnoDB traverses the Redo Log upon restart, finds the incomplete rollback transaction, and continues the rollback.

5. Practical Pitfalls: Considerations for Rollback Under High Concurrency

  1. Avoid Long Transactions Holding Locks: The duration of a rollback operation is proportional to the length of the version chain (longer chains mean more Undo Log records to backtrack). Long transactions can increase rollback time and may even cause lock wait timeouts.
  2. Monitor Undo Log Tablespace: High-concurrency rollbacks can generate a large number of Undo Logs. Enable dedicated undo tablespaces (innodb_undo_tablespaces) to avoid the shared tablespace from bloating.
  3. Be Cautious with Manual Rollback of Committed Transactions: Committed transactions cannot be rolled back via ROLLBACK. When manually recovering data with an UPDATE, be aware of concurrency (use locking reads like SELECT ... FOR UPDATE).
  4. Impact of Isolation Levels on Rollback:
    • RR (Repeatable Read): The Read View is created at the first read in the transaction. After a rollback, snapshot reads of other transactions still see the data as if the rolled-back modification never existed.
    • RC (Read Committed): A new Read View is created for each read. After a rollback, snapshot reads of other transactions will immediately see the restored data.

6. Summary: Recap of Key Details

  1. Rollback Base Version: For an uncommitted transaction, rollback restores to the previous version before its execution using the "original value" in its Undo Log (e.g., T2 rolls back to T1's version). For a committed transaction, there's no direct rollback; a new transaction must be created to overwrite the data.
  2. Version Chain Operation: Rollback resets the row's trx_id and roll_pointer, detaches the current transaction's version record, marks its Undo Log as purgeable, but does not delete historical versions.
  3. High-Concurrency Guarantees: Row locks are held during rollback; Undo Logs are persisted; the MVCC mechanism ensures read transaction version visibility; the purge thread asynchronously cleans Unreferenced Undo Logs.

Understanding the details of Undo Log version chain rollback helps pinpoint data consistency issues under high concurrency (e.g., abnormal data after rollback, lock wait timeouts) and is a core knowledge area for MySQL high-concurrency optimization.

Tags: MySQL InnoDB Undo Log Version Chain Rollback

Posted on Sat, 09 May 2026 14:37:02 +0000 by ziegel