VLL: A Redesign of Lock Manager for In-Memory Database Systems

In database systems, locks serve as a fundamental concurrency control mechanism. However, for in-memory database systems, the lock manager often becomes a performance bottleneck.

Research indicates that 16-25% of time in memory databases is spent interacting with the lock manager

Traditional lock managers typically implement a hash table where data primary keys serve as IDs and linked lists of lock requests for that data serve as values. These linked lists track lock status and support mutual exclusion operations.

For disk-based databases, hash table lookups, lock acquisition, and linked list operations are negligible. But for in-memory databases, these additional memory accesses, cache misses, CPU cycles, and critical sections can approach or exceed the cost of executing actual transaction logic.

VLL introduces two major changes to the traditional lock manager:

  • Moving lock information from central hash tables to the raw data itself
  • Removing information about which transactions have pending requests for specific locks, replacing it with counters (CX for pending write requests, CS for pending read requests) maintained in the raw data

When transactions acquire locks, they request all locks at once and are ordered globally based on the sequence of lock requests. This global transaction ordering determines which transaction should be unblocked and executed first.

However, VLL is not a silver bullet. Under high contention workloads, it may lead to reduced concurrency and poor CPU utilization. To address this, Selective Contention Analysis (SCA) was introduced to efficiently compute the most useful subset of contention information when needed.

VLL Algorithm Overview

Lock Structure

The VLL lock management structure consists of:

  • Counters for pending write (CX) and read (CS) lock requests stored with each raw data item
  • A central global transaction request queue (TxnQueue)
Workflow

The process works as follows:

  1. When a transaction arrives at a partition, it attempts to lock all records it will access during its lifecycle

Each lock request increments the corresponding record's CX or CS counter 2. If after the request CX=1 and CS=0, the requesting transaction has acquired an exclusive lock

If CX=0, the requesting transaction has acquired a shared lock 3. After acquiring locks, the transaction is added to TxnQueue

Both lock requests and queue additions occur within the same critical section 4. When leaving the critical section, if the transaction acquired all requested locks, it's marked as free; otherwise, it's blocked. Only free transactions can execute immediately

Challenges

Several issues arise:

  1. In single-threaded VLL, waiting for remote node coordination can waste resources
  2. With out a central lock management structure, how to release blocked transactions?
  3. If locks can only be acquired when CX=0, but write requests always increment CX, CX will remain >0 when multiple write transactions are blocked
  4. As TxnQueue grows, the probability of new transactions acquiring locks immediately decreases

To address these:

  1. Introduce a "waiting" state for transactions that need remote results to continue, allowing them to suspend without resource waste
  2. Use background threads to check CX and CS values for blocked transactions in TxnQueue. When CX drops to 1 and CS to 0, the transaction has acquired a exclusive lock; when CX is 0 and CS > 0, it has acquired a shared lock
  3. Unblock and execute the first blocked transaction in the queue. Being at the queue head means all preceding transactions have completed
  4. Set a threshold for TxnQueue. When exceeded, stop processing new transactions and search for unblockable transactions. Within the threshold, prioritize new requests
Code and Process Demonstration

Here's the implementation:

func startTransaction(tx *Transaction) {
	// Initialize transaction status as Free
	tx.Status = FreeStatus

	// Check local read set for conflicts
	for _, key := range tx.ReadSet {
		if isLocal(key) {
			data[key].readCount++
			if data[key].writeCount > 1 {
				tx.Status = BlockedStatus
			}
		}
	}

	// Check local write set for conflicts
	for _, key := range tx.WriteSet {
		if isLocal(key) {
			data[key].writeCount++
			if data[key].readCount > 0 || data[key].writeCount > 1 {
				tx.Status = BlockedStatus
			}
		}
	}

	// For distributed transactions that are free, mark as waiting
	if tx.Type == DistributedTx && tx.Status == FreeStatus {
		tx.Status = WaitingStatus
	}
}

func commitTransaction(tx *Transaction) {
	// Release read locks
	for _, key := range tx.ReadSet {
		if isLocal(key) {
			data[key].readCount--
		}
	}

	// Release write locks
	for _, key := range tx.WriteSet {
		if isLocal(key) {
			data[key].writeCount--
		}
	}
}

func runVLL() {
	for {
		// Handle remote transaction messages
		tx, received, msg := selectRemoteMessage()
		if received {
			processReadResult(tx, msg)
			if canExecute(tx) {
				executeTransaction(tx)
				commitTransaction(tx)
				transactionQueue.remove(tx)
			}
			continue
		}

		// Handle blocked transactions at queue front
		if transactionQueue.peek().Status == BlockedStatus {
			tx = transactionQueue.peek()
			if tx.Type == DistributedTx {
				tx.Status = WaitingStatus
				// Send requests to other nodes
			} else {
				// Unblock and execute head transaction
				tx.Status = FreeStatus
				executeTransaction(tx)
				commitTransaction(tx)
				transactionQueue.remove(tx)
			}
		} else if !transactionQueue.isFull() {
			// Process new transaction requests when queue isn't full
			tx = getNewRequest()
			startTransaction(tx)
			switch tx.Status {
			case FreeStatus:
				executeTransaction(tx)
				commitTransaction(tx)
			case WaitingStatus:
				// Send requests to other nodes
				transactionQueue.add(tx)
			case BlockedStatus:
				transactionQueue.add(tx)
			}
		}
	}
}

Consider this scenario with the following transaction requests:

txn read set write set
A x, y x
B x, y x
C x
D y z
E y
  1. Initially, all CX and CS counters are 0, and TxnQueue is empty
  2. Transaction A requests to read x, y and write x. CX for x increments, CS for y increments, and A is added as a free transaction to TxnQueue
  3. Transaction B requests to read x, y and write x. Since x.CX > 0, B cannot acquire the write lock and is added as a blocked transaction to TxnQueue
  4. After A completes, it releases locks (x.CX--, y.CS--), is removed from TxnQueue, and B becomes free, acquiring locks to begin execution
  5. Transaction C requests to read x, incrementing x.CS. Since x.CX > 0, C cannot guarantee the read and is added as a blocked transaction
  6. Transaction D requests to read y and write z. y.CS++ and z.CX=1. Since y.CX and z.CX were 0, D acquires all locks and is added as a free transaction
  7. Transaction E requests to write y. Since y.CS > 0, E fails and is added as a blocked transaction
  8. After B completes, it releases locks and is removed. C, now at the queue front, becomes free and acquires the x read lock
  9. Finally, E can safely execute as it doesn't conflict with any preceding transactions. However, VLL only allows the queue head to execute, so E still cannot proceed. SCA removes this limitation
Array VLL

Array VLL uses vector or array data structures to store lock information contiguously. This separates data and lock information in to two separate requests, but in single-threaded environments, performance benefits from core caching.

Challenges with Acquiring All Locks at Once
  • Transaction read/write sets may be unknown before execution
  • With TxnQueue per node and local critical sections, different nodes may process in inconsistent orders, potentially causing deadlocks

Solutions include:

  • Allow transactions to perform necessary reads before entering the critical section
  • For deadlocks from inconsistent node ordering:
  • Allow deadlocks and use detection protocols to abort deadlocked transactions
  • Coordinate nodes for consistent transaction ordering

VLL uses node coordination, which improves performance despite added coordination overhead.

SCA - Improving Performance Under High Contention

For high contention and multi-partition workloads, VLL introduces SCA to simulate standard lock managers by detecting which transactions should inherit released locks.

SCA activates only when needed, such as during CPU idle time. It scans from queue head to tail, analyzing conflicts with preceding transactions.

func performSCA(queue TxnQueue) *Transaction {
	// Initialize conflict detection arrays
	writeConflicts := make([]int8, 100<<10)
	readConflicts := make([]int8, 100<<10)
	
	for _, item := range queue {
		// Check blocked transactions
		if item.Status == BlockedStatus {
			canProceed := true
			
			// Check read set for write conflicts
			for _, key := range item.ReadSet {
				hashed := hashKey(key)
				if writeConflicts[hashed] == 1 {
					canProceed = false
				}
				// Mark read set as occupied
				readConflicts[hashed] = 1
			}
			
			// Check write set for read/write conflicts
			for _, key := range item.WriteSet {
				hashed := hashKey(key)
				if readConflicts[hashed] == 1 || writeConflicts[hashed] == 1 {
					canProceed = false
				}
				// Mark write set as occupied
				writeConflicts[hashed] = 1
			}
			
			// Return unblocked transaction
			if canProceed {
				return item
			}
		} else {
			// For unblocked transactions, just mark occupied
			for _, key := range item.ReadSet {
				hashed := hashKey(key)
				readConflicts[hashed] = 1
			}
			
			for _, key := range item.WriteSet {
				hashed := hashKey(key)
				writeConflicts[hashed] = 1
			}
		}
	}
	
	return nil
}

VLLR - VLL for Range Locking

For frequent operations on many consecutive rows in a single transaction, range locking is more efficient than individual row locking. It prevents phantom reads and is particularly useful for deleting entities sharing common key prefixes.

VLLR locks range prefixes. When locking a range key, the range is represented as set R, generating set P where any element in R has a prefix in P. The simplest approach uses the longest common prefix of the minimum and maximum bit strings in R.

After obtaining P, VLLR uses hierarchical locking: intent lock at database level, then table, then page, finally row lock.

Additionally, VLLR introduces LX and LS counters for the P set on top of CX and CS to track pending write and read requests for prefix ranges.

func requestRangeLock(tx *Transaction, range *KeyRange, exclusive bool) bool {
	// Add transaction to queue
	transactionQueue.add(tx)

	// Get range prefixes
	prefixes := range.toPrefixes()

	// Request prefix locks
	success := true
	for _, prefix := range prefixes {
		success = success && requestPrefixLock(prefix, exclusive)
	}
	return success
}

func requestPrefixLock(key string, exclusive bool) bool {
	success := true
	hashed := hashKey(key)

	// Check current key lock availability
	if exclusive {
		success = success && (readCount[hashed] == 0)
	}
	success = success && (writeCount[hashed] == 0)

	// Check prefix lock availability
	if exclusive {
		success = success && (prefixReadCount[hashed] == 0)
	}
	success = success && (prefixWriteCount[hashed] == 0)

	// Check all prefix locks
	for i := 1; i < len(key); i++ {
		prefixHash := hashKey(key[:i])
		success = success && (writeCount[prefixHash] == 0)
		if exclusive {
			success = success && (readCount[prefixHash] == 0)
		}
	}

	// Mark current key lock
	if exclusive {
		writeCount[hashed]++
	} else {
		readCount[hashed]++
	}

	// Mark prefix locks
	for i := 1; i < len(key); i++ {
		prefixHash := hashKey(key[:i])
		if exclusive {
			prefixWriteCount[prefixHash]++
		} else {
			prefixReadCount[prefixHash]++
		}
	}

	return success
}

func releaseRangeLock(tx *Transaction, range *KeyRange, exclusive bool) {
	// Release prefix locks
	prefixes := range.toPrefixes()
	for _, prefix := range prefixes {
		releasePrefixLock(prefix, exclusive)
	}

	// Remove from queue
	transactionQueue.remove(tx)
}

func releasePrefixLock(key string, exclusive bool) {
	// Release current key lock
	hashed := hashKey(key)
	if exclusive {
		writeCount[hashed]--
	} else {
		readCount[hashed]--
	}

	// Release prefix locks
	for i := 1; i < len(key); i++ {
		prefixHash := hashKey(key[:i])
		if exclusive {
			prefixWriteCount[prefixHash]--
		} else {
			prefixReadCount[prefixHash]--
		}
	}
}

References

  1. https://www.cs.umd.edu/~abadi/papers/vldbj-vll.pdf

Tags: in-memory databases lock management Concurrency Control VLL SCA

Posted on Fri, 26 Jun 2026 17:37:55 +0000 by pbsperry