Concurrency Strategy in B+Trees
Implementing thread safety in a B+Tree index requires protecting both the per-node data and the tree's structural integrity during split and merge operations. The standard protocol for this is Latch Crabbing (also called latch coupling), where a thread acquires a latch on a child node and releases the latch on the parent only once the child is deemed "safe."
A node is considered safe if an operation will not trigger a structural change that propagates up to the parent:
- Search: All nodes are always safe, as a search never modifies the tree.
- Insert: A node is safe if inserting one more element will not fill it to max_size, so no split can be triggered.
- Delete: A node is safe if its size is strictly greater than the min_size requirement, so no merge or redistribution can be triggered.
Pessimistic vs. Optimistic Approaches
Basic Latch Crabbing (Pessimistic)
For write operations (Insert/Delete), the pessimistic approach assumes structural changes will occur. It starts by acquiring a Write (W) latch on the root. As it traverses down, it acquires a W latch on each child. If the child is safe, the thread releases the latches on all ancestors; otherwise, it keeps them held until the operation completes.
Optimistic Latch Crabbing (Improved)
Since most operations do not result in splits or merges, starting with a W latch on the root creates a performance bottleneck. The optimistic approach assumes no structural changes will happen initially:
- Perform a Search-like traversal using Read (R) latches down to the leaf.
- Acquire a W latch on the target leaf.
- If the leaf is safe, complete the operation.
- If the leaf is unsafe (an insert would cause a split, or a delete would cause an underflow), abort, release all latches, and restart using the pessimistic strategy.
Transaction Management and Resource Tracking
To manage latches across multiple levels, the system utilizes a Transaction object. This object maintains a page_set_ (typically a std::deque<Page*>) to track the path taken from the root. This allows the implementation to release latches in the correct order (top-down) once a safe node is encountered or the operation completes.
```cpp
enum class LockType { READ, INSERT, DELETE };

template <typename KeyType, typename ValueType, typename KeyComparator>
bool BPlusTree<KeyType, ValueType, KeyComparator>::CheckNodeSafety(BPlusTreePage *node, LockType mode) {
  if (mode == LockType::READ) {
    return true;  // reads never change structure
  }
  int size = node->GetSize();
  int max = node->GetMaxSize();
  if (mode == LockType::INSERT) {
    // Leaves split on reaching max_size; internal nodes split at max_size + 1,
    // so leaves need one extra slot of headroom.
    return node->IsLeafPage() ? size < max - 1 : size < max;
  }
  // DELETE: the root has special minimums (an empty leaf root, or an internal
  // root with a single child, triggers a height decrease).
  if (node->IsRootPage()) {
    return node->IsLeafPage() ? size > 1 : size > 2;
  }
  return size > node->GetMinSize();
}
```
The Search Traversal Logic
The core of the concurrency implementation resides in the leaf-finding logic. This function must handle the transition between latch types and interact with the transaction's page set.
```cpp
template <typename KeyType, typename ValueType, typename KeyComparator>
Page *BPlusTree<KeyType, ValueType, KeyComparator>::FindLeafWithLatches(
    const KeyType &key, Transaction *txn, LockType mode, bool is_first_pass) {
  if (is_first_pass) {
    root_lock_.RLock();
  } else {
    // Pessimistic restart: hold the root latch until safety is proven.
    root_lock_.WLock();
    txn->AddIntoPageSet(nullptr);  // Placeholder entry representing root_lock_
  }
  page_id_t current_id = root_page_id_;
  Page *last_page = nullptr;
  while (true) {
    Page *page = buffer_pool_manager_->FetchPage(current_id);
    auto *tree_node = reinterpret_cast<BPlusTreePage *>(page->GetData());
    if (is_first_pass) {
      // Optimistic pass: R latches everywhere, W latch only on the target leaf.
      if (tree_node->IsLeafPage() && mode != LockType::READ) {
        page->WLatch();
      } else {
        page->RLatch();
      }
      // Release the parent latch as soon as the child is latched.
      if (last_page == nullptr) {
        root_lock_.RUnlock();
      } else {
        last_page->RUnlatch();
        buffer_pool_manager_->UnpinPage(last_page->GetPageId(), false);
      }
    } else {
      // Pessimistic pass: keep ancestor latches until a safe node is found.
      page->WLatch();
      if (CheckNodeSafety(tree_node, mode)) {
        ClearTransactionLatches(txn);
      }
      txn->AddIntoPageSet(page);
    }
    if (tree_node->IsLeafPage()) {
      // If the optimistic pass reached an unsafe leaf, fall back.
      if (is_first_pass && mode != LockType::READ && !CheckNodeSafety(tree_node, mode)) {
        page->WUnlatch();
        buffer_pool_manager_->UnpinPage(page->GetPageId(), false);
        return FindLeafWithLatches(key, txn, mode, false);
      }
      return page;
    }
    auto *internal = reinterpret_cast<InternalPage *>(tree_node);
    current_id = internal->Lookup(key, comparator_);
    last_page = page;
  }
}
```
Handling Structural Changes
When a leaf node overflows or underflows, the algorithm recurses upward. In a concurrent environment, it is vital to avoid re-fetching pages from the Buffer Pool Manager to prevent deadlocks or unnecessary overhead. Instead, parent pages are retrieved directly from the transaction's page_set_.
Deletion and Sibling Latches
During a Delete operation, merging or borrowing requires accessing a sibling node. To prevent data races, the sibling must be latched. Unlike ancestors, sibling latches are temporary and should be released immediately after the redistribution or merge is completed. Sibling locks do not need to be stored in the transaction's main page set.
Preventing Deadlocks
Deadlocks are generally avoided in Latch Crabbing because threads acquire latches in a strict top-down order. However, index iterators that traverse leaf nodes horizontally (left-to-right) can conflict with structural changes that may lock multiple nodes. To mitigate this, iterators should use a non-blocking "try-latch" mechanism or strictly follow a protocol that releases resources if a latch cannot be immediately acquired.
Cleanup and Buffer Management
Once an operation concludes, all held latches must be released and all pages must be unpinned. For Delete operations, pages that were merged away and are no longer part of the tree should be added to a deleted_page_set within the transaction and removed from the buffer pool only after all latches are released, ensuring no other thread still references the memory.