Ensuring Concurrency Safety in Cache Implementation
The second iteration phase focuses on guaranteeing thread safety for our cache implementation while developing a core Group structure. This Group concept can be likened to a table in MySQL, providing namespace isolation for cached data.
A critical feature we implement is a fallback mechanism. When a value isn't found in the cache, the system should invoke a user-defined method to retrieve the data. For example, if a cache miss occurs, we can query a MySQL table, where the cache Group corresponds directly to that table.
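The fallback can be modeled as a small interface plus a function adapter. The sketch below is one possible shape, not necessarily the one this cache uses: the Fetch method name, the FetcherFunc adapter, the data field of ValueContainer, and the scoresTable map standing in for a MySQL table are all assumptions made for illustration.

```go
package main

import "fmt"

// ValueContainer is an assumed minimal definition; the real type may differ.
type ValueContainer struct {
	data []byte
}

// Fetcher is an assumed shape for the user-defined fallback:
// given a key, return the value from the backing data source.
type Fetcher interface {
	Fetch(key string) (ValueContainer, error)
}

// FetcherFunc lets an ordinary function satisfy Fetcher,
// in the same spirit as http.HandlerFunc in the standard library.
type FetcherFunc func(key string) (ValueContainer, error)

// Fetch calls the underlying function.
func (f FetcherFunc) Fetch(key string) (ValueContainer, error) {
	return f(key)
}

// scoresTable is a hypothetical stand-in for a MySQL table.
var scoresTable = map[string]string{"Tom": "630", "Jack": "589"}

// lookupScore plays the role of the slow database query run on a cache miss.
func lookupScore(key string) (ValueContainer, error) {
	if v, ok := scoresTable[key]; ok {
		return ValueContainer{data: []byte(v)}, nil
	}
	return ValueContainer{}, fmt.Errorf("%s not found", key)
}
```

With this shape, a caller can pass FetcherFunc(lookupScore) wherever a Fetcher is expected, so a plain function is enough to plug in a data source.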
Proper locking is essential in concurrent cache implementations. Consider the Store method of ConcurrentCache (the type behind a group's mainCache field) shown below: what issues could arise if it lacked synchronization?
Thread-Safe Cache Implementation
func (cache *ConcurrentCache) Store(key string, data ValueContainer) {
	cache.accessMutex.Lock()
	defer cache.accessMutex.Unlock()
	// Lazily create the underlying LRU on first use
	if cache.internalCache == nil {
		cache.internalCache = lru.New(cache.maxSize, nil)
	}
	cache.internalCache.Store(key, data)
}
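When several goroutines run interdependent steps on shared state, the scheduler can interleave them at any point. Without the lock in Store, two goroutines could both observe internalCache == nil and each create its own LRU, so one goroutine's entry would be silently discarded. They could also modify the LRU's internal structures at the same time, which is a data race.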
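The check-then-act pattern in Store can be isolated in a smaller example. The sketch below is illustrative only: lazyStore, put, and fill are hypothetical names, and a plain map stands in for the LRU. It holds the mutex across the nil check, the allocation, and the write, then counts how many times the map was allocated.

```go
package main

import (
	"strconv"
	"sync"
)

// lazyStore mimics the lazy initialization in Store: the map is
// created on first use, which is a check-then-act sequence.
type lazyStore struct {
	mu    sync.Mutex
	items map[string]int
	inits int // how many times the map was allocated
}

// put holds the lock across the nil check, the allocation, and the write,
// so no other goroutine can interleave between those steps.
func (s *lazyStore) put(key string, v int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.items == nil {
		s.items = make(map[string]int)
		s.inits++
	}
	s.items[key] = v
}

// fill writes n distinct keys from n goroutines and reports the
// number of allocations and the final number of entries.
func fill(n int) (inits, size int) {
	var s lazyStore
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s.put(strconv.Itoa(i), i)
		}(i)
	}
	wg.Wait() // all writers have finished, so the fields can be read safely
	return s.inits, len(s.items)
}
```

Calling fill(100) should report one allocation and 100 entries. If the Lock and Unlock calls are removed, running the program under the Go race detector (go run -race) is a quick way to see the problem reported.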
Multi-Layer Locking Strategy
Why do we implement locks at both the Group and MainCache levels? Is this double-layer locking necessary? Yes, it serves a critical purpose:
func CreateGroup(groupName string, maxSize int64, dataFetcher Fetcher) *CacheGroup {
	if dataFetcher == nil {
		panic("data fetcher cannot be nil")
	}
	// globalGroupMutex is assumed to be a sync.RWMutex declared next to
	// globalGroupMap; a Go map has no Lock method of its own.
	globalGroupMutex.Lock()
	defer globalGroupMutex.Unlock()
	group := &CacheGroup{
		name:        groupName,
		dataFetcher: dataFetcher,
		mainCache:   ConcurrentCache{maxSize: maxSize},
	}
	// The lock ensures atomicity of these operations
	globalGroupMap[groupName] = group
	return group
}

// Concurrent-safe group retrieval
func RetrieveGroup(groupName string) *CacheGroup {
	globalGroupMutex.RLock()
	group := globalGroupMap[groupName]
	globalGroupMutex.RUnlock()
	return group
}
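// Retrieve value from cache group
func (g *CacheGroup) Retrieve(key string) (ValueContainer, error) {
	if key == "" {
		return ValueContainer{}, errors.New("empty key not allowed")
	}
	// Check in main cache first
	if value, exists := g.mainCache.Retrieve(key); exists {
		log.Printf("[CacheSystem] cache hit for key: %s", key)
		return value, nil
	}
	// Cache miss - load from data source
	return g.populateData(key)
}

// Load data from the data source and add it to the cache.
// Fetch is an assumed method name on the Fetcher interface, which is not
// shown in this section; it is assumed to return (ValueContainer, error).
func (g *CacheGroup) populateData(key string) (ValueContainer, error) {
	value, err := g.dataFetcher.Fetch(key)
	if err != nil {
		return ValueContainer{}, err
	}
	g.mainCache.Store(key, value)
	return value, nil
}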
The write lock in the CreateGroup function ensures atomicity between creating a new group and adding it to the global map. When retrieving groups, we allow multiple concurrent read operations using a read lock.
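The declarations behind the global map are not shown in this section, so the following is only a sketch of how a map and the sync.RWMutex guarding it might be paired. The registry type and its methods are hypothetical, and the stored value is simplified to an int.

```go
package main

import "sync"

// registry pairs a map with the RWMutex that guards it.
type registry struct {
	mu     sync.RWMutex
	groups map[string]int // value type simplified to int for this sketch
}

func newRegistry() *registry {
	return &registry{groups: make(map[string]int)}
}

// register takes the exclusive write lock: no reader or other
// writer can touch the map while the entry is being added.
func (r *registry) register(name string, maxSize int) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.groups[name] = maxSize
}

// lookup takes the shared read lock, which any number of
// goroutines may hold at the same time.
func (r *registry) lookup(name string) (int, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	v, ok := r.groups[name]
	return v, ok
}
```

Keeping the mutex and the map in one struct makes it harder to access the map without the lock, since both are reached through the same value.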
However, when multiple requests query the same key that hasn't been cached yet, they'll all trigger the fallback method to load data from the database (or other data source). These concurrent load operations would then attempt to add data to the MainCache simultaneously, creating potential race conditions. Therefore, the MainCache layer requires its own synchronization mechanism.
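The scenario of several requests missing on the same key can be reproduced deterministically. In this sketch, safeCache, concurrentMisses, the key "Tom", and the value "630" are all hypothetical. A barrier holds every goroutine until all of them have observed the miss, which forces the worst case: every goroutine loads the value and then writes it to the cache.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// safeCache is a minimal mutex-guarded map standing in for the main cache.
type safeCache struct {
	mu    sync.Mutex
	items map[string]string
}

func (c *safeCache) get(key string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v, ok := c.items[key] // reading a nil map is safe and reports a miss
	return v, ok
}

func (c *safeCache) set(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.items == nil {
		c.items = make(map[string]string)
	}
	c.items[key] = value
}

// concurrentMisses starts n goroutines that all look up the same uncached key.
// It returns how many loads were performed and the value left in the cache.
func concurrentMisses(n int) (loads int64, cached string) {
	var c safeCache
	var barrier, done sync.WaitGroup
	barrier.Add(n)
	done.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer done.Done()
			_, hit := c.get("Tom")
			barrier.Done()
			barrier.Wait() // wait until every goroutine has checked the cache
			if !hit {
				atomic.AddInt64(&loads, 1) // stands in for the database query
				c.set("Tom", "630")
			}
		}()
	}
	done.Wait()
	cached, _ = c.get("Tom")
	return loads, cached
}
```

With n = 8, all eight goroutines perform a load and all eight call set, yet the cache ends up holding a single consistent entry because set serializes the writes. The lock protects the cache's integrity; it does not by itself prevent the redundant loads.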