Evolution of Caching Technology
Hardware Origins
The term "cache" entered computing terminology from a 1967 electronics engineering paper, where the French word was adapted to mean "safekeeping storage." Early computer systems lacked caching mechanisms, with CPUs directly accessing main memory.
Key developments in CPU cache architecture:
- Intel 80386 introduced optional cache support with motherboard-mounted 64KB-128KB write-through cache
- Intel 80486 integrated 8KB L1 unified cache and introduced write-back cache attributes
- Pentium processors separated L1 cache into code and data segments
- Later generations progressively integrated L2 cache into CPU packages and eventually CPU dies
- Modern multi-core processors share L2 cache across cores
Cache operates on the principle of locality, where recently accessed data and instructions are likely to be accessed again. Modern processors employ sophisticated prediction and prefetching algorithms to anticipate data needs.
Concept Expansion
Today, caching exists at multiple system levels: between CPU and memory, memory and disk (disk cache), and even between local storage and network resources. Fundamentally, any structure that bridges speed disparities between hardware components qualifies as cache.
Cache Characteristics
Key Metrics
- Hit Rate: Ratio of successful cache retrievals to total requests
- Throughput: Operations per second (OPS), measuring concurrent read/write efficiency
- Eviction Policies:
- FIFO (First-In-First-Out)
- LFU (Least Frequently Used)
- LRU (Least Recently Used)
Cache Suitability Considerations
- Data consistency requirements between cache and underlying storage
- Expected cache efficiency and hit rate
- Appropriate time-to-live (TTL) settings
- Data structure compatibility with caching
- Computation before caching vs. caching before computation
Cache Stampede
Occurs when multiple processes simultaneously experience cache misses for the same key, causing parallel database queries. Mitigation strategies include cache warming and caching empty results.
Cache Classification
- Local Cache: In-process caching with minimal latency
- Distributed Cache: External caching service supporting multiple applications
Distributed Cache Requirements
Distributed systems necessitate shared caching to maintain consistency across application instances. Single-node caches face limitations in capacity and create synchronization challenges in distributed environmetns.
Distributed Cache Design Considerations
Persistence
Support for data durability through asynchronous disk writes with configurable persistence strategies.
Memory Management
Implementation of eviction algorithms (FIFO, LFU, LRU) when memory limits are reached.
Communication Protocol
Design of efficient serialization protocols like Redis's RESP (Redis Serialization Protocol) that minimize data transfer while maximizing information density.
Scalability
Support for horizontal scaling through clustering rather than vertical scaling limitations.
High Availability
Implementation of monitoring, failover mechanisms, and replica synchronization.
Concurrency Support
Handling concurrent operations through queuing or locking mechanisms.
Common Distributed Cache Challenges
Consistency Issues
Strategies for maintaining cache-database synchronization:
- Strong consistency (difficult to achieve)
- Weak consistency
- Eventual consistency (preferred for distributed systems)
Update strategies include cache-then-database, database-then-cache, and dual-deletion approaches with compensation mechanisms.
Cache Penetration
Occurs when non-existent data receives high request volumes. Solutions include:
- Business-level validation filters
- Bloom filters for existence checking
- Caching empty results with short TTL
Cache Avalanche
Mass simultaneous cache expirations causing database overload. Prevention methods:
- Cache warming for hot data
- Randomized expiration times
- Data sharding across nodes
- Dual-cache architectures
Example implementation:
redis.set(productId, productData, 3600 + Math.random() * 1000);
Cache Breakdown
Sudden expiration of popular data causing concentrated database load. Solutions include:
- Permanent caching for hot data
- Mutex locks for cache regeneration
Mutex implementation example:
public String fetchProductData(String productId) {
String data = redis.get(productId);
if (data == null) {
if (redis.setnx(lockKey, "1", 60)) {
try {
data = database.fetch(productId);
redis.set(productId, data, 86400);
} finally {
redis.del(lockKey);
}
} else {
Thread.sleep(200);
return fetchProductData(productId);
}
}
return data;
}
Hotspot Caching
Concentrated traffic on specific keys. Mitigation through key sharding and real-time hotspot detection.
Large Key Issues
Oversized values consuming excessive bandwidth. Identification through network monitoring and solutions including data splitting and architectural review.
Large key thresholds:
- String values exceeding 10KB
- Collection types with over 5,000 elements or 10MB memory usage