Redis Core Knowledge for Java Backend Interviews

What is Redis

Redis is a high‑performance, in‑memory key‑value database that also supports optional data persistence. It is open‑source and written in C, widely used both as a cache and as a primary datastore for specialised scenarios.

Why Redis Is So Fast

  • In‑memory storage – data is served directly from RAM, avoiding disk I/O for most operations.
  • Efficient data structures – Redis uses internally optimised constructs such as Simple Dynamic Strings (SDS), hash tables (dictionaries), skip lists, doubly linked lists, and ziplists.
  • I/O multiplexing – a single thread handles many connections via an event‑driven mechanism (epoll/kqueue), reducing overhead.
  • Single‑threaded execution – eliminates context switching and locking for many commands, enabling atomic operations.
  • Incremental rehashing and cached time – reduces latency spikes during growth phases.

When to Use Caching

Data that changes infrequently but is queried frequently is ideal for caching. This approach dramatically lowers DB load and improves response times.

Main reasons to adopt a cache:

  • Performance – cache lookups are near‑microsecond vs millisecond‑level DB queries.
  • Concurrency – Redis can handle tens of thousands of QPS, far surpassing a single relational database instance.

Key and Value Size Limits

  • A key can be up to 512 MB, though keeping it under 1 KB is recommended to save space and speed up lookups.
  • A single value can also be up to 512 MB. For strings this is the entire payload; for collections every element respects the same limit.

Redis vs. Memcached

  • Data structures – Redis offers rich types (lists, sets, hashes, sorted sets, streams, bitmaps), while Memcached is largely a plain key‑value store.
  • Native clustering – Redis provides built‑in sharding (Redis Cluster); Memcached relies on client‑side partitioning.
  • Performance – for small objects Redis often has an edge per core, whereas Memcached can be faster for large objects (over roughly 100 KB).

Typical Use Cases

  • Caching
  • Leaderboards and rankings
  • Counters and rate limiting
  • Distributed session management
  • Distributed locks
  • Social graphs (follows, mutual friends, feeds)
  • Message queues
  • Bit operations (user sign‑ins, online status)

Data Types and Their Scenarios

  1. String – simplest key‑value. Use for caching, counters, and simple distributed locks.

    SET page:views 5120
    INCR page:views
    
  2. Hash – collection of field‑value pairs. Excellent for representing objects such as user profiles.

    HSET user:1002 name "bob" age 25 email "bob@example.com"
    HGET user:1002 name
    
  3. List – ordered, used as a queue or stack. Enables simple message queues or timeline pagination.

    LPUSH task:queue "taskA"
    RPOP task:queue
    LRANGE task:queue 0 -1
    
  4. Set – unordered unique items, supports intersection, union, difference. Good for tagging, common contacts.

    SADD post:tags "java" "redis"
    SINTER post:tags other:tags
    
  5. Sorted Set – each element has a score that orders the set. Perfect for rankings.

    ZADD leaderboard 3200 "userX" 2800 "userY"
    ZRANGE leaderboard 0 -1 WITHSCORES
    

Underlying Data Structures

Redis does not expose its internal structures directly; it implements an object system built on top of:

  • SDS (Simple Dynamic String)
  • Linked lists
  • Hash tables (dictionaries)
  • Skip lists
  • Intsets (integer sets)
  • Ziplists (compressed lists)

Each key‑value pair is wrapped in a redisObject that tracks its logical type and internal encoding, allowing Redis to pick the most memory‑efficient representation for the current contents.

Persistence Mechanisms

  1. RDB – takes point‑in‑time snapshots and writes them as compressed binary files. Triggers include BGSAVE (asynchronous), SAVE (synchronous), or auto‑conditions in redis.conf. Quick restores but may lose recent data.
  2. AOF – logs every write operation. Writes are appended to a buffer and flushed to disk according to appendfsync: always, everysec, or no. AOF rewrite compacts the log.
  3. Mixed persistence (Redis 4.0+) – during a rewrite the new AOF starts with an RDB snapshot followed by an incremental AOF tail, combining fast recovery with durability.
Comparing the two:

  • Data safety – RDB lower (interval snapshots); AOF higher (near real‑time).
  • Recovery speed – RDB fast (binary loading); AOF slower (command replay).
  • File size – RDB typically smaller; AOF larger unless rewritten.
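These trade‑offs are configured in redis.conf; a representative fragment (the values here are illustrative, not recommendations):

```
# RDB: snapshot if >=1 change in 900 s, >=10 in 300 s, or >=10000 in 60 s
save 900 1
save 300 10
save 60 10000

# AOF: log every write, fsync once per second
appendonly yes
appendfsync everysec        # always | everysec | no

# Mixed persistence (Redis 4.0+): RDB preamble + AOF tail on rewrite
aof-use-rdb-preamble yes
```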

High Availability

Sentinel Model

Redis Sentinel provides monitoring, notification, and automatic failover. Multiple Sentinel processes observe master and replicas, and they also monitor each other.

Failover procedure:

  1. A sentinel marks the master as subjectively down when PING replies fail for a configured interval.
  2. When enough sentinels agree (quorum), the master is declared objectively down.
  3. A leader sentinel is elected via a Raft‑like consensus.
  4. The leader chooses a new master from the replicas – selecting by priority, replication offset, and run ID – promotes it with SLAVEOF NO ONE (REPLICAOF NO ONE in Redis 5+), then reconfigures the remaining replicas.

Sentinel solves high availability but does not distribute write load (single‑write master).

Redis Cluster

Redis Cluster provides a sharded, decentralised architecture. The keyspace is split into 16,384 hash slots, and each key’s slot is computed as CRC16(key) % 16384. Nodes own slot ranges and can have replicas; when a master fails, one of its replicas is promoted. A typical layout uses 3 masters with 3 replicas. The cluster uses a gossip protocol for state exchange and can serve requests via any node (MOVED/ASK redirections route clients to the right owner).
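The key-to-slot mapping can be sketched in plain Java. Redis Cluster specifies the CRC16-CCITT (XModem) variant, and keys containing a non-empty {...} hash tag are hashed only on the tag, so related keys land in the same slot. The class and method names here are illustrative:

```java
// Sketch of Redis Cluster's key-to-slot mapping: CRC16(key) % 16384.
// Uses the CRC16-CCITT (XModem) variant that the cluster spec names.
class ClusterSlot {
    static final int SLOTS = 16384;

    // Bit-by-bit CRC16 with polynomial 0x1021, initial value 0.
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    // Keys with a non-empty {...} hash tag are hashed only on the tag,
    // so related keys can be forced into the same slot.
    static int slot(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(java.nio.charset.StandardCharsets.UTF_8)) % SLOTS;
    }

    public static void main(String[] args) {
        System.out.println(slot("user:1002")); // some slot in [0, 16384)
        // true: same hash tag, so both keys map to the same slot
        System.out.println(slot("{user:1002}.followers") == slot("{user:1002}.following"));
    }
}
```

The hash-tag trick is what makes multi-key operations (MGET, transactions) possible in a cluster: force the related keys into one slot.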

Transactions

  • MULTI opens a transaction block.
  • Commands are queued.
  • EXEC runs all queued commands atomically (no other client commands interleaved).
  • DISCARD flushes the queue.

Redis transactions do not support rollback; if a queued command fails during EXEC, the remaining commands still run. They guarantee sequential execution, not full ACID.

Expiry and Memory Eviction

Key Expiry Policies

  • Periodic deletion – every 100 ms Redis scans a random subset of keys with TTL and removes expired ones.
  • Lazy deletion – when a key is accessed, its expiration is checked and it is removed if stale.

Memory Eviction

When the memory limit is reached, one of these policies applies:

  • noeviction: reject writes with errors.
  • allkeys-lru: evict least recently used keys among all keys.
  • allkeys-random: randomly pick a key.
  • volatile-lru: LRU among keys that have a TTL.
  • volatile-random: random among TTL‑bearing keys.
  • volatile-ttl: evict the key with the shortest remaining TTL.
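The allkeys-lru policy can be approximated in Java with an access-ordered LinkedHashMap. This is a teaching sketch only: real Redis uses a sampled, approximate LRU rather than an exact recency ordering.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of an allkeys-lru cache: once capacity is exceeded, the least
// recently *accessed* entry is evicted. Redis itself uses approximate,
// sampling-based LRU, not an exact ordering like this.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: get() refreshes recency
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict once we exceed the memory budget
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");        // "a" is now most recently used
        cache.put("c", "3");   // evicts "b", the least recently used key
        System.out.println(cache.keySet()); // [a, c]
    }
}
```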

Bloom Filter

A Bloom filter is a probabilistic, space‑efficient structure that can tell whether an element is definitely absent or possibly present. It uses a large bit array and multiple hash functions.

Example using RedisBloom module:

BF.RESERVE bloom:userids 0.02 200000
BF.ADD bloom:userids "user:9876"
BF.EXISTS bloom:userids "user:9876"    # returns 1 (could be false positive)
BF.EXISTS bloom:userids "user:unknown" # returns 0 (definitely absent)

Limitation: a standard Bloom filter does not support deletion, and its false‑positive rate is tunable but never zero.
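Under the hood the idea is just a bit array plus k hash functions. A minimal Java sketch (using simple seeded hashes, not RedisBloom's actual implementation):

```java
import java.util.BitSet;

// Minimal Bloom filter sketch: k seeded hash functions set/test bits in a
// shared bit array. "Might contain" can be a false positive; "does not
// contain" is always definite. Not the RedisBloom implementation.
class BloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    BloomFilter(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    // A simple seeded polynomial hash; real implementations use stronger
    // families such as murmur with double hashing.
    private int bitIndex(String item, int seed) {
        int h = seed;
        for (char c : item.toCharArray()) h = h * 31 + c;
        return Math.floorMod(h, size);
    }

    void add(String item) {
        for (int i = 0; i < hashes; i++) bits.set(bitIndex(item, i + 1));
    }

    boolean mightContain(String item) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(bitIndex(item, i + 1))) return false; // definitely absent
        }
        return true; // possibly present (could be a false positive)
    }

    public static void main(String[] args) {
        BloomFilter bf = new BloomFilter(1 << 16, 3);
        bf.add("user:9876");
        System.out.println(bf.mightContain("user:9876")); // true
    }
}
```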

Keeping Database and Cache Consistent

  1. Delayed double deletion – delete cache -> update DB -> wait (for replication lag) -> delete cache again.
  2. Canal + binlog – use Canal to capture MySQL binlog events and push updated data into Redis asynchronously.
  3. RocketMQ‑based eventual consistency – after a DB update, send a reliable message to queue; consumer updates Redis with retry mechanisms.
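Strategy 1 can be sketched as follows. The maps here are stand-ins for Redis and MySQL, and the sleep stands in for waiting out replication/read lag (in production the second delete is usually dispatched asynchronously):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of delayed double deletion. cache/db are in-memory stand-ins
// for Redis and MySQL; the delay stands in for replication/read lag.
class DelayedDoubleDelete {
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    static final Map<String, String> db = new ConcurrentHashMap<>();

    static void update(String key, String value) {
        cache.remove(key);          // 1. delete the cache first
        db.put(key, value);         // 2. update the database
        try {                       // 3. wait out replication/read lag
            Thread.sleep(50);       //    (the delay must be tuned in practice)
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        cache.remove(key);          // 4. delete again to evict a stale re-fill
    }

    static String read(String key) {
        // cache-aside read: on a miss, load from the DB and populate the cache
        return cache.computeIfAbsent(key, db::get);
    }

    public static void main(String[] args) {
        db.put("user:1", "old");
        read("user:1");             // warms the cache with "old"
        update("user:1", "new");
        System.out.println(read("user:1")); // new
    }
}
```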

Distributed Lock Implementation

Acquire the lock with SET resource_id unique_token NX PX 30000. The NX flag ensures only one client can set it, and PX adds an expiry so a crashed holder cannot block others forever. To release, validate ownership with a Lua script:

if redis.call("GET", KEYS[1]) == ARGV[1] then
    return redis.call("DEL", KEYS[1])
else
    return 0
end
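The acquire/release protocol can be mimicked in plain Java with a ConcurrentHashMap standing in for Redis: putIfAbsent plays the role of SET ... NX, and the token check before deletion mirrors the Lua script's GET-compare-DEL (expiry/PX is omitted here for brevity):

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for the Redis lock protocol using a ConcurrentHashMap:
// putIfAbsent plays the role of SET ... NX, and release checks the
// token before deleting, mirroring the Lua script's GET-compare-DEL.
class SimulatedLock {
    static final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();

    // returns a unique token on success, null if someone else holds the lock
    static String tryAcquire(String resource) {
        String token = UUID.randomUUID().toString();
        return store.putIfAbsent(resource, token) == null ? token : null;
    }

    // delete only if we still own the lock (atomic compare-and-remove)
    static boolean release(String resource, String token) {
        return store.remove(resource, token);
    }

    public static void main(String[] args) {
        String token = tryAcquire("order:42");
        System.out.println(token != null);                  // true: acquired
        System.out.println(tryAcquire("order:42") == null); // true: contended
        System.out.println(release("order:42", "wrong"));   // false: wrong token
        System.out.println(release("order:42", token));     // true: owner releases
    }
}
```

The unique token matters: without it, a client whose lock already expired could delete a lock now held by someone else.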

In cluster environments, the Redlock algorithm attempts to acquire the lock on a majority of independent Redis nodes within a time bound, providing a more robust solution.

Cache Penetration, Breakdown, and Avalanche

Cache Penetration

Querying data that never exists in either the cache or the database. Mitigations:

  • Input validation
  • Cache empty results with short TTL
  • Bloom filter to pre‑reject non‑existent keys
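Caching empty results can be sketched like this: a sentinel value marks "known absent" so repeated misses never reach the database. Maps stand in for Redis and the DB, and the short TTL the sentinel would carry in Redis is omitted:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of null-result caching against cache penetration: a sentinel
// marks "known absent" so repeated lookups for missing keys skip the DB.
// In Redis the sentinel would be stored with a short TTL.
class NullCaching {
    static final String NULL_SENTINEL = "__NULL__";
    static final Map<String, String> cache = new HashMap<>();
    static final Map<String, String> db = new HashMap<>();
    static int dbHits = 0;

    static String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            // translate the sentinel back into "no such row"
            return NULL_SENTINEL.equals(cached) ? null : cached;
        }
        dbHits++;
        String value = db.get(key);
        // cache the miss too, so the next lookup never touches the DB
        cache.put(key, value == null ? NULL_SENTINEL : value);
        return value;
    }

    public static void main(String[] args) {
        get("ghost:1");
        get("ghost:1");
        System.out.println(dbHits); // 1: the second miss was served from cache
    }
}
```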

Cache Breakdown

A hot key expires and many clients simultaneously try to rebuild it, overloading the backend. Solutions:

  • Set the key to never expire
  • Use a mutual‑exclusion lock so only one thread updates the cache
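The mutual-exclusion rebuild can be sketched with ConcurrentHashMap.computeIfAbsent, which runs the loader at most once per key even under concurrency. With Redis, the same effect is usually achieved by taking a short-lived SET NX rebuild lock:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of single-flight cache rebuild: computeIfAbsent runs the
// loader at most once per key, so concurrent readers of an expired hot
// key do not stampede the backend.
class SingleFlightRebuild {
    static final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    static final AtomicInteger loads = new AtomicInteger();

    static String get(String key) {
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();     // the expensive backend query happens here
            return "value-for-" + k;
        });
    }

    public static void main(String[] args) throws Exception {
        Thread[] readers = new Thread[8];
        for (int i = 0; i < readers.length; i++) {
            readers[i] = new Thread(() -> get("hot:key"));
            readers[i].start();
        }
        for (Thread t : readers) t.join();
        System.out.println(loads.get()); // 1: only one thread rebuilt the value
    }
}
```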

Cache Avalanche

Many keys expire at the same moment, or the cache service itself fails, sending all traffic to the database. Countermeasures:

  • Deploy Redis in high‑availability mode (cluster + replicas)
  • Add random jitter to TTLs
  • Multi‑level caching (local in‑memory + network cache)
  • Circuit breakers and rate limiting
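TTL jitter is a one-line fix; a sketch:

```java
import java.util.concurrent.ThreadLocalRandom;

// Sketch of TTL jitter against cache avalanche: instead of giving every
// key the same base TTL, add a random offset so expirations spread out.
class TtlJitter {
    // returns baseSeconds plus up to maxJitterSeconds of random extra TTL
    static long jitteredTtl(long baseSeconds, long maxJitterSeconds) {
        return baseSeconds + ThreadLocalRandom.current().nextLong(maxJitterSeconds + 1);
    }

    public static void main(String[] args) {
        // e.g. EXPIRE product:42 <ttl> with ttl drawn from [3600, 3900]
        long ttl = jitteredTtl(3600, 300);
        System.out.println(ttl >= 3600 && ttl <= 3900); // true
    }
}
```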

Redis as a Message Queue

  • List + blocking pop: LPUSH and BRPOP for a simple FIFO queue.
  • Sorted Set: use the score as a scheduled delivery time for delayed messages.
  • Pub/Sub: real‑time push to subscribers, but no replay for offline clients.
  • Streams (Redis 5.0+): a persistent, ordered log with consumer groups, resembling Kafka semantics.
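The Sorted Set delayed-queue pattern treats the score as a delivery timestamp: producers ZADD, consumers repeatedly pop whatever is due (ZRANGEBYSCORE plus ZREM). A score-ordered Java sketch of the same idea:

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Sketch of the Sorted Set delayed-queue pattern: the "score" is the
// delivery timestamp, and consumers pop messages whose score <= now.
class DelayedQueue {
    static class Message {
        final long deliverAt;   // the "score": delivery timestamp (millis)
        final String payload;
        Message(long deliverAt, String payload) {
            this.deliverAt = deliverAt;
            this.payload = payload;
        }
    }

    private final PriorityQueue<Message> queue =
            new PriorityQueue<>(Comparator.comparingLong((Message m) -> m.deliverAt));

    void enqueue(String payload, long deliverAt) {
        queue.add(new Message(deliverAt, payload)); // ZADD queue deliverAt payload
    }

    // returns the next due message, or null if nothing is due yet
    String pollDue(long now) {
        Message head = queue.peek();
        if (head == null || head.deliverAt > now) return null;
        queue.poll();                               // ZREM once claimed
        return head.payload;
    }

    public static void main(String[] args) {
        DelayedQueue q = new DelayedQueue();
        q.enqueue("send-email", 1000);
        q.enqueue("expire-order", 5000);
        System.out.println(q.pollDue(2000)); // send-email
        System.out.println(q.pollDue(2000)); // null (expire-order not due yet)
    }
}
```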

Big Keys and Their Impact

A big key is typically a string larger than roughly 10 KB, or a collection with a very large number of elements. Problems include:

  • Unbalanced memory across cluster nodes
  • Blocking commands (single‑threaded) that degrade latency
  • Increased network bandwidth when fetching the key

Handling Hot Keys

A hot key receives a disproportionately high volume of requests.

Detection:

  • Client‑side: maintain a key‑access counter.
  • Proxy‑side: aggregate metrics in Twemproxy/Codis.
  • Server‑side: the MONITOR command (produces heavy output, use carefully) or redis-cli --hotkeys (requires an LFU eviction policy).

Mitigation:

  • Shard the hot key by appending a suffix and write to multiple Redis nodes; clients use consistent hashing to pick a replica.
  • Introduce a local in‑memory cache (like Caffeine) so most requests never reach Redis.
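The sharding idea from the first bullet can be sketched as a write fan-out to N suffixed copies, with each read picking one copy at random (here a map stands in for Redis, and the suffix count is illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

// Sketch of hot-key sharding: the value is written to N suffixed copies
// ("hot:item:0" .. "hot:item:N-1") and each read picks one copy at
// random, spreading load across nodes. The map stands in for Redis.
class HotKeySharding {
    static final int SHARDS = 4;
    static final Map<String, String> redis = new ConcurrentHashMap<>();

    static void write(String key, String value) {
        for (int i = 0; i < SHARDS; i++) {
            redis.put(key + ":" + i, value); // fan out the write to every copy
        }
    }

    static String read(String key) {
        int shard = ThreadLocalRandom.current().nextInt(SHARDS);
        return redis.get(key + ":" + shard); // any copy can serve the read
    }

    public static void main(String[] args) {
        write("hot:item", "v1");
        System.out.println(read("hot:item")); // v1, from a random shard
    }
}
```

The trade-off is write amplification and the need to update all copies consistently; it only pays off for genuinely hot, read-heavy keys.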

Cache Warm‑up

Pre‑load frequently accessed data into Redis before exposing the system to production traffic:

  • Provide a manual admin endpoint that triggers warm‑up.
  • Use a startup loader if the dataset is small.
  • Schedule periodic refresh tasks to keep the cache populated.

Scanning Keys at Scale

With 100 million keys, using KEYS will block Redis for seconds and disrupt online traffic because of its single‑threaded nature. Instead, use SCAN:

SCAN 0 MATCH "prefix:*" COUNT 1000

SCAN is non‑blocking and returns a cursor; iterate until the cursor comes back as 0. It may yield duplicate keys, so perform client‑side deduplication. A full scan takes longer overall but does not halt the service.
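The cursor loop plus client-side deduplication can be sketched as follows; scanPage() here is a local stand-in for sending SCAN &lt;cursor&gt; MATCH "prefix:*" COUNT 1000 to Redis, with the pages hard-coded for illustration:

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch of a SCAN-style iteration: the server returns batches plus a
// cursor (0 means done), and because SCAN may return a key more than
// once, the client deduplicates with a Set.
class ScanAll {
    // simulated pages; note "prefix:a" appears twice, as real SCAN allows
    static final List<List<String>> pages = List.of(
            List.of("prefix:a", "prefix:b"),
            List.of("prefix:b", "prefix:c"),
            List.of("prefix:a"));

    // stand-in for one SCAN round trip: returns the batch at `cursor`
    static List<String> scanPage(int cursor) { return pages.get(cursor); }

    // next cursor, or 0 when the iteration is complete
    static int nextCursor(int cursor) {
        return cursor + 1 < pages.size() ? cursor + 1 : 0;
    }

    static Set<String> scanAll() {
        Set<String> keys = new LinkedHashSet<>(); // dedup on the client side
        int cursor = 0;
        do {
            keys.addAll(scanPage(cursor));        // each round trip is small
            cursor = nextCursor(cursor);
        } while (cursor != 0);                    // cursor 0 ends the scan
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(scanAll()); // [prefix:a, prefix:b, prefix:c]
    }
}
```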

Tags: Redis Caching Interview Data Structures High Availability

Posted on Sat, 16 May 2026 00:09:43 +0000 by dave420