Cache System Low-Level Design (LRU, Write Policies, Redis Cluster)

Why Caching?

A cache stores copies of frequently accessed data in fast storage (RAM) to avoid expensive re-computation or round trips to slower storage (DB, external API). Rule of thumb: if a read takes > 10ms and happens more than 100 times/second, cache it. Caching reduces DB load, cuts p99 latency, and lets the application tier scale horizontally without scaling the database.

Cache Eviction Policies

  • LRU (Least Recently Used): evict the item accessed least recently. Best for general-purpose caches where recently accessed items are likely to be accessed again. Implemented with a doubly-linked list + hashmap (O(1) get/put). Redis uses an approximated LRU (random key sampling) for its *-lru eviction policies.
  • LFU (Least Frequently Used): evict the item accessed fewest times. Better than LRU when access frequency is a better predictor than recency (e.g., popular items stay in cache even if not accessed in the last minute). Harder to implement efficiently.
  • FIFO: evict the oldest item. Simple but ignores access patterns.
  • TTL (Time-to-Live): evict items after a fixed expiry. Used for data with a natural staleness bound (session tokens, DNS records, rate limit counters).

LRU Cache Implementation (LC 146)

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.cap = capacity
        self.cache = OrderedDict()  # maintains insertion/access order

    def get(self, key):
        if key not in self.cache: return -1
        self.cache.move_to_end(key)  # mark as recently used
        return self.cache[key]

    def put(self, key, value):
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.cap:
            self.cache.popitem(last=False)  # evict least recently used (front)
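
For contrast with the LRU code above: a minimal O(1) LFU sketch, using the standard frequency-bucket design with LRU tie-breaking inside each bucket. Illustrative, not production code.

```python
from collections import defaultdict, OrderedDict

class LFUCache:
    def __init__(self, capacity):
        self.cap = capacity
        self.vals = {}                            # key -> value
        self.freq = {}                            # key -> access count
        self.buckets = defaultdict(OrderedDict)   # count -> keys in LRU order
        self.min_freq = 0

    def _touch(self, key):
        """Promote key to the next frequency bucket."""
        f = self.freq[key]
        del self.buckets[f][key]
        if not self.buckets[f]:
            del self.buckets[f]
            if self.min_freq == f:
                self.min_freq = f + 1
        self.freq[key] = f + 1
        self.buckets[f + 1][key] = None

    def get(self, key):
        if key not in self.vals:
            return -1
        self._touch(key)
        return self.vals[key]

    def put(self, key, value):
        if self.cap <= 0:
            return
        if key in self.vals:
            self.vals[key] = value
            self._touch(key)
            return
        if len(self.vals) >= self.cap:
            # Evict the least recently used key among the least frequent.
            evict, _ = self.buckets[self.min_freq].popitem(last=False)
            if not self.buckets[self.min_freq]:
                del self.buckets[self.min_freq]
            del self.vals[evict]
            del self.freq[evict]
        self.vals[key] = value
        self.freq[key] = 1
        self.buckets[1][key] = None
        self.min_freq = 1
```

Every operation touches only hashmaps and the tail/head of an OrderedDict, so get and put stay O(1).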

Cache Write Policies

  • Write-through: write to cache AND DB synchronously. Cache is always consistent with DB. Slower writes (wait for both). Good for read-heavy with important consistency (user profile).
  • Write-behind (write-back): write to cache immediately, write to DB asynchronously. Faster writes. Risk: data loss if cache fails before DB write. Good for counters, analytics.
  • Cache-aside (lazy loading): app reads from cache; on miss, reads from DB and populates cache. App writes directly to DB, invalidates/updates cache. Most flexible. Cache may serve stale data between write and invalidation.
  • Read-through: cache sits in front of DB. On miss, cache fetches from DB. App only talks to cache. Simpler app logic; cache must understand the data model.

Cache Invalidation Strategies

Cache invalidation is notoriously hard. Options:

  1. TTL expiry: simplest. Cache is eventually consistent — staleness bounded by TTL. No coordination needed.
  2. Event-driven invalidation: on DB write, publish an invalidation event (Kafka); cache consumers delete the key. Near-real-time consistency. Requires event infrastructure.
  3. Write-through: update cache on every DB write. Always consistent, but every write touches the cache (acceptable for low write volume).

Best practice: combine TTL (safety net for missed events) with event-driven invalidation (near-real-time) for important caches.
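
A sketch of the combined approach. The 10% jitter figure and the shape of the invalidation event are illustrative assumptions, and a plain dict stands in for the cache client:

```python
import random

BASE_TTL = 300
cache = {}

def jittered_ttl(base=BASE_TTL, jitter=0.10):
    """TTL with +/-10% random jitter, so keys written together
    don't all expire in the same instant."""
    return base * (1 + random.uniform(-jitter, jitter))

def on_invalidation(event):
    """Event-driven path: a DB writer published {'key': ...}; drop the key.
    The TTL remains the backstop if this event is ever lost."""
    cache.pop(event["key"], None)   # idempotent: missing key is fine
```

In a real system on_invalidation would be a Kafka consumer callback; the TTL ceiling bounds staleness even when the event pipeline drops a message.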

Distributed Cache Architecture

App servers → Redis Cluster (keys partitioned across N nodes; Redis Cluster itself uses 16,384 fixed hash slots, a close cousin of consistent hashing)
              Each node: primary + 1 replica (failover via Redis Sentinel or Redis Cluster)
              Consistent hashing: add/remove nodes with minimal key redistribution

Consistent hashing places nodes on a virtual ring. Each key hashes to a position on the ring and is stored on the next node clockwise. Adding a node moves only the keys between it and its predecessor (taken from its successor); removing a node moves only its keys to the successor. This minimizes cache misses during cluster scaling.
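
The ring can be sketched with Python's bisect module. HashRing, the vnode count, and the md5 choice are illustrative (and, as noted above, Redis Cluster actually uses fixed hash slots rather than a ring):

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring with virtual nodes."""
    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self.ring = []                  # sorted list of (hash, node)
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        # Each physical node appears at vnodes positions for even load.
        for i in range(self.vnodes):
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self.ring = [(h, n) for h, n in self.ring if n != node]

    def get_node(self, key):
        """A key is owned by the first vnode clockwise from its hash."""
        idx = bisect.bisect_left(self.ring, (self._hash(key),))
        if idx == len(self.ring):
            idx = 0                     # wrap around the ring
        return self.ring[idx][1]
```

Removing a node leaves every key owned by a surviving node exactly where it was, which is the whole point versus key % N.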

Cache Stampede (Thundering Herd)

When a popular key expires, thousands of requests all miss the cache simultaneously and hammer the DB. Solutions:

  • Probabilistic early expiration: before the TTL expires, proactively refresh the cache with a small probability that increases as expiry approaches.
  • Mutex lock on miss: only one request fetches from the DB; others wait or return slightly stale data.
  • Background refresh: a background job refreshes popular keys before they expire.
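
Probabilistic early expiration has a neat closed form (the XFetch formula from the "Optimal Probabilistic Cache Stampede Prevention" paper). A sketch, where recompute_cost is the measured time to rebuild the value and beta = 1.0 is the usual default:

```python
import math
import random
import time

def should_refresh_early(expires_at, recompute_cost, beta=1.0, now=None):
    """XFetch: refresh probabilistically before expiry. The exponential
    term makes a refresh very unlikely far from expiry and near-certain
    just before it; beta > 1 shifts refreshes earlier."""
    now = time.time() if now is None else now
    # 1 - random.random() is in (0, 1], so the log is always defined (<= 0).
    return now - recompute_cost * beta * math.log(1.0 - random.random()) >= expires_at
```

On a cache hit the caller checks should_refresh_early; if it fires, that one request recomputes and rewrites the key while everyone else keeps serving the still-valid value, so the expiry never arrives for hot keys.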

Key Design Decisions

  • LRU for general-purpose caches; TTL for time-bounded data
  • Cache-aside for most web applications — flexible, easy to reason about
  • Redis Cluster with consistent hashing for distributed horizontal scaling
  • TTL + event-driven invalidation combined — TTL as backstop, events for freshness
  • Mutex or probabilistic refresh to prevent cache stampede on popular keys
