Caching is the first answer to scaling — but designing a correct, consistent caching layer requires careful thought about invalidation, eviction, failure modes, and hot key problems. This deep dive goes beyond “add Redis” to cover the actual design decisions.
Cache Strategies
"""
Cache-Aside (Lazy Loading) — most common:
Read: check cache → miss → read DB → write cache → return
Write: write DB → invalidate (or update) cache
Pros: only caches what is actually read; resilient (cache fails gracefully)
Cons: cold start misses; stale data risk; cache-DB inconsistency window
Write-Through:
Write: write cache → write DB (synchronously)
Pros: cache always fresh; read always hits
Cons: write latency (two writes); cache may store data never read
Write-Behind (Write-Back):
Write: write cache immediately → DB write is async (batched/delayed)
Pros: very low write latency; batching reduces DB load
Cons: data loss risk if cache crashes before DB write; complex implementation
Read-Through:
Cache is in front of DB; cache handles DB reads on miss (not application)
Application only talks to cache; cache fetches from DB on miss
Good when using a cache service that supports it (e.g., Amazon DynamoDB Accelerator, DAX)
Refresh-Ahead:
Proactively refresh entries before TTL expires (predict what will be needed)
Pros: no cold misses for popular items
Cons: may refresh items that won't be read again; complexity
"""
```python
import redis
import json
import time
from typing import Optional, Callable, TypeVar

T = TypeVar("T")

class CacheAside:
    def __init__(self, redis_client: redis.Redis, default_ttl: int = 3600):
        self.cache = redis_client
        self.default_ttl = default_ttl

    def get(self, key: str, fetch_fn: Callable[[], T], ttl: Optional[int] = None) -> T:
        """
        Cache-aside with dogpile (thundering herd) prevention.
        Uses a short lock so only one caller refreshes a missing key.
        """
        ttl = ttl or self.default_ttl
        raw = self.cache.get(key)
        if raw is not None:
            return json.loads(raw)
        # Cache miss: try to acquire a short lock before fetching
        lock_key = f"lock:{key}"
        acquired = self.cache.set(lock_key, "1", nx=True, ex=5)  # 5s lock
        if acquired:
            # We hold the lock: fetch and populate cache
            try:
                data = fetch_fn()
                self.cache.setex(key, ttl, json.dumps(data))
                return data
            finally:
                self.cache.delete(lock_key)
        else:
            # Another thread is fetching: wait briefly and retry
            time.sleep(0.05)
            raw = self.cache.get(key)
            if raw is not None:
                return json.loads(raw)
            return fetch_fn()  # Fallback: fetch without cache

    def invalidate(self, key: str):
        self.cache.delete(key)

    def invalidate_pattern(self, pattern: str):
        """Invalidate all keys matching a pattern. Use cautiously."""
        for key in self.cache.scan_iter(match=pattern, count=100):
            self.cache.delete(key)
```
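The cache-aside class handles the read path; write-through inverts the tradeoff by paying at write time. A minimal sketch for contrast — `db_write` is a hypothetical callable standing in for the real database layer, and the client only needs a `setex` method:

```python
import json

class WriteThroughCache:
    """Writes go to the DB and the cache synchronously, so reads stay fresh."""

    def __init__(self, redis_client, db_write, ttl: int = 3600):
        self.cache = redis_client  # anything with setex(key, ttl, value)
        self.db_write = db_write   # hypothetical callable: db_write(key, value)
        self.ttl = ttl

    def put(self, key: str, value):
        # DB first: if the DB write fails, the cache never holds data the
        # database doesn't; if the cache write fails, reads fall back to DB.
        self.db_write(key, value)
        self.cache.setex(key, self.ttl, json.dumps(value))
```

Note the ordering choice: writing the database before the cache trades a brief staleness window for never caching an unpersisted value.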
Redis vs Memcached
| Feature | Redis | Memcached |
|---|---|---|
| Data types | String, List, Set, Sorted Set, Hash, Stream, Bitmap, HLL | String only |
| Persistence | RDB snapshots + AOF log | None (memory only) |
| Replication | Primary-replica + Sentinel + Cluster | No built-in (client-side sharding) |
| Pub/Sub | Yes | No |
| Lua scripting | Yes (atomic multi-op) | No |
| Multi-threading | I/O multi-threaded (Redis 6+); commands single-threaded | Fully multi-threaded |
| Memory efficiency | Slightly less (overhead per key) | Better for simple string KV |
| Use when | Rich data structures, persistence, Lua atomicity | Pure caching, maximum throughput, simple values |
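The Lua row is worth making concrete: Redis runs a script atomically server-side, so a read-modify-write like a rate-limit counter needs no client-side locking — something Memcached cannot do. A sketch (key format and limits are illustrative; `client` is assumed to be a `redis.Redis` instance, whose `eval` signature is `eval(script, numkeys, *keys_and_args)`):

```python
# INCR and EXPIRE execute as one atomic unit on the server:
# no other command can interleave between them.
RATE_LIMIT_LUA = """
local current = redis.call('INCR', KEYS[1])
if current == 1 then
    redis.call('EXPIRE', KEYS[1], ARGV[1])
end
return current
"""

def allow_request(client, user_id: str, limit: int = 100, window_s: int = 60) -> bool:
    """Fixed-window rate limit: allow at most `limit` calls per window."""
    count = client.eval(RATE_LIMIT_LUA, 1, f"rate:{user_id}", window_s)
    return count <= limit
```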
Eviction Policies and Cache Sizing
"""
Redis eviction policies (set via maxmemory-policy):
noeviction: Return error when full. Use when losing data is unacceptable.
allkeys-lru: Evict least recently used key from all keys. Good default.
volatile-lru: Evict LRU key from keys with TTL set. For mixed TTL/non-TTL.
allkeys-lfu: Evict least frequently used. Better for Zipf-distributed access.
volatile-lfu: LFU from keys with TTL.
allkeys-random: Random eviction. Almost never right.
volatile-ttl: Evict key with shortest remaining TTL. Predictable expiry.
Choosing eviction policy:
Hot-key workload (few keys accessed often): LFU (keeps hot keys in cache)
General cache (mixed access): LRU
Bounded cache (fixed TTL keys): volatile-lru or volatile-ttl
Cache sizing formula:
Memory needed ≈ total_items * hit_ratio_target * avg_item_size
(size the cache to hold the active fraction that serves your target hit rate)
Example: 10M user profiles, avg 2KB each, target 80% hit rate
Active dataset = 10M * 0.80 = 8M profiles
Memory = 8M * 2KB = 16GB of cache
Rule of thumb: cache the working set (20% of data = 80% of reads, Pareto).
"""
```python
# LRU cache implementation (for interviews) — O(1) get and put
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.cap = capacity
        self.cache = OrderedDict()

    def get(self, key: int) -> int:
        if key not in self.cache:
            return -1
        self.cache.move_to_end(key)  # Mark as recently used
        return self.cache[key]

    def put(self, key: int, value: int):
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.cap:
            self.cache.popitem(last=False)  # Remove least recently used
```
```python
# LFU cache — O(1) using frequency buckets
from collections import OrderedDict, defaultdict

class LFUCache:
    def __init__(self, capacity: int):
        self.cap = capacity
        self.min_freq = 0
        self.key_val = {}   # key -> value
        self.key_freq = {}  # key -> frequency
        self.freq_keys = defaultdict(OrderedDict)  # freq -> {key: None} (insertion order)

    def get(self, key: int) -> int:
        if key not in self.key_val:
            return -1
        self._increment_freq(key)
        return self.key_val[key]

    def put(self, key: int, value: int):
        if self.cap <= 0:
            return
        if key in self.key_val:
            # Existing key: update value and bump frequency
            self.key_val[key] = value
            self._increment_freq(key)
            return
        if len(self.key_val) >= self.cap:
            # Evict LFU key; ties broken by LRU within the lowest-freq bucket
            evict_key, _ = self.freq_keys[self.min_freq].popitem(last=False)
            del self.key_val[evict_key]
            del self.key_freq[evict_key]
        self.key_val[key] = value
        self.key_freq[key] = 1
        self.freq_keys[1][key] = None
        self.min_freq = 1

    def _increment_freq(self, key: int):
        freq = self.key_freq[key]
        del self.freq_keys[freq][key]
        if not self.freq_keys[freq] and freq == self.min_freq:
            self.min_freq += 1
        self.key_freq[key] = freq + 1
        self.freq_keys[freq + 1][key] = None
```
Hot Key Problem and Solutions
"""
Hot key: a small number of cache keys receive disproportionate traffic.
Example: a viral tweet ID or celebrity user profile → 100k reads/sec
on a single Redis node that handles 50k ops/sec total.
Solutions:
1. Local in-process cache:
App server maintains a small (1000-key) in-memory cache.
Hot keys served from local cache → zero Redis calls.
Stale for up to a few seconds; fine for most use cases.
2. Read-through replicas:
Redis cluster with read replicas. Route hot key reads to replicas.
Load-balance reads: consistent hashing with virtual nodes.
3. Key sharding / fan-out:
Split hot key into N variants: user:123:shard:0 .. user:123:shard:4
Writes update all N shards. Reads pick random shard.
Reduces per-shard load by N.
4. Cache-local deduplication:
At L7 proxy / edge: coalesce concurrent identical requests.
Only one backend request per hot key per time window.
"""
```python
import json
import random
import threading
import time

class LocalL1Cache:
    """Process-local L1 cache to absorb hot-key traffic before hitting Redis."""

    def __init__(self, max_size: int = 1000, ttl_s: float = 1.0):
        self._cache = {}  # key -> (value, expires_at)
        self._lock = threading.Lock()
        self._max = max_size
        self._ttl = ttl_s

    def get(self, key: str):
        with self._lock:
            entry = self._cache.get(key)
            if entry and entry[1] > time.monotonic():
                return entry[0]
            return None

    def set(self, key: str, value):
        with self._lock:
            if len(self._cache) >= self._max:
                # Simple eviction: remove a random 10% of entries
                remove = random.sample(list(self._cache), self._max // 10)
                for k in remove:
                    del self._cache[k]
            self._cache[key] = (value, time.monotonic() + self._ttl)

class TwoLevelCache:
    """L1: local in-process. L2: Redis. Reduces hot-key pressure."""

    def __init__(self, redis_client, l1_ttl: float = 1.0):
        self.l1 = LocalL1Cache(ttl_s=l1_ttl)
        self.l2 = redis_client

    def get(self, key: str, fetch_fn, ttl: int = 3600):
        # L1 hit
        val = self.l1.get(key)
        if val is not None:
            return val
        # L2 hit
        raw = self.l2.get(key)
        if raw:
            val = json.loads(raw)
            self.l1.set(key, val)
            return val
        # Both miss: fetch from origin
        val = fetch_fn()
        self.l2.setex(key, ttl, json.dumps(val))
        self.l1.set(key, val)
        return val
```
Frequently Asked Questions
What is the difference between cache-aside and write-through caching?
Cache-aside (lazy loading): the application checks the cache first; on miss, reads from DB and populates the cache. Simple, resilient (cache failure degrades gracefully), but cold-start misses are expensive and data can become stale. Write-through: every write goes to the cache AND the database synchronously. Cache always has fresh data, no cold start for recently written keys, but write latency doubles and you cache data that may never be read. Cache-aside is more common for read-heavy workloads; write-through for read-after-write consistency requirements.
What is the cache stampede (thundering herd) problem and how do you prevent it?
Cache stampede occurs when a popular cached key expires and many concurrent requests simultaneously miss the cache, all hitting the database at once. Prevention strategies: (1) mutex/lock — only one request refreshes the cache, others wait; (2) probabilistic early expiration — probabilistically refresh before TTL expires (XFetch algorithm); (3) background refresh — a separate thread refreshes keys before they expire; (4) stale-while-revalidate — serve the stale value immediately while refreshing asynchronously. The mutex approach is simplest but creates a bottleneck; background refresh is best for high-traffic keys.
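Strategy (2), probabilistic early expiration, is compact enough to show. A sketch of the XFetch idea — refresh with a probability that rises as expiry approaches and as the recompute gets more expensive (`beta` tunes eagerness; 1.0 is the usual starting point):

```python
import math
import random
import time

def should_refresh_early(expires_at: float, recompute_cost_s: float,
                         beta: float = 1.0) -> bool:
    """XFetch-style check: True means refresh now, before the TTL expires.

    -ln(U) for U ~ Uniform(0, 1) is a positive jitter that is usually small
    but occasionally large, so refreshes are spread out instead of all
    callers stampeding at the exact moment of expiry.
    """
    jitter = recompute_cost_s * beta * -math.log(random.random())
    return time.time() + jitter >= expires_at
```

Each request that gets a cache hit runs this check; the rare winner recomputes while everyone else keeps serving the cached value, so the key (ideally) never actually expires under load.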
When should you use Redis vs Memcached?
Choose Redis when: you need data structures beyond strings (lists, sets, sorted sets, hashes, streams), persistence (RDB snapshots or AOF), pub/sub messaging, Lua scripting, transactions (MULTI/EXEC), or TTL per key with complex eviction policies. Choose Memcached when: your use case is pure key-value caching, you need simpler horizontal scaling with consistent hashing, or you want lower memory overhead per key. Redis is the default choice for most modern systems — its additional features rarely hurt and often help. Memcached has an edge only at very large scale with simple get/set workloads.