Why Caching?
A cache stores copies of frequently-accessed data in fast storage (RAM) to avoid expensive re-computation or round trips to slower storage (DB, external API). Rule of thumb: if a read takes > 10ms and happens more than 100 times/second, cache it. Caching reduces DB load, cuts p99 latency, and enables horizontal scaling without scaling the database.
Cache Eviction Policies
- LRU (Least Recently Used): evict the item accessed least recently. Best for general-purpose caches where recently accessed items are more likely to be accessed again. Implemented with a doubly-linked list + hashmap (O(1) get/put). Used by Redis default.
- LFU (Least Frequently Used): evict the item accessed fewest times. Better than LRU when access frequency is a better predictor than recency (e.g., popular items stay in cache even if not accessed in the last minute). Harder to implement efficiently.
- FIFO: evict the oldest item. Simple but ignores access patterns.
- TTL (Time-to-Live): evict items after a fixed expiry. Used for data with a natural staleness bound (session tokens, DNS records, rate limit counters).
LRU Cache Implementation (LC 146)
from collections import OrderedDict
class LRUCache:
def __init__(self, capacity):
self.cap = capacity
self.cache = OrderedDict() # maintains insertion/access order
def get(self, key):
if key not in self.cache: return -1
self.cache.move_to_end(key) # mark as recently used
return self.cache[key]
def put(self, key, value):
if key in self.cache:
self.cache.move_to_end(key)
self.cache[key] = value
if len(self.cache) > self.cap:
self.cache.popitem(last=False) # evict least recently used (front)
Cache Write Policies
- Write-through: write to cache AND DB synchronously. Cache is always consistent with DB. Slower writes (wait for both). Good for read-heavy with important consistency (user profile).
- Write-behind (write-back): write to cache immediately, write to DB asynchronously. Faster writes. Risk: data loss if cache fails before DB write. Good for counters, analytics.
- Cache-aside (lazy loading): app reads from cache; on miss, reads from DB and populates cache. App writes directly to DB, invalidates/updates cache. Most flexible. Cache may serve stale data between write and invalidation.
- Read-through: cache sits in front of DB. On miss, cache fetches from DB. App only talks to cache. Simpler app logic; cache must understand the data model.
Cache Invalidation Strategies
Cache invalidation is notoriously hard. Options:
- TTL expiry: simplest. Cache is eventually consistent — staleness bounded by TTL. No coordination needed.
- Event-driven invalidation: on DB write, publish an invalidation event (Kafka); cache consumers delete the key. Near-real-time consistency. Requires event infrastructure.
- Write-through: update cache on every DB write. Always consistent, but every write touches the cache (acceptable for low write volume).
Best practice: combine TTL (safety net for missed events) with event-driven invalidation (near-real-time) for important caches.
Distributed Cache Architecture
App servers → Redis Cluster (consistent hashing across N nodes)
Each node: primary + 1 replica (Redis Sentinel / Redis Cluster)
Consistent hashing: add/remove nodes with minimal key redistribution
Consistent hashing places nodes on a virtual ring. Each key hashes to a position on the ring; it is stored on the next clockwise node. Adding a node only moves keys from the successor node. Removing a node moves its keys to the successor. Minimizes cache misses during cluster scaling.
Cache Stampede (Thundering Herd)
When a popular key expires, thousands of requests all miss the cache simultaneously and hammer the DB. Solutions: (1) Probabilistic early expiration: before the TTL expires, proactively refresh the cache with a small probability (increases with proximity to expiry). (2) Mutex lock on miss: only one request fetches from DB; others wait or return slightly stale data. (3) Background refresh: a background job refreshes popular keys before they expire.
Key Design Decisions
- LRU for general-purpose caches; TTL for time-bounded data
- Cache-aside for most web applications — flexible, easy to reason about
- Redis Cluster with consistent hashing for distributed horizontal scaling
- TTL + event-driven invalidation combined — TTL as backstop, events for freshness
- Mutex or probabilistic refresh to prevent cache stampede on popular keys
Uber system design covers distributed caching and Redis. See common questions for Uber interview: caching and Redis system design.
Amazon system design covers caching layers and distributed cache. Review patterns for Amazon interview: cache system design and DynamoDB.
Databricks system design covers caching and data access patterns. See design patterns for Databricks interview: caching and data layer design.