Caching is the single highest-leverage optimization in most system designs. Before you reach for database sharding or complex infrastructure, a well-placed cache can reduce database load by 90% and cut response times from hundreds of milliseconds to single digits. Interviewers know this, and they expect you to know it too.
The question usually comes after you’ve described a system: “Your database is getting hammered by reads. How do you scale it?” The right first answer is almost always a cache.
Strategy
Don’t just say “add a cache.” Walk through where the cache sits, what gets cached, how data gets in and out, and what happens when cache and database disagree. These are the four caching patterns. Know them by name.
The Four Caching Patterns
1. Cache-Aside (Lazy Loading)
The application code manages the cache directly. On a read:
- Check the cache. If hit → return data.
- If miss → query the database, store result in cache, return data.
```python
def get_user(user_id):
    # 1. Check cache
    user = cache.get(f"user:{user_id}")
    if user:
        return user
    # 2. Cache miss — go to DB
    user = db.query("SELECT * FROM users WHERE id = ?", user_id)
    # 3. Populate cache for next time
    cache.set(f"user:{user_id}", user, ttl=3600)
    return user
```
Pros: Only caches what’s actually requested. Cache failures are non-fatal — the app falls back to the DB gracefully. Works well for read-heavy workloads with uneven access patterns.
Cons: First request after a cache miss (or expiry) is slow. Under high load, many simultaneous cache misses for the same key cause a thundering herd — dozens of requests all hitting the DB simultaneously. Fix with cache stampede protection (mutex lock or probabilistic early expiration).
When to use: Most web applications. User profiles, product catalogs, content feeds — anything where data is read far more than written.
2. Write-Through
Every write goes to the cache and the database synchronously, in the same operation.
```python
def update_user(user_id, data):
    # Write to DB first
    db.execute("UPDATE users SET ... WHERE id = ?", user_id, data)
    # Then update cache
    cache.set(f"user:{user_id}", data, ttl=3600)
```
Pros: Cache is always warm and consistent with the DB. No cache misses after writes. Good for read-heavy workloads where reads should always hit a warm cache.
Cons: Every write pays double latency (DB + cache). You cache data that may never be read again (write-heavy, read-light data wastes cache space). Use TTLs to evict stale entries.
When to use: Systems where data consistency between cache and DB is critical and you can afford slightly slower writes. User preference settings, configuration data.
3. Write-Behind (Write-Back)
Write to the cache immediately, acknowledge the write to the client, and asynchronously flush to the database later.
```python
def update_user(user_id, data):
    cache.set(f"user:{user_id}", data)                    # fast, synchronous
    write_queue.push({"user_id": user_id, "data": data})  # async
    return "OK"

# Background worker
def flush_worker():
    while True:
        item = write_queue.pop()
        db.execute("UPDATE users SET ... WHERE id = ?", item["user_id"], item["data"])
```
Pros: Extremely fast writes — the client doesn’t wait for the DB. Great for write-heavy workloads (gaming leaderboards, analytics counters, IoT sensor data).
Cons: Data loss risk — if the cache crashes before flushing, writes are lost. The async queue is a new failure point. Harder to implement correctly.
When to use: High-throughput write scenarios where occasional data loss is acceptable, or where you batch many small writes into fewer large DB writes (counter aggregation, view counts).
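The batching payoff mentioned above can be sketched as follows. This is a minimal illustration, not a production queue: an in-memory dict stands in for the cache, and the hypothetical `db_execute` callable stands in for a real DB client.

```python
from collections import defaultdict

class BatchedCounter:
    """Write-behind counter aggregation: absorb many small increments
    in memory, then flush one DB write per key."""
    def __init__(self, db_execute):
        self.pending = defaultdict(int)  # key -> accumulated delta
        self.db_execute = db_execute

    def incr(self, key, delta=1):
        # Fast path: bump the in-memory counter only.
        self.pending[key] += delta

    def flush(self):
        # One DB write per key, no matter how many increments arrived.
        batch, self.pending = self.pending, defaultdict(int)
        for key, delta in batch.items():
            self.db_execute(key, delta)
        return len(batch)
```

A hundred `incr` calls for one key become a single DB write on `flush` — exactly the view-count aggregation case. The trade-off is the same as any write-behind design: increments buffered between flushes are lost if the process dies.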
4. Read-Through
The cache sits in front of the database as a transparent proxy. The application only talks to the cache; the cache fetches from the DB on a miss automatically.
```python
# Application code — clean, no DB logic
user = cache.get(f"user:{user_id}")
# Cache handles the miss internally, populates itself, returns data
```
Pros: Cleaner application code — no explicit cache management logic. Same behavior as cache-aside but encapsulated in the cache layer.
Cons: First request for any key is always slow (cold start). Requires a caching layer that supports read-through (Redis doesn't natively; you'd build a thin wrapper library, or use a cache that does, such as DynamoDB Accelerator). Harder to debug cache misses.
When to use: When you want to abstract caching completely from application code. Common in ORM-level caching (Hibernate second-level cache, Django’s cache framework).
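The "encapsulated in the cache layer" idea is easy to see in a wrapper. A minimal sketch, using a plain dict as the store and a hypothetical `loader` callable as the backing DB query (a real implementation would also handle TTLs and serialization):

```python
class ReadThroughCache:
    """Minimal read-through cache: the application only calls get();
    the cache itself fetches from the backing store on a miss."""
    def __init__(self, loader):
        self.loader = loader  # e.g. a function that queries the DB
        self.store = {}

    def get(self, key):
        if key not in self.store:
            # Miss: the cache, not the application, talks to the DB.
            self.store[key] = self.loader(key)
        return self.store[key]
```

From the application's point of view this looks identical to the one-liner above: it calls `get` and never sees the DB.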
Cache Eviction Policies
When the cache is full, something must go. The policy determines what:
- LRU (Least Recently Used): Evict the key that hasn’t been accessed for the longest time. Default for most caches. Best general-purpose choice.
- LFU (Least Frequently Used): Evict the key accessed the fewest times. Better when some keys are accessed constantly (hot keys) and others are accessed in bursts but then forgotten.
- FIFO: Evict the oldest-inserted key regardless of access. Simple but rarely optimal.
- Random: Evict a random key. Surprisingly decent in practice; avoids the overhead of tracking access order.
- TTL-based: Keys expire after a set time. Not an eviction policy per se, but layered on top of LRU/LFU in production Redis deployments.
Redis's default maxmemory-policy is noeviction (writes fail with an error once maxmemory is reached); you almost always want to set this to allkeys-lru or volatile-lru in production.
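LRU is simple enough to sketch in a few lines, which interviewers sometimes ask for directly. A toy version using Python's OrderedDict (real caches use approximated LRU for performance, as Redis does):

```python
from collections import OrderedDict

class LRUCache:
    """Toy LRU cache: recently used keys move to the end of the
    ordered dict; eviction pops from the front (least recent)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used
```

Both `get` and `set` are O(1); that's why LRU is the default nearly everywhere.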
Redis vs. Memcached
This is a common follow-up question:
| | Redis | Memcached |
|---|---|---|
| Data types | Strings, hashes, lists, sets, sorted sets, bitmaps, streams | Strings only |
| Persistence | Optional (RDB snapshots, AOF logs) | None |
| Replication | Yes (primary-replica, Sentinel, Cluster) | No native replication |
| Pub/Sub | Yes | No |
| Lua scripting | Yes | No |
| Multi-threading | Single-threaded command execution (I/O threads since Redis 6) | Multi-threaded |
| Use case | General-purpose, session store, pub/sub, leaderboards, rate limiting | Pure cache, high-throughput simple key-value |
Choose Redis when you need persistence, replication, rich data structures (sorted sets for leaderboards, pub/sub for notifications), or atomic operations.
Choose Memcached when you have a pure caching workload at very high throughput and want the simplicity of a multi-threaded daemon with no persistence overhead. In 2026, most teams default to Redis.
Cache Problems You Must Know
Cache Stampede (Thundering Herd): A popular key expires; hundreds of simultaneous requests all miss the cache and hammer the DB. Fix: use a mutex (only one thread refills the cache), or probabilistic early expiration (start refreshing before the TTL expires, randomly).
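The mutex fix can be sketched as below. This assumes a single process and uses Python's threading.Lock with an in-memory dict standing in for Redis; in a distributed setup you'd use a distributed lock (e.g. Redis SET NX with an expiry) instead.

```python
import threading

_locks = {}
_locks_guard = threading.Lock()
cache = {}  # in-memory stand-in for Redis

def get_with_lock(key, load_from_db):
    value = cache.get(key)
    if value is not None:
        return value
    # One lock per key: only one thread refills; the rest wait.
    with _locks_guard:
        lock = _locks.setdefault(key, threading.Lock())
    with lock:
        value = cache.get(key)  # re-check: another thread may have refilled
        if value is None:
            value = load_from_db()
            cache[key] = value  # a real cache would also set a TTL here
        return value
```

The re-check inside the lock is the important line: without it, every waiting thread would still hit the DB one after another once the lock was released.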
Cache Penetration: Requests for keys that don't exist in the cache or the DB (often malicious). Every miss hits the DB. Fix: cache negative results (cache.set("user:9999", None, ttl=60)), or use a Bloom filter to reject clearly nonexistent keys before they reach the cache.
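Negative caching needs one subtlety: distinguishing "cached as not found" from "not cached at all". A minimal sketch, using a sentinel value and a dict with expiry timestamps as a stand-in for Redis (the hypothetical `db_query` returns None for missing rows):

```python
import time

cache = {}          # key -> (value, expires_at); stand-in for Redis
MISSING = object()  # sentinel: "we checked, the row does not exist"
NEGATIVE_TTL = 60   # keep "not found" markers short-lived

def get_user(user_id, db_query):
    key = f"user:{user_id}"
    entry = cache.get(key)
    if entry and entry[1] > time.time():
        value = entry[0]
        return None if value is MISSING else value
    row = db_query(user_id)  # None if the user does not exist
    if row is None:
        # Cache the negative result so repeated probes skip the DB.
        cache[key] = (MISSING, time.time() + NEGATIVE_TTL)
        return None
    cache[key] = (row, time.time() + 3600)
    return row
```

The short negative TTL matters: if the row is later created, the stale "not found" marker ages out within a minute.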
Cache Avalanche: Many cache keys expire simultaneously (e.g., you just deployed and set all TTLs to 3600s at the same second). The DB gets flooded. Fix: add jitter to TTLs (ttl = 3600 + random.randint(-300, 300)).
Hot Key Problem: One key is so popular that a single cache shard is overwhelmed. Fix: replicate the hot key across multiple shards with a suffix (user:1:0, user:1:1, …), read from a random replica.
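The suffix trick is a few lines of code. A sketch with hypothetical helper names and a plain dict standing in for a sharded cache cluster (in a real cluster, the suffixed keys hash to different shards, which is the whole point):

```python
import random

N_REPLICAS = 4  # tune to how hot the key is

def hot_key_write(cache, key, value):
    # Write the value under every replica suffix so any copy is valid.
    for i in range(N_REPLICAS):
        cache[f"{key}:{i}"] = value

def hot_key_read(cache, key):
    # Spread reads across replicas; each suffixed key lands on a
    # different shard when keys are hash-partitioned.
    return cache.get(f"{key}:{random.randrange(N_REPLICAS)}")
```

The cost is N writes per update and N copies in memory, which is why you only do this for genuinely hot keys, not everything.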
What to Cache and What Not To
Cache: read-heavy data that changes infrequently — user profiles, product details, rendered HTML, API responses, session tokens, computed aggregates.
Don’t cache: data that must be strongly consistent in real time (bank balances, inventory for flash sales), data too large to fit in memory, data accessed so rarely that a cache miss is acceptable.
Summary
Cache-aside is the default for most applications: lazy, resilient, and easy to reason about. Write-through keeps the cache hot at the cost of slower writes. Write-behind maximizes write throughput at the cost of durability. Read-through abstracts caching from the application. LRU eviction and Redis are the industry defaults. The three failure modes to know — stampede, penetration, and avalanche — each have standard fixes. State all of this confidently in an interview and you’ll stand out from candidates who just say “add Redis.”
Related System Design Topics
Caching works alongside other distributed systems components:
- Consistent Hashing — distributed caches use consistent hashing to assign keys to cache nodes, making it easy to add capacity without invalidating everything.
- Database Sharding — caching should always be evaluated before sharding; many systems that look like they need sharding just need a better caching strategy.
- Load Balancing — cache servers are typically deployed behind a load balancer; the LB must route the same key to the same cache server (consistent hashing or IP hash) to maximize cache hit rate.
- Message Queues — cache invalidation at scale is often handled asynchronously via a message queue: a write event triggers a queue message that tells all cache nodes to evict the affected key.
Also see: API Design (REST vs GraphQL vs gRPC) and SQL vs NoSQL — the remaining two system design foundations.
See also: Design Search Autocomplete — Redis caching of top-K suggestions per prefix, and Design a Ride-sharing App — caching driver locations and ETA in Redis.
See also: Design a Proximity Service — geohash-keyed result caching for nearby search, and Design a Hotel / Airbnb Reservation System — availability calendar caching with write-invalidation on booking.
See also: Design a CDN — CDN edge nodes are the outermost cache tier; understanding CDN caching behavior is essential for setting correct Cache-Control headers at the origin.
See also: Design a Monitoring & Alerting System — Prometheus recording rules as precomputed metric aggregates; same pattern as materialized view caching.