Cache invalidation — ensuring cached data reflects the current state of the source of truth — is famously difficult. Phil Karlton’s quip (“There are only two hard things in computer science: cache invalidation and naming things”) captures a real engineering challenge. A stale cache causes incorrect behavior; overly aggressive invalidation forfeits the performance benefits of caching. Understanding write-through, write-around, write-back, and invalidation strategies is essential for system design interviews.
Cache Write Strategies
Write-through: every write updates both the cache and the database synchronously before returning success. Reads always hit a warm cache, and the cache is always in sync with the database. Downside: write latency equals database write latency (no write performance gain), and every write hits both cache and database even if the data is never read (write amplification). Use when data is frequently read after write and consistency is critical.

Write-around (write to database, invalidate cache): the write goes directly to the database and the cache entry is deleted (not updated). The next read fetches from the database and repopulates the cache. Advantage: avoids caching data that won’t be read again (cache pollution). Disadvantage: the first read after a write is a cache miss (higher latency). Use for write-heavy workloads with infrequent reads of the same data.

Write-back (write-behind): the write updates the cache only; the database is updated asynchronously in the background. Lowest write latency (cache writes are fast in-memory operations). Risk: if the cache node fails before the async write to the database, data is lost. Use when write throughput is critical and some data loss is acceptable (analytics counters, view counts).
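The three strategies can be contrasted in a minimal sketch, using plain dicts as stand-ins for Redis and the database (the function names are illustrative, not a real API):

```python
cache = {}
database = {}
dirty = set()  # keys written to cache but not yet flushed (write-back)

def write_through(key, value):
    """Update cache and database synchronously; cache stays in sync."""
    database[key] = value
    cache[key] = value

def write_around(key, value):
    """Write the database and invalidate (delete) the cache entry."""
    database[key] = value
    cache.pop(key, None)  # next read repopulates from the database

def write_back(key, value):
    """Update the cache only; mark the key dirty for an async flush."""
    cache[key] = value
    dirty.add(key)

def flush_dirty():
    """Background flush: persist dirty cache entries to the database."""
    for key in list(dirty):
        database[key] = cache[key]
        dirty.discard(key)

write_through("a", 1)   # both stores hold a=1
write_around("a", 2)    # database holds a=2; cache entry deleted
write_back("b", 3)      # cache holds b=3; database doesn't, until...
flush_dirty()           # ...the background flush runs
```

Note the write-back risk in the sketch: if the process died between `write_back` and `flush_dirty`, the value of `b` would be lost.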
Cache Invalidation Patterns
Time-based expiry (TTL): set a TTL on every cached entry. After the TTL expires, the next read fetches fresh data from the source. This is the simplest approach — no invalidation logic needed. Trade-off: data can be stale for up to the TTL duration, and cache misses at expiry time may spike (thundering herd). Choose the TTL based on acceptable staleness: user profiles (5 minutes), product prices (1 minute), stock levels (10 seconds).

Event-based invalidation: when data changes in the database, explicitly delete or update the cache entry. Implementation: the service layer that writes to the database also calls Redis DEL or SET after the database write succeeds. More complex, but consistency is immediate. Risk: if the cache deletion fails (Redis unavailable), the cache stays stale until the TTL expires — use short TTLs as a safety net.

Change Data Capture (CDC) invalidation: a CDC pipeline (e.g., Debezium) reads database changes and publishes them to Kafka. A cache invalidation consumer subscribes and deletes the corresponding cache keys. This decouples invalidation from the write path — the application doesn’t need to know which cache keys to invalidate.
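TTL expiry and event-based invalidation combine naturally: the delete gives immediate consistency, and the TTL bounds staleness if a delete is ever lost. A minimal sketch, with a dict standing in for the database and a `(value, expires_at)` dict standing in for the cache (the 50 ms TTL is just for demonstration):

```python
import time

store = {"user:1": {"name": "Ada"}}   # stand-in for the database
cache = {}                            # key -> (value, expires_at)
TTL = 0.05                            # demo value; pick from your staleness budget

def read(key):
    entry = cache.get(key)
    if entry is not None and entry[1] > time.monotonic():
        return entry[0]                               # cache hit, not expired
    value = store[key]                                # miss or expired: source of truth
    cache[key] = (value, time.monotonic() + TTL)      # repopulate with fresh TTL
    return value

def write(key, value):
    store[key] = value        # database write first
    cache.pop(key, None)      # event-based invalidation after it succeeds

read("user:1")                          # populates the cache
write("user:1", {"name": "Grace"})      # DB updated, cache entry deleted
result = read("user:1")                 # miss -> fresh value from the store
```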
Cache Key Design
Cache key design determines granularity and invalidation complexity.

Entity-level cache: cache entire objects (user:{user_id} → full user JSON). Invalidation: DEL user:{user_id} on any user update. Simple but coarse — a comment count change invalidates the entire user object.

Field-level cache: cache individual fields (user:{user_id}:name, user:{user_id}:email). Invalidation is granular — only the specific field that changed is invalidated — but there are more Redis keys to manage.

Aggregation cache: cache computed results (top_products_by_sales_{category}_{date}). Invalidation is complex — any order change could affect the aggregation. Use short TTLs (5-10 minutes) and accept slight staleness rather than attempting immediate invalidation.

Cache key versioning: prefix all cache keys with a version number (v1:user:123). To invalidate all cached data, increment the global version number (v2:). All v1 keys become unreachable — effectively a cache flush without FLUSHDB. Old keys expire via TTL.

Namespace invalidation: tag cache entries with a namespace (user:*, product:*). To invalidate all user data: SCAN for keys matching user:* and DEL them. Use Lua scripts for atomic scan-and-delete (SCAN + DEL is not atomic in Redis).
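Key versioning is simple enough to sketch in a few lines. A dict stands in for Redis, and the helper names are illustrative; the point is that bumping the version prefix makes every old key unreachable without touching it:

```python
cache = {}
version = 1   # in production this would itself live in Redis or config

def versioned(key):
    """Prepend the current global version to a logical key."""
    return f"v{version}:{key}"

def cache_set(key, value):
    cache[versioned(key)] = value

def cache_get(key):
    return cache.get(versioned(key))

cache_set("user:123", {"name": "Ada"})
hit = cache_get("user:123")       # found under v1:user:123

version += 1                      # global invalidation: bump the version
miss = cache_get("user:123")      # lookup now targets v2:user:123 -> None
```

The old `v1:user:123` entry still occupies memory until its TTL expires, which is the trade-off this pattern accepts in exchange for avoiding FLUSHDB.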
Thundering Herd and Cache Stampede
When a popular cache entry expires, all requests see a miss simultaneously and all go to the database, causing a spike that can overwhelm it. Thundering herd solutions:

(1) Mutex/lock: only the first request to see the miss acquires a lock and fetches from the database. Other requests wait for the lock, then read the freshly populated cache. Implementation: SET lock:{key} 1 NX PX 5000 — only the winner (NX = set if not exists) fetches from the database; waiters sleep and retry reading the cache.

(2) Probabilistic early expiration: before the TTL expires, some requests probabilistically begin refreshing the cache. As expiry approaches, more requests trigger a refresh, spreading the database load rather than producing a simultaneous stampede. One common formulation: refresh early when current_time - delta * beta * log(rand()) >= expiry_time, where delta is the duration of the last recomputation, beta > 0 tunes eagerness, and rand() is uniform in (0, 1]; log(rand()) is negative, so the subtracted term pushes the effective time forward, making an early refresh increasingly likely as expiry nears.

(3) Background refresh: cache entries never actually expire — a background job proactively refreshes entries before they would. The application always reads from cache; the background job keeps it warm. Requires knowing in advance which cache entries to refresh (not always feasible).
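The early-refresh test from (2) fits in one function. This is a sketch of that inequality only, not a full cache client; `should_refresh` is a hypothetical helper name:

```python
import math
import random

def should_refresh(now, expiry_time, delta, beta=1.0):
    """Return True if this request should refresh the entry early.

    now         -- current time (seconds)
    expiry_time -- when the cached entry's TTL runs out
    delta       -- how long the last recomputation took
    beta        -- > 0; larger values refresh more eagerly
    """
    u = 1.0 - random.random()          # uniform in (0, 1], avoids log(0)
    return now - delta * beta * math.log(u) >= expiry_time

# Long before expiry with delta=0 the subtracted term vanishes, so the
# refresh never fires; past expiry the inequality always holds.
early = should_refresh(now=0, expiry_time=100, delta=0)      # always False
late = should_refresh(now=101, expiry_time=100, delta=1)     # always True
```

Because `-log(u)` is exponentially distributed, most requests refresh only very close to expiry, and a costly recomputation (large `delta`) starts earlier — exactly the load-spreading behavior described above.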
Distributed Cache Consistency
With multiple cache nodes (Redis Cluster, a Memcached cluster), maintaining consistency across shards adds complexity.

Look-aside cache with write-invalidate: (1) Read: check the cache first; on a miss, read from the database and populate the cache. (2) Write: write to the database, then DELETE the cache key (not update it). The next read repopulates from the fresh database value. Why delete instead of update: updating the cache after a database write has a race condition — two concurrent writers may store different values, and the last writer wins, potentially overwriting the correct value with a stale one. Deleting is safe — the next reader always fetches from the authoritative database.

Read-replica race condition: reader A reads from a replica (which may be behind) and writes that value to the cache, while writer B updates the primary database; reader A’s cache entry is now stale. Mitigations: on write, invalidate all cache entries for the affected keys; use a short TTL as a safety net for any invalidation failures; and read from the primary for consistency-sensitive reads (e.g., immediately after a write), at the cost of higher database load.
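The delete-vs-update race can be shown concretely. Below, one unlucky interleaving of two writers is played out by hand with dicts (a simulation of the schedule, not concurrent code): with cache-update the stale value wins, with cache-delete it cannot.

```python
database = {}
cache = {}

# Interleaving with cache-UPDATE after the DB write. Writer A and writer B
# both write the DB; B's value is newer, but A's cache update is delayed
# and lands last.
database["k"] = 1      # writer A writes the DB
database["k"] = 2      # writer B writes the DB (newer value)
cache["k"] = 2         # writer B updates the cache
cache["k"] = 1         # writer A's delayed update overwrites it: stale!
update_race = (cache["k"] != database["k"])   # cache disagrees with the DB

# Same interleaving with cache-DELETE. A delayed delete is harmless:
# the key is simply gone, and the next read falls through to the DB.
cache.clear()
database["k"] = 1
database["k"] = 2
cache.pop("k", None)   # writer B deletes
cache.pop("k", None)   # writer A's delayed delete: no-op
next_read = cache.get("k", database["k"])     # look-aside read path
```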