Question 1

What is the staleness window in stale-while-revalidate?

Accepted Answer

The staleness window is the interval between the expiry of a cache entry's fresh TTL and the expiry of its stale TTL, so its duration equals stale_ttl minus fresh_ttl. For example, a 60-second fresh TTL and a 300-second stale TTL give a 240-second window. During this window, the cache serves the stale value immediately and triggers an asynchronous refresh. Once the stale TTL has also expired, the next request blocks on a synchronous DB load.
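As a rough illustration, the three states of an entry can be derived from its age. This is a minimal sketch that assumes both TTLs are measured from load time. The names `State`, `classify`, and `staleness_window` are hypothetical and not part of any particular cache library.

```python
from enum import Enum


class State(Enum):
    FRESH = "fresh"      # serve from cache, no refresh needed
    STALE = "stale"      # serve from cache and trigger an async refresh
    EXPIRED = "expired"  # block on a synchronous load


def classify(age_s: float, fresh_ttl: float, stale_ttl: float) -> State:
    """Classify an entry by age; both TTLs are assumed to start at load time."""
    if fresh_ttl > stale_ttl:
        raise ValueError("fresh_ttl must not exceed stale_ttl")
    if age_s < fresh_ttl:
        return State.FRESH
    if age_s < stale_ttl:
        return State.STALE
    return State.EXPIRED


def staleness_window(fresh_ttl: float, stale_ttl: float) -> float:
    """Duration of the window in which stale values may be served."""
    return stale_ttl - fresh_ttl
```

With fresh_ttl=60 and stale_ttl=300, an entry aged 120 seconds classifies as stale, and the window is 240 seconds long.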

Question 2

What happens if the cache loader fails on a miss?

Accepted Answer

If the loader function throws an exception, the cache should not populate an entry and should propagate the error to the caller. If the loader returns null because the key does not exist, the cache can optionally store a negative cache entry (a null marker with a short TTL) so that repeated lookups for the missing key do not each hit the DB. Retries of failed loads should use exponential backoff.
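The behaviour described above might look like the following in-process sketch. It assumes the loader signals a missing key by returning None. The class `NegativeCachingCache`, the exception `KeyNotFound`, and the parameter names are made up for illustration, and retry with backoff is left out.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

_NEGATIVE = object()  # sentinel stored for keys the loader reported as missing


class KeyNotFound(Exception):
    """Raised when the loader returned None or a negative entry is still live."""


class NegativeCachingCache:
    def __init__(
        self,
        loader: Callable[[str], Optional[Any]],
        ttl_s: float = 60.0,
        negative_ttl_s: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl_s = ttl_s
        self._negative_ttl_s = negative_ttl_s
        self._clock = clock
        # key -> (value or _NEGATIVE, expires_at)
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            if entry[0] is _NEGATIVE:
                raise KeyNotFound(key)  # answered without a DB query
            return entry[0]
        # If the loader raises, the exception propagates and nothing is stored.
        value = self._loader(key)
        if value is None:
            self._entries[key] = (_NEGATIVE, now + self._negative_ttl_s)
            raise KeyNotFound(key)
        self._entries[key] = (value, now + self._ttl_s)
        return value
```

The short negative TTL is a trade-off: a longer value shields the DB better, but a key created after the lookup stays invisible until the negative entry expires.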

Question 3

How does version-based invalidation work in a read-through cache?

Accepted Answer

Each cache entry stores the version_id of the DB row at load time. When the DB row is updated, the new version is broadcast (via pub/sub or a direct call to the cache). The cache compares the broadcast version with its stored version and evicts the entry if the stored one is older. The next read triggers the loader to fetch the current row. Updates therefore become visible without waiting for a TTL to expire, while unaffected entries stay cached.
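A minimal sketch of this flow, assuming version_id is a monotonically increasing integer and that the loader returns a (value, version_id) pair. The class `VersionedCache` and the method `on_row_updated` are hypothetical names, and the pub/sub transport itself is not shown.

```python
from typing import Any, Callable, Dict, Tuple


class VersionedCache:
    def __init__(self, loader: Callable[[str], Tuple[Any, int]]) -> None:
        self._loader = loader  # assumed to return (value, version_id)
        self._entries: Dict[str, Tuple[Any, int]] = {}

    def get(self, key: str) -> Any:
        entry = self._entries.get(key)
        if entry is None:
            entry = self._loader(key)  # read-through on miss
            self._entries[key] = entry
        return entry[0]

    def on_row_updated(self, key: str, new_version: int) -> bool:
        """Handle an invalidation message; return True if an entry was evicted."""
        entry = self._entries.get(key)
        if entry is not None and entry[1] < new_version:
            del self._entries[key]
            return True
        # No entry, or the cached version is already current: nothing to do.
        return False
```

Comparing versions rather than evicting unconditionally makes duplicate or late-arriving messages harmless: a message carrying a version the cache already holds leaves the entry in place.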

Question 4

Why is consistent hashing important for a read-through cache cluster?

Accepted Answer

Consistent hashing maps each key to exactly one cache node and, unlike plain modulo hashing, keeps most keys on the same node when nodes are added or removed. If requests for a key could land on any node, multiple nodes might each handle a miss for that key independently, causing N parallel DB queries and N separate cached copies that can diverge. With consistent hashing, each key has exactly one authoritative node that owns its loader state, preventing redundant loads and ensuring a single cached copy.
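One common way to realise this is a hash ring with virtual nodes. The sketch below is an assumption about how such routing could be done, not a description of a specific cache product. The class `HashRing`, the node names, and the choice of SHA-256 with 100 virtual nodes per node are all illustrative.

```python
import bisect
import hashlib
from typing import List, Sequence, Tuple


def _hash(text: str) -> int:
    # First 8 bytes of SHA-256 as a 64-bit integer position on the ring.
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes: Sequence[str], vnodes: int = 100) -> None:
        if not nodes:
            raise ValueError("at least one node is required")
        # Each node contributes `vnodes` points to smooth out the distribution.
        self._ring: List[Tuple[int, str]] = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        """Return the node owning the first ring point clockwise from the key."""
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

When a node is removed from the ring, only the keys it owned move to another node. Keys owned by the remaining nodes keep their owner, so their cached copies and loader state stay valid.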

Question 5

How does a read-through cache populate on a miss?

Accepted Answer

On a cache miss the cache itself (not the application) transparently fetches the value from the backing store, stores it under the requested key with a configured TTL, and returns it to the caller. This keeps cache-population logic centralized in the cache layer rather than scattered across application code.
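The miss path can be sketched as follows. This is a simplified, single-threaded illustration. The class `ReadThroughCache` and its `loader`, `ttl_s`, and `clock` parameters are hypothetical, with `loader` standing in for whatever fetches from the backing store.

```python
import time
from typing import Any, Callable, Dict, Tuple


class ReadThroughCache:
    def __init__(
        self,
        loader: Callable[[str], Any],
        ttl_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries: Dict[str, Tuple[Any, float]] = {}  # key -> (value, expires_at)

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]  # hit: no backing-store access
        # Miss or expired: the cache, not the caller, invokes the loader.
        value = self._loader(key)
        self._entries[key] = (value, now + self._ttl_s)
        return value
```

Callers only ever call `get`. They never talk to the backing store directly, which is what keeps the population logic in one place.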

Question 6

How is stale-while-revalidate implemented?

Accepted Answer

The cache returns the stale entry (past its fresh TTL but still within its stale TTL) immediately to the caller while triggering an asynchronous background refresh against the backing store, so requests that arrive after the refresh completes receive a fresh value with no added latency. A per-key lock or CAS flag ensures only one background refresh is in flight at a time, preventing redundant revalidation requests.
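A thread-based sketch of this pattern is shown below. It uses a set of keys currently being refreshed as the per-key in-flight flag. The class `SWRCache` and its parameter names are assumptions for illustration, and a production version would likely use a bounded worker pool and log refresh failures.

```python
import threading
import time
from typing import Any, Callable, Dict, Set, Tuple


class SWRCache:
    def __init__(
        self,
        loader: Callable[[str], Any],
        fresh_ttl: float,
        stale_ttl: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._fresh_ttl = fresh_ttl
        self._stale_ttl = stale_ttl
        self._clock = clock
        self._lock = threading.Lock()
        self._entries: Dict[str, Tuple[Any, float]] = {}  # key -> (value, loaded_at)
        self._refreshing: Set[str] = set()  # keys with a refresh in flight

    def get(self, key: str) -> Any:
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None:
                age = self._clock() - entry[1]
                if age < self._stale_ttl:
                    if age >= self._fresh_ttl and key not in self._refreshing:
                        # Stale: start at most one background refresh per key.
                        self._refreshing.add(key)
                        threading.Thread(
                            target=self._refresh, args=(key,), daemon=True
                        ).start()
                    return entry[0]  # fresh or stale value, served immediately
        # Cold miss or fully expired: synchronous load (not deduplicated here).
        return self._store(key)

    def _store(self, key: str) -> Any:
        value = self._loader(key)
        with self._lock:
            self._entries[key] = (value, self._clock())
        return value

    def _refresh(self, key: str) -> None:
        try:
            self._store(key)
        except Exception:
            pass  # keep serving the stale value; real code would log this
        finally:
            with self._lock:
                self._refreshing.discard(key)
```

The loader runs outside the lock, so readers of other keys are never blocked while a refresh is in progress.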

Question 7

How are thundering herds prevented on cache miss?

Accepted Answer

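A per-key mutex (request coalescing) ensures that only the first request triggers a backing-store fetch, while concurrent requests for the same key block until it completes or receive the stale value. Probabilistic early expiration (PER) complements this: each request may refresh an entry at random shortly before its TTL expires, so a hot key is unlikely to expire for all callers at the same instant. Distributed cache systems use a lock token stored atomically (e.g., via Redis SET NX, usually with an expiry so a crashed holder cannot block others) so only one node performs the fetch across the entire fleet.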
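Within a single process, the per-key mutex idea might be sketched like this. It is an in-memory stand-in for the distributed lock token and does not use Redis. The class `SingleFlightCache` is a hypothetical name, and TTLs are omitted to keep the locking logic visible.

```python
import threading
from typing import Any, Callable, Dict

_MISSING = object()  # distinguishes "not cached" from a cached None


class SingleFlightCache:
    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader
        self._values: Dict[str, Any] = {}
        self._key_locks: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()  # protects _key_locks

    def _lock_for(self, key: str) -> threading.Lock:
        with self._guard:
            return self._key_locks.setdefault(key, threading.Lock())

    def get(self, key: str) -> Any:
        value = self._values.get(key, _MISSING)
        if value is not _MISSING:
            return value  # fast path: cache hit, no locking
        with self._lock_for(key):  # concurrent misses for this key queue here
            # Re-check: another thread may have loaded the value while we waited.
            value = self._values.get(key, _MISSING)
            if value is _MISSING:
                value = self._loader(key)
                self._values[key] = value
            return value
```

The re-check after acquiring the lock is what turns N concurrent misses into a single load: waiters find the value already populated and skip the loader.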

Question 8

How does a read-through cache handle TTL expiry?

Accepted Answer

When the TTL of a cached entry expires, the next read triggers a synchronous or asynchronous reload from the backing store using the same read-through path as a cold miss. To avoid latency spikes at expiry boundaries, TTL values are often jittered (randomized within a range) so cache entries for different keys expire at staggered times rather than all at once.
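TTL jitter can be as simple as adding a bounded random offset to a base TTL when the entry is stored. The helper below is a sketch; the name `jittered_ttl` and the default spread of plus or minus 10% are assumptions, not recommended values.

```python
import random
from typing import Optional


def jittered_ttl(
    base_ttl_s: float,
    jitter_fraction: float = 0.1,
    rng: Optional[random.Random] = None,
) -> float:
    """Return base_ttl_s shifted by a uniform offset within +/- jitter_fraction."""
    if not 0.0 <= jitter_fraction < 1.0:
        raise ValueError("jitter_fraction must be in [0, 1)")
    source = rng if rng is not None else random
    spread = base_ttl_s * jitter_fraction
    return base_ttl_s + source.uniform(-spread, spread)
```

With a base TTL of 300 seconds and the default fraction, entries loaded at the same moment expire somewhere between 270 and 330 seconds later instead of all at once.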

Read-Through Cache Low-Level Design: Cache Population, Stale-While-Revalidate, and Consistency Patterns

What Is a Read-Through Cache?

Cache Loader: Pluggable Population

Stale-While-Revalidate

Versioned Cache Entries

Consistent Hashing Across Cache Nodes

Background Pre-Fetch

SQL Schema

Python Implementation Sketch

Consistency with Write Operations