Write-Behind Cache Low-Level Design: Async Persistence, Durability Guarantees, and Failure Recovery

What Is a Write-Behind Cache?

A write-behind cache (also called write-back cache) acknowledges writes to the client immediately after updating the cache, then flushes data to the database asynchronously. The primary goal is removing synchronous DB writes from the hot request path, dramatically reducing write latency for high-throughput workloads.

The tradeoff is durability: if the cache process crashes before a flush completes, any writes held only in memory are lost. Mitigating this risk is the central design challenge of a write-behind system.

Core Write Flow

On every write:

  1. The application calls cache.write(key, value).
  2. The cache stores the value in memory and marks the entry as dirty.
  3. The cache appends a record to a Write-Ahead Log (WAL) on durable storage before acknowledging the client.
  4. The client receives a success response. The DB has not yet been updated.

The WAL entry is written synchronously to disk (appended and fsynced) before the ack, so the write is recoverable even if the cache crashes immediately afterward. The cache stays fast because sequential WAL appends are far cheaper than random DB writes.
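As a minimal sketch of the durable append in step 3, assuming a newline-delimited JSON log file (the record layout and the helper name append_wal_record are illustrative, not part of the original design):

```python
import json
import os
import time

def append_wal_record(wal_file, key, value):
    """Append one JSON record and force it to stable storage before returning."""
    record = {"key": key, "value": value, "ts": time.time()}
    wal_file.write(json.dumps(record) + "\n")
    wal_file.flush()              # push Python's buffer to the OS
    os.fsync(wal_file.fileno())   # ask the OS to persist the data to the device
    return record
```

Only after this call returns would the cache acknowledge the client. Calling flush() alone moves data into the OS page cache, which does not survive a power loss; the fsync is what the durability claim rests on, and it is also the dominant cost of the write path.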

Flush Strategy

A background flush worker periodically drains dirty entries to the database. Two common strategies:

  • Interval-based: flush all dirty entries every N seconds (e.g., every 5 seconds). Simple to implement; staleness bounded by interval.
  • Count-based: flush when dirty entry count exceeds threshold (e.g., 10,000 dirty keys). Bounds memory pressure. Often combined with interval-based for a belt-and-suspenders approach.
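The two triggers can be combined in one small policy object. The sketch below is one possible shape; the class name FlushPolicy, its parameters, and the injectable clock are assumptions made for illustration:

```python
import time

class FlushPolicy:
    """Decide when to flush: on elapsed time OR on dirty-entry count."""

    def __init__(self, interval_s=5.0, max_dirty=10_000, clock=time.monotonic):
        self.interval_s = interval_s
        self.max_dirty = max_dirty
        self.clock = clock              # injectable so the policy can be tested
        self.last_flush = clock()

    def should_flush(self, dirty_count):
        if dirty_count == 0:
            return False                # nothing to write
        if dirty_count > self.max_dirty:
            return True                 # count-based trigger
        # interval-based trigger
        return self.clock() - self.last_flush >= self.interval_s

    def mark_flushed(self):
        self.last_flush = self.clock()
```

A flush worker would call should_flush(len(dirty)) on a short tick and mark_flushed() after each successful cycle. Using a monotonic clock avoids spurious or missed flushes when the wall clock is adjusted.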

Write Coalescing

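If the same key is written multiple times before a flush cycle, only the latest value needs to be flushed. The cache maintains a dirty map keyed by cache key; each new write simply overwrites the pending value in that map. This last-write-wins coalescing collapses N updates into one DB write, which is especially valuable for values that change at high frequency, such as running counter totals and leaderboard scores. It assumes each write carries the full new value for the key: a write expressed as a delta (for example, "increment by 1") must be applied to the cached total first, and append-style data such as analytics events needs a unique key per event, otherwise overwriting would silently drop events.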

Coalescing semantics must be documented to callers: intermediate values are never persisted. If intermediate states matter (e.g., audit logs), write-behind is not appropriate.
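A small illustration of the dirty map, assuming each write carries the full new value (DirtyMap and its method names are hypothetical):

```python
class DirtyMap:
    """Pending writes keyed by cache key; a new write replaces the old one."""

    def __init__(self):
        self.pending = {}
        self.writes_seen = 0

    def write(self, key, value):
        self.writes_seen += 1
        self.pending[key] = value   # overwrite: last write wins

    def drain(self):
        # Hand the current batch to the flusher and start a fresh map.
        batch, self.pending = self.pending, {}
        return batch

dm = DirtyMap()
for score in (10, 25, 40):
    dm.write("player:1", score)
dm.write("player:2", 7)
batch = dm.drain()   # four writes collapse into two pending DB writes
```

Here the intermediate scores 10 and 25 never reach the database, which is exactly the behavior callers need to be told about.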

Failure Recovery via WAL Replay

On cache restart after a crash:

  1. Open the WAL file and read all records with no corresponding flushed_at confirmation.
  2. Re-apply each pending write to the in-memory cache.
  3. Trigger a flush of all recovered dirty entries.

WAL records are marked as flushed only after the DB write succeeds. This gives at-least-once flush semantics — the DB write may be retried and must be idempotent (upsert, not blind insert).
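One way to decide which records are still pending is to log a marker after each successful DB write and skip writes covered by a later marker. The sketch below assumes a hypothetical record format with "type", "seq", "key", and "value" fields; the original design only states that unflushed records are those without a flushed_at confirmation:

```python
import json

def pending_from_wal(lines):
    """Return {key: value} for writes not covered by a later flush marker."""
    pending = {}   # key -> (seq, value) of the latest unflushed write
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        if rec["type"] == "write":
            pending[rec["key"]] = (rec["seq"], rec["value"])
        elif rec["type"] == "flushed":
            # The marker confirms every write to this key up to rec["seq"].
            current = pending.get(rec["key"])
            if current is not None and current[0] <= rec["seq"]:
                del pending[rec["key"]]
    return {key: value for key, (seq, value) in pending.items()}
```

If the process crashes after the DB write but before the marker is logged, the write is replayed on restart. That is the at-least-once behavior described above, and it is why the flush statement has to be an idempotent upsert.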

Conflict Detection on Flush

A concurrent writer (another process, or a direct DB write) may have updated the DB row between when the cache entry was written and when the flush happens. To handle this:

  • Store a version or updated_at alongside the cached value at write time.
  • On flush, use an optimistic locking UPDATE with a WHERE clause checking the expected version.
  • If zero rows updated, the row was concurrently modified — log the conflict, and choose a resolution strategy (last-writer-wins, alert, or discard).
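The version check can be sketched with an in-memory SQLite database so that it runs on its own. The table layout, the integer version column, and the function name flush_with_version_check are assumptions for illustration; a production flush would target the real database and its own schema:

```python
import sqlite3

def flush_with_version_check(conn, key, value, expected_version):
    """Apply the cached value only if the row still has the version we read."""
    cur = conn.execute(
        "UPDATE target_table SET value = ?, version = version + 1 "
        "WHERE key = ? AND version = ?",
        (value, key, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1   # False: no row matched the expected version

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE target_table ("
    "key TEXT PRIMARY KEY, value TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO target_table VALUES ('k1', 'old', 1)")
conn.commit()

first = flush_with_version_check(conn, "k1", "from-cache", expected_version=1)
stale = flush_with_version_check(conn, "k1", "stale-cache", expected_version=1)
```

The first flush succeeds and bumps the version to 2; the second carries a stale version, updates zero rows, and returns False, which is the point where the conflict is logged and a resolution strategy is applied. Zero rows can also mean the row does not exist yet, so a real flush needs to distinguish a missing row from a version mismatch.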

SQL Schema

-- Dirty entries waiting to be flushed
CREATE TABLE WriteBehindEntry (
    cache_key       TEXT PRIMARY KEY,
    value           JSONB        NOT NULL,
    dirty_since     TIMESTAMPTZ  NOT NULL DEFAULT now(),
    flush_attempts  INT          NOT NULL DEFAULT 0,
    flushed_at      TIMESTAMPTZ
);

-- Write-ahead log for crash recovery
CREATE TABLE WALRecord (
    seq         BIGSERIAL    PRIMARY KEY,
    cache_key   TEXT         NOT NULL,
    value       JSONB        NOT NULL,
    wal_at      TIMESTAMPTZ  NOT NULL DEFAULT now(),
    flushed_at  TIMESTAMPTZ
);

CREATE INDEX idx_wal_unflushed ON WALRecord (wal_at) WHERE flushed_at IS NULL;

Python Implementation Sketch

import json, os, time, threading

class WriteBehindCache:
    def __init__(self, db, wal_path, flush_interval=5):
        self.db = db
        self.wal_path = wal_path
        self.flush_interval = flush_interval
        self.dirty: dict[str, dict] = {}
        self.lock = threading.Lock()
        self._open_wal()
        self._start_flush_thread()

    def _open_wal(self):
        # Append mode preserves records left over from a previous run.
        self.wal = open(self.wal_path, 'a')

    def write(self, key: str, value: dict):
        entry = {'key': key, 'value': value, 'ts': time.time()}
        with self.lock:
            # Log first, then update memory; holding the lock keeps WAL order
            # and dirty-map order consistent across concurrent writers.
            self.wal.write(json.dumps(entry) + '\n')
            self.wal.flush()
            os.fsync(self.wal.fileno())  # force the record to disk before acking
            self.dirty[key] = value

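    def flush_dirty_entries(self):
        # Snapshot under the lock so writers are not blocked during DB I/O.
        with self.lock:
            snapshot = dict(self.dirty)
        flushed = []
        for key, value in snapshot.items():
            try:
                self._flush_one(key, value)
                flushed.append(key)
            except Exception:
                # Leave the entry dirty so the next cycle retries it.
                continue
        with self.lock:
            for key in flushed:
                # Clear the entry only if no newer value arrived during the flush.
                if self.dirty.get(key) == snapshot[key]:
                    del self.dirty[key]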

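    def _flush_one(self, key: str, value: dict):
        # Idempotent upsert, so a retried flush cannot create duplicate rows.
        self.db.execute(
            "INSERT INTO target_table (key, value, updated_at) VALUES (%s, %s, now()) "
            "ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = EXCLUDED.updated_at",
            (key, json.dumps(value))
        )
        # Mark matching rows in the WALRecord table as flushed. This assumes WAL
        # records are mirrored into that table; the file-based WAL written by
        # write() carries no flushed marker of its own.
        self.db.execute(
            "UPDATE WALRecord SET flushed_at = now() WHERE cache_key = %s AND flushed_at IS NULL",
            (key,)
        )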

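    def replay_wal(self):
        # The file carries no flushed marker, so every record is replayed;
        # the idempotent upsert in _flush_one makes that safe.
        recovered = {}
        with open(self.wal_path, 'r') as f:
            for line in f:
                if line.strip():
                    entry = json.loads(line)
                    # Later records overwrite earlier ones: latest value wins.
                    recovered[entry['key']] = entry['value']
        with self.lock:
            for key, value in recovered.items():
                # Do not clobber writes accepted since the restart.
                self.dirty.setdefault(key, value)
        self.flush_dirty_entries()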

    def coalesce_writes(self, key: str):
        # Returns the latest coalesced value for a key without flushing
        with self.lock:
            return self.dirty.get(key)

    def _start_flush_thread(self):
        def loop():
            while True:
                time.sleep(self.flush_interval)
                self.flush_dirty_entries()
        t = threading.Thread(target=loop, daemon=True)
        t.start()

Use Cases

  • High-write-rate counters: page view counts, API call tallies — intermediate values irrelevant, only final count matters.
  • Analytics events: buffering events before bulk insert into a data warehouse.
  • Leaderboard scores: score updates are frequent; only the latest score matters per flush cycle.

When Not to Use Write-Behind

Avoid write-behind when intermediate state must be durable (financial transactions, inventory decrements), when data loss on cache failure is unacceptable, or when the system lacks a reliable WAL infrastructure. For those cases, write-through or write-around are safer choices.
