Low Level Design: CRDT (Conflict-Free Replicated Data Types)

CRDT: Conflict-Free Replicated Data Types

CRDTs are data structures that guarantee strong eventual consistency: any two replicas that have received the same set of updates will converge to the same state, regardless of the order in which updates were applied.

Two CRDT Families

State-based CRDTs (CvRDT — Convergent)

Replicas periodically exchange their full state. The merge function must be:

  • Commutative: merge(A, B) = merge(B, A)
  • Associative: merge(merge(A, B), C) = merge(A, merge(B, C))
  • Idempotent: merge(A, A) = A

Network requirement: at-least-once delivery of state is sufficient.

Operation-based CRDTs (CmRDT — Commutative)

Replicas exchange operations instead of full state (more bandwidth-efficient). Requirements:

  • Operations must be delivered exactly once to each replica.
  • Concurrent operations must be commutative (order doesn't matter).

G-Counter (Grow-Only Counter)

State:  counters[node_id] = int  // per-node counter array
Increment(node_id): counters[node_id]++
Value(): sum(counters)
Merge(A, B): for each i, result[i] = max(A[i], B[i])

Guarantees: only increments; no decrement possible. Merge is idempotent.

PN-Counter (Positive-Negative Counter)

State:  P = G-Counter  // increments
        N = G-Counter  // decrements
Increment(): P.increment(node_id)
Decrement(): N.increment(node_id)
Value(): P.value() - N.value()
Merge(A, B): merge both P and N G-Counters separately

OR-Set (Observed-Remove Set)

Supports add and remove of elements. Resolves add/remove conflicts in favor of add.

State:  adds    = set of (element, unique_tag)
        removes = set of unique_tag  // tombstones
Add(e): generate unique_tag; adds.insert((e, unique_tag))
Remove(e): for each (e, tag) in adds, removes.insert(tag)
Contains(e): exists (e, tag) in adds where tag not in removes
Merge(A, B): adds = A.adds ∪ B.adds
             removes = A.removes ∪ B.removes

LWW-Register (Last-Write-Wins Register)

State:  value     = any
        timestamp = int  // logical or physical timestamp
Write(v, ts): if ts > timestamp then value=v, timestamp=ts
Read(): value
Merge(A, B): keep the one with higher timestamp

Simple and efficient; risks data loss if timestamps are not monotonic or two writes have the same timestamp.

RGA (Replicated Growable Array)

Used for collaborative text editing. Each character has a globally unique ID and a predecessor ID:

Character {
    id:          (node_id, sequence_number)
    predecessor: character_id  // insert after this character
    value:       char
    tombstone:   bool          // true if deleted
}

Insert and delete operations are commutative. The document is reconstructed by ordering characters by their predecessor relationships and using unique IDs to break ties deterministically.

Persistence and Replication

  • Storage: serialize CRDT state as JSONB in a relational DB or document store.
  • Gossip protocol: nodes periodically exchange state with random peers; each merge converges toward the global state.
  • Delta-CRDTs: instead of shipping full state, ship only the "delta" — the minimal state that represents recent changes — reducing bandwidth.

Choosing the Right CRDT

  • Counters: G-Counter or PN-Counter (metrics, likes, inventory adjustments).
  • Sets: OR-Set (tags, memberships, shopping cart items).
  • Registers: LWW-Register (user profile fields, configuration).
  • Sequences: RGA or LSEQ (collaborative document editing).
  • Maps: OR-Map (recursive CRDT — each key maps to another CRDT).

Key Design Decisions

  • State-based vs op-based: prefer state-based for simplicity; op-based for bandwidth-constrained environments.
  • Tombstone accumulation: OR-Set tombstones grow unboundedly — implement periodic garbage collection once all replicas have seen the removal.
  • Unique tag generation: use (node_id, Lamport timestamp) or UUID to guarantee global uniqueness without coordination.
  • Causal consistency: pair CRDTs with vector clocks to ensure operations are only applied after their causal dependencies.

See also: Atlassian Interview Guide

See also: Netflix Interview Guide 2026: Streaming Architecture, Recommendation Systems, and Engineering Excellence

See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering

Scroll to Top