CRDT: Conflict-Free Replicated Data Types
CRDTs are data structures that guarantee strong eventual consistency: any two replicas that have received the same set of updates will converge to the same state, regardless of the order in which updates were applied.
Two CRDT Families
State-based CRDTs (CvRDT — Convergent)
Replicas periodically exchange their full state. The merge function must be:
- Commutative: merge(A, B) = merge(B, A)
- Associative: merge(merge(A, B), C) = merge(A, merge(B, C))
- Idempotent: merge(A, A) = A
Network requirement: at-least-once delivery of state is sufficient.
Operation-based CRDTs (CmRDT — Commutative)
Replicas exchange operations instead of full state (more bandwidth-efficient). Requirements:
- Operations must be delivered exactly once to each replica.
- Concurrent operations must be commutative (order doesn't matter).
G-Counter (Grow-Only Counter)
State: counters[node_id] = int // per-node counter array
Increment(node_id): counters[node_id]++
Value(): sum(counters)
Merge(A, B): for each i, result[i] = max(A[i], B[i])
Guarantees: only increments; no decrement possible. Merge is idempotent.
PN-Counter (Positive-Negative Counter)
State: P = G-Counter // increments
N = G-Counter // decrements
Increment(): P.increment(node_id)
Decrement(): N.increment(node_id)
Value(): P.value() - N.value()
Merge(A, B): merge both P and N G-Counters separately
OR-Set (Observed-Remove Set)
Supports add and remove of elements. Resolves add/remove conflicts in favor of add.
State: adds = set of (element, unique_tag)
removes = set of unique_tag // tombstones
Add(e): generate unique_tag; adds.insert((e, unique_tag))
Remove(e): for each (e, tag) in adds, removes.insert(tag)
Contains(e): exists (e, tag) in adds where tag not in removes
Merge(A, B): adds = A.adds ∪ B.adds
removes = A.removes ∪ B.removes
LWW-Register (Last-Write-Wins Register)
State: value = any
timestamp = int // logical or physical timestamp
Write(v, ts): if ts > timestamp then value=v, timestamp=ts
Read(): value
Merge(A, B): keep the one with higher timestamp
Simple and efficient; risks data loss if timestamps are not monotonic or two writes have the same timestamp.
RGA (Replicated Growable Array)
Used for collaborative text editing. Each character has a globally unique ID and a predecessor ID:
Character {
id: (node_id, sequence_number)
predecessor: character_id // insert after this character
value: char
tombstone: bool // true if deleted
}
Insert and delete operations are commutative. The document is reconstructed by ordering characters by their predecessor relationships and using unique IDs to break ties deterministically.
Persistence and Replication
- Storage: serialize CRDT state as JSONB in a relational DB or document store.
- Gossip protocol: nodes periodically exchange state with random peers; each merge converges toward the global state.
- Delta-CRDTs: instead of shipping full state, ship only the "delta" — the minimal state that represents recent changes — reducing bandwidth.
Choosing the Right CRDT
- Counters: G-Counter or PN-Counter (metrics, likes, inventory adjustments).
- Sets: OR-Set (tags, memberships, shopping cart items).
- Registers: LWW-Register (user profile fields, configuration).
- Sequences: RGA or LSEQ (collaborative document editing).
- Maps: OR-Map (recursive CRDT — each key maps to another CRDT).
Key Design Decisions
- State-based vs op-based: prefer state-based for simplicity; op-based for bandwidth-constrained environments.
- Tombstone accumulation: OR-Set tombstones grow unboundedly — implement periodic garbage collection once all replicas have seen the removal.
- Unique tag generation: use (node_id, Lamport timestamp) or UUID to guarantee global uniqueness without coordination.
- Causal consistency: pair CRDTs with vector clocks to ensure operations are only applied after their causal dependencies.
See also: Atlassian Interview Guide
See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering