Low Level Design: Vector Clock Service

Vector clocks track causality in distributed systems without requiring synchronized physical clocks. Each node maintains a vector of logical timestamps — one counter per node in the system.

Vector Clock Mechanics

VC = [VC[0], VC[1], ..., VC[n-1]]
// VC[i] = number of events node i has executed, as known locally

Rules

  • Increment: Before a send event or internal event, node i increments VC[i].
  • Send: Attach the current VC to the message.
  • Receive / Merge: For each component j, set VC[j] = max(local_VC[j], received_VC[j]), then increment own counter VC[i]++.
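
The three rules above can be sketched in Python. The sparse dict representation (a missing node implies counter 0) and the string `node_id`s are illustrative assumptions, not part of the algorithm:

```python
class VectorClock:
    def __init__(self, clock=None):
        # Sparse representation: absent node => counter 0.
        self.clock = dict(clock or {})

    def increment(self, node_id):
        # Rule 1: tick own counter before a send or internal event.
        self.clock[node_id] = self.clock.get(node_id, 0) + 1

    def snapshot(self):
        # Rule 2: attach a copy of the current VC to an outgoing message.
        return dict(self.clock)

    def merge(self, received, node_id):
        # Rule 3: componentwise max with the received VC, then tick own counter.
        for node, counter in received.items():
            self.clock[node] = max(self.clock.get(node, 0), counter)
        self.increment(node_id)
```

Internal events call only `increment`; a receive calls `merge`, which both absorbs the sender's history and records the receive itself as a new local event.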

Causality Detection

  • Happened-before (A → B): VC(A) < VC(B) — every component of A is less than or equal to the corresponding component of B, with at least one strictly less.
  • Concurrent (A ∥ B): neither VC(A) <= VC(B) nor VC(B) <= VC(A).
  • Identical: VC(A) = VC(B) — the same event, or causally equivalent states.
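
Both tests reduce to two componentwise comparisons. A sketch, again assuming dict-based clocks with absent entries treated as 0:

```python
def compare(vc_a, vc_b):
    """Return 'before', 'after', 'equal', or 'concurrent'."""
    nodes = set(vc_a) | set(vc_b)
    a_le_b = all(vc_a.get(n, 0) <= vc_b.get(n, 0) for n in nodes)
    b_le_a = all(vc_b.get(n, 0) <= vc_a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # A happened-before B
    if b_le_a:
        return "after"       # B happened-before A
    return "concurrent"      # neither dominates: no causal relationship
```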

Data Model

VersionedObject {
    key:          string       PRIMARY KEY
    value:        bytes
    vector_clock: JSONB        // {"node_id": counter, ...}
    updated_by:   string       // node_id of last writer
    updated_at:   timestamp
}

Conflict Detection on Write

  1. Client sends write with its current VC.
  2. Server reads stored VC_stored.
  3. Compare: if VC_client >= VC_stored (the client has seen everything stored), the write is safe — apply it and merge clocks.
  4. If VC_client < VC_stored, the write is stale — reject it or return the newer stored version.
  5. If concurrent (neither dominates), a conflict exists: store both versions as sibling values.
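
A minimal sketch of this write path, assuming an in-memory dict of key to sibling list; `handle_write` reports whether concurrent siblings remain after the write:

```python
def dominates(vc_x, vc_y):
    # True if vc_x >= vc_y componentwise (x has seen everything in y).
    nodes = set(vc_x) | set(vc_y)
    return all(vc_x.get(n, 0) >= vc_y.get(n, 0) for n in nodes)

def handle_write(store, key, value, vc_client, node_id):
    siblings = store.get(key, [])            # list of (value, vc) versions
    # Keep only stored versions the client has NOT already seen.
    surviving = [(v, vc) for v, vc in siblings if not dominates(vc_client, vc)]
    # Server ticks its own component so this write is uniquely identified.
    vc_new = dict(vc_client)
    vc_new[node_id] = vc_new.get(node_id, 0) + 1
    store[key] = surviving + [(value, vc_new)]
    return len(store[key]) > 1               # True => concurrent siblings remain
```

Note this sketch never rejects a stale write outright; it simply keeps the stale version as a sibling, which is the Dynamo-style "always writable" choice.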

Conflict Resolution Strategies

  • Last-Write-Wins (LWW): use wall clock timestamp as tiebreaker. Simple but loses updates.
  • Application-level merge: e.g., shopping cart union — take union of all sibling item sets.
  • CRDT: use a data structure that merges automatically (see LLD: CRDT).
  • Client-side reconciliation: return all siblings to client; client resolves and writes back.
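
The shopping-cart union strategy can be illustrated as follows. One caveat: a plain union cannot express deletions, so an item removed in one sibling but present in another reappears after the merge (the classic Dynamo cart anomaly):

```python
def merge_cart_siblings(siblings):
    """Application-level merge: union of item sets, plus a
    componentwise-max clock covering all merged versions.
    `siblings` is a list of (item_set, vector_clock) pairs."""
    merged_items, merged_vc = set(), {}
    for items, vc in siblings:
        merged_items |= items
        for node, counter in vc.items():
            merged_vc[node] = max(merged_vc.get(node, 0), counter)
    return merged_items, merged_vc
```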

Dotted Version Vectors

Standard VCs can't distinguish between concurrent writes and causally-related ones when siblings are present. Dotted Version Vectors (DVVs) attach a "dot" (node, counter) to each value, enabling precise tracking of which write introduced each sibling.

DVV = { dot: (node, counter), context: VC }
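
The rule that makes DVVs precise: a sibling is obsolete exactly when the incoming write's causal context covers that sibling's dot. A sketch, with dots as (node, counter) tuples and contexts as dict-based VCs:

```python
def dot_covered(dot, context):
    # The writer had already seen the event that created this sibling
    # iff its context includes the dot's counter for that node.
    node, counter = dot
    return context.get(node, 0) >= counter

def prune_siblings(siblings, context):
    # Keep only siblings the incoming write's context has NOT seen.
    return [(value, dot) for value, dot in siblings if not dot_covered(dot, context)]
```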

Practical Implementation: Amazon Dynamo-style KV Store

  • Each replica maintains its own VC component.
  • On read, coordinator gathers responses from R replicas; returns all siblings if VCs are concurrent.
  • On write, coordinator increments its own VC component and writes to W replicas.
  • Background anti-entropy: replicas gossip and merge state using VC comparison to detect and repair divergence.
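
The coordinator's read path can be sketched as dropping any replica version strictly dominated by another and returning whatever survives; deduplicating identical responses from healthy replicas is omitted here for brevity:

```python
def reduce_read_responses(responses):
    """`responses` is a list of (value, vc) pairs gathered from R replicas.
    Returns the surviving siblings after removing dominated versions."""
    def strictly_dominates(x, y):
        nodes = set(x) | set(y)
        return x != y and all(x.get(n, 0) >= y.get(n, 0) for n in nodes)

    survivors = []
    for i, (val, vc) in enumerate(responses):
        if not any(strictly_dominates(other_vc, vc)
                   for j, (_, other_vc) in enumerate(responses) if j != i):
            survivors.append((val, vc))
    return survivors
```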

Key Design Decisions

  • VC size: grows with number of nodes. Prune stale entries using a garbage collection epoch when nodes leave.
  • Storage: store VC as JSONB for flexibility; index on key for fast reads.
  • Clock skew: VCs are immune to clock skew — they use logical, not physical, time.
  • Read repair: on read, if coordinator detects concurrent siblings, trigger a reconciliation write.
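
A sketch of the VC-pruning decision, assuming a membership or epoch service (not shown) supplies the set of live nodes; dropping an entry is only safe once the departed node's writes have fully propagated:

```python
def prune_clock(vc, live_nodes):
    # GC sketch: discard entries for nodes that have permanently
    # left the cluster, bounding VC size to the live membership.
    return {node: counter for node, counter in vc.items() if node in live_nodes}
```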

