Causal Consistency Low-Level Design: Vector Clocks, Causal Ordering, and Dependency Tracking

What Is Causal Consistency?

Causal consistency is a consistency model that sits between eventual consistency and linearizability. If operation A causally precedes operation B (A happened-before B), then all processes observe A before B. Concurrent operations — those with no causal relationship — may be seen in any order by different processes. This model allows geographic distribution with low latency while preserving meaningful ordering guarantees.

The Happened-Before Relation

The happened-before relation (→) defines causality in a distributed system. A → B holds if any of the following are true:

  • A and B occur on the same process and A precedes B in local execution order.
  • A is a message send event and B is the corresponding receive event.
  • There exists some C such that A → C and C → B (transitivity).

Events that have no happened-before relationship are considered concurrent. Causal consistency only constrains the ordering of causally related events.

Vector Clocks

Vector clocks are the standard mechanism for tracking causality in distributed systems. Each process maintains a vector of counters, one per known process.

  • Increment: On any local event, the process increments its own counter in the vector.
  • Send: When a process sends a message, it attaches its current vector clock.
  • Receive: On receiving a message with vector clock V_msg, the process takes the component-wise maximum of its local clock and V_msg, then increments its own counter.

Given two vector clocks V1 and V2: V1 causally precedes V2 if every component of V1 is less than or equal to the corresponding component of V2, and at least one component is strictly less. If neither dominates the other, the events are concurrent.
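The update and comparison rules can be sketched as plain functions over dicts, using a sparse representation where absent nodes count as zero (the function names here are illustrative, not part of any library):

```python
from typing import Dict

VectorClock = Dict[str, int]

def merge(local: VectorClock, remote: VectorClock, node_id: str) -> VectorClock:
    """Receive rule: component-wise max, then increment own counter."""
    merged = {n: max(local.get(n, 0), remote.get(n, 0))
              for n in set(local) | set(remote)}
    merged[node_id] = merged.get(node_id, 0) + 1
    return merged

def compare(v1: VectorClock, v2: VectorClock) -> str:
    """Return 'before', 'after', 'equal', or 'concurrent'."""
    nodes = set(v1) | set(v2)
    le = all(v1.get(n, 0) <= v2.get(n, 0) for n in nodes)
    ge = all(v1.get(n, 0) >= v2.get(n, 0) for n in nodes)
    if le and ge:
        return "equal"
    if le:
        return "before"   # v1 causally precedes v2
    if ge:
        return "after"    # v2 causally precedes v1
    return "concurrent"

print(compare({"a": 1}, {"a": 2, "b": 1}))  # before
print(compare({"a": 2}, {"b": 1}))          # concurrent
```

The "concurrent" branch is exactly the case where neither clock dominates the other, so causal consistency places no ordering constraint on those two events.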

Causal Ordering and Dependency Tracking

A replica enforcing causal consistency must buffer incoming operations until all causally preceding operations have been applied. This is implemented via a pending dependency set.

When an operation arrives at a replica:

  1. Check whether the operation's vector clock dependencies are satisfied by the replica's current applied state.
  2. If satisfied, apply the operation immediately and update the replica's vector clock.
  3. If not satisfied, place the operation in a buffer. Re-evaluate buffered operations whenever a new operation is applied.

This ensures that if write W1 causally precedes write W2, no replica will apply W2 before W1, regardless of network delivery order.
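As a toy illustration of this buffering rule (the event shape and names are invented for the example), two writes delivered in reverse order still apply in causal order:

```python
from typing import Dict, List

VectorClock = Dict[str, int]

def satisfied(deps: VectorClock, applied: VectorClock) -> bool:
    """Step 1: are all causally preceding writes already applied?"""
    return all(applied.get(n, 0) >= ts for n, ts in deps.items())

def deliver(event: dict, applied: VectorClock,
            buffer: List[dict], log: List[str]) -> None:
    """Steps 2-3: apply immediately or buffer, then re-drain the buffer."""
    buffer.append(event)
    progress = True
    while progress:
        progress = False
        for ev in list(buffer):
            if satisfied(ev["deps"], applied):
                buffer.remove(ev)
                applied[ev["node"]] = applied.get(ev["node"], 0) + 1
                log.append(ev["id"])
                progress = True

applied: VectorClock = {}
buffer: List[dict] = []
log: List[str] = []
# W2 depends on W1 (deps {"r1": 1}) but arrives first; it must wait.
deliver({"id": "W2", "node": "r2", "deps": {"r1": 1}}, applied, buffer, log)
deliver({"id": "W1", "node": "r1", "deps": {}}, applied, buffer, log)
print(log)  # ['W1', 'W2']
```

Even though W2 was delivered first, it sits in the buffer until W1's application advances the replica's clock past its dependency.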

Read-Your-Writes in a Causal Session

Read-your-writes is a specific causal guarantee: after a client writes a value, all subsequent reads by that client must reflect the written value. Implementation:

  • The client maintains its own write vector clock, updated on every write acknowledgment received from the server.
  • On subsequent reads, the client includes its current vector clock in the request.
  • The serving replica delays its response until its applied state is at least as advanced as the client's vector clock (i.e., all components of the replica's applied clock are greater than or equal to the client's clock).
  • If the replica cannot catch up within a timeout, the request is routed to a more up-to-date replica or the primary.
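A minimal sketch of the replica-side wait, assuming a synchronous polling loop; `Replica`, `serve_read`, and their parameters are illustrative, not a real API:

```python
import time
from typing import Dict, Optional

VectorClock = Dict[str, int]

class Replica:
    """Minimal in-memory replica stub for illustration."""
    def __init__(self) -> None:
        self.applied_clock: VectorClock = {}
        self.store: Dict[str, str] = {}

    def apply_write(self, key: str, value: str, node: str, ts: int) -> None:
        self.store[key] = value
        self.applied_clock[node] = max(self.applied_clock.get(node, 0), ts)

    def get(self, key: str) -> Optional[str]:
        return self.store.get(key)

def dominates(applied: VectorClock, required: VectorClock) -> bool:
    """True if every component of `applied` is >= the one in `required`."""
    return all(applied.get(n, 0) >= ts for n, ts in required.items())

def serve_read(replica: Replica, key: str, client_clock: VectorClock,
               timeout_s: float = 1.0, poll_s: float = 0.02) -> Optional[str]:
    """Delay the response until the replica has caught up to the client."""
    deadline = time.monotonic() + timeout_s
    while True:
        if dominates(replica.applied_clock, client_clock):
            return replica.get(key)
        if time.monotonic() >= deadline:
            return None  # caller re-routes to a fresher replica or the primary
        time.sleep(poll_s)  # replication may advance applied_clock meanwhile

replica = Replica()
replica.apply_write("profile", "v1", "node1", 1)
print(serve_read(replica, "profile", {"node1": 1}))  # v1
```

A production system would block on a condition variable signaled by the replication apply loop rather than polling, but the dominance check is the same.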

Causal Metadata Propagation

Causal dependencies must travel through the system with every request and response. Every API call carries the client's current vector clock in a request header or body field. Every response from the server includes the server's current vector clock. Clients merge received clocks into their own state. This ensures that causal information is preserved end-to-end, including through intermediaries such as API gateways or caches.

The cost of causal metadata propagation is O(N) per message where N is the number of nodes. For large clusters this can be significant; techniques like dependency vectors (only tracking direct dependencies) or causal cuts reduce overhead.
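One way to keep that O(N) payload small on the wire is a sparse encoding that omits zero entries. The sketch below assumes the clock travels in a request header such as `X-Causal-Token` (the header name and helper functions are hypothetical):

```python
import base64
import json
from typing import Dict

VectorClock = Dict[str, int]

def encode_token(clock: VectorClock) -> str:
    """Serialize only non-zero entries; base64 keeps the header ASCII-safe."""
    sparse = {n: ts for n, ts in clock.items() if ts > 0}
    raw = json.dumps(sparse, sort_keys=True, separators=(",", ":"))
    return base64.urlsafe_b64encode(raw.encode()).decode()

def decode_token(token: str) -> VectorClock:
    """Inverse of encode_token; absent nodes are implicitly zero."""
    return json.loads(base64.urlsafe_b64decode(token.encode()).decode())

def merge_clocks(a: VectorClock, b: VectorClock) -> VectorClock:
    """Client-side merge of the clock returned in a response."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}

token = encode_token({"node1": 3, "node2": 0, "node3": 1})
print(decode_token(token))  # {'node1': 3, 'node3': 1}
```

With mostly-idle nodes the sparse form stays small regardless of cluster size; the zero entry for `node2` never leaves the client.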

Causal Consistency vs. Other Models

  • Eventual consistency: No ordering guarantees at all; replicas converge eventually. Causal consistency is strictly stronger.
  • Linearizability: Every operation appears to take effect at a single point in real time; requires coordination on every read and write. Causal consistency allows local reads and asynchronous replication.
  • Sequential consistency: All processes agree on a total order of operations; does not require real-time constraints. Causal consistency only constrains causally related pairs.

Causal consistency enables geographic distribution and low-latency local reads while preserving the semantics that matter most to users: if you post a comment in response to a post, everyone who sees your comment also sees the original post.

SQL Schema

-- Stores causal events with their vector clock state
CREATE TABLE CausalEvent (
    id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    node_id       VARCHAR(64) NOT NULL,
    vector_clock  JSONB NOT NULL,          -- e.g. {"node1": 3, "node2": 1}
    payload       JSONB NOT NULL,
    created_at    TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_causalevent_node ON CausalEvent(node_id);
CREATE INDEX idx_causalevent_clock ON CausalEvent USING GIN(vector_clock);

-- Explicit causal dependency edges between events
CREATE TABLE CausalDependency (
    event_id           UUID NOT NULL REFERENCES CausalEvent(id),
    depends_on_event_id UUID NOT NULL REFERENCES CausalEvent(id),
    PRIMARY KEY (event_id, depends_on_event_id)
);

-- Per-replica applied vector clock state
CREATE TABLE ReplicaState (
    node_id              VARCHAR(64) PRIMARY KEY,
    applied_vector_clock JSONB NOT NULL DEFAULT '{}',
    updated_at           TIMESTAMPTZ NOT NULL DEFAULT now()
);

Python Implementation

from collections import defaultdict
from typing import Dict, List, Optional

VectorClock = Dict[str, int]

class CausalClock:
    def __init__(self, node_id: str):
        self.node_id = node_id
        self.clock: VectorClock = defaultdict(int)
        self.buffer: List[dict] = []

    def tick(self) -> VectorClock:
        """Increment own counter and return current clock."""
        self.clock[self.node_id] += 1
        return dict(self.clock)

    def merge(self, remote_vc: VectorClock) -> VectorClock:
        """Merge a received vector clock (component-wise max) and tick own counter."""
        for node, ts in remote_vc.items():
            self.clock[node] = max(self.clock.get(node, 0), ts)
        self.clock[self.node_id] += 1
        return dict(self.clock)

    def dominates(self, vc: VectorClock) -> bool:
        """Return True if self.clock >= vc on all components."""
        for node, ts in vc.items():
            if self.clock.get(node, 0) < ts:
                return False
        return True

    def is_ready(self, event: dict, node_state: VectorClock) -> bool:
        """Check if all causal dependencies of event are satisfied by node_state."""
        dep_clock: VectorClock = event.get("vector_clock", {})
        # Exclude the originating node's own counter from the check
        for node, ts in dep_clock.items():
            if node == event.get("node_id"):
                continue
            if node_state.get(node, 0) < ts:
                return False
        return True

    def receive(self, event: dict, node_state: VectorClock) -> Optional[List[dict]]:
        """
        Apply event immediately if dependencies are met, otherwise buffer.
        Returns the list of applied events (the event plus any drained
        buffered events), or None if the event was buffered.
        """
        if self.is_ready(event, node_state):
            self.merge(event["vector_clock"])
            # Drain buffer: try to apply buffered events now
            applied = [event]
            changed = True
            while changed:
                changed = False
                still_pending = []
                for buffered in self.buffer:
                    if self.is_ready(buffered, dict(self.clock)):
                        self.merge(buffered["vector_clock"])
                        applied.append(buffered)
                        changed = True
                    else:
                        still_pending.append(buffered)
                self.buffer = still_pending
            return applied
        else:
            self.buffer.append(event)
            return None

FAQ

{
"@context": "https://schema.org",
"@type": "FAQPage",
"mainEntity": [
{
"@type": "Question",
"name": "What is the difference between causal consistency and eventual consistency?",
"acceptedAnswer": {
"@type": "Answer",
"text": "Eventual consistency only guarantees that replicas will converge eventually with no ordering constraints. Causal consistency additionally guarantees that if operation A causally precedes operation B, all processes observe A before B. It is strictly stronger than eventual consistency while still allowing asynchronous replication."
}
},
{
"@type": "Question",
"name": "Why do vector clocks grow with the number of nodes?",
"acceptedAnswer": {
"@type": "Answer",
"text": "Each vector clock entry tracks the logical time of one node. With N nodes, every clock has N entries. This makes vector clocks O(N) in space and bandwidth. For large clusters, techniques like dependency vectors or interval tree clocks reduce overhead by only tracking direct causal predecessors."
}
},
{
"@type": "Question",
"name": "How does read-your-writes work in a causal session?",
"acceptedAnswer": {
"@type": "Answer",
"text": "The client tracks its own write vector clock, updated with each write acknowledgment. On subsequent reads, the client sends this clock to the server. The server delays its response until its applied state dominates the client's clock (all components are at least as large), ensuring the client's writes are visible."
}
},
{
"@type": "Question",
"name": "What is the overhead of causal metadata propagation?",
"acceptedAnswer": {
"@type": "Answer",
"text": "Causal metadata (vector clocks) adds O(N) bytes per request and response where N is the number of nodes. For small clusters this is negligible. For large clusters, compression, sparse representation (only non-zero entries), or aggregated dependency cuts reduce the overhead significantly."
}
}
]
}

  • Causal vs eventual consistency: Causal is strictly stronger; it preserves happened-before ordering while still allowing asynchronous replication.
  • Vector clock size: O(N) per node; mitigated with sparse clocks or dependency vectors in large clusters.
  • Read-your-writes in causal session: Client sends its write clock with reads; server waits until its state dominates the client clock.
  • Dependency metadata overhead: Proportional to cluster size; sparse representation and compression are standard mitigations.


