How do HLC-timestamped CDC events enable multi-region data synchronisation?

Hybrid Logical Clocks (HLC) combine a physical wall-clock component with a logical counter, guaranteeing that causally related events are always ordered correctly even when clocks drift across regions. Change Data Capture (CDC) attaches an HLC timestamp to every database mutation before publishing it to the replication stream. Remote regions consume these events and apply them in HLC order, ensuring causal consistency without requiring perfectly synchronised clocks.

How do region tags prevent replication loops in multi-region sync?

Each CDC event carries a set of region tags recording which regions have already processed it. When a region receives an event, it checks whether its own tag is present; if so, it discards the event rather than re-applying it. This prevents the classic bi-directional replication loop where region A sends a change to region B, which re-broadcasts it back to region A. Tags are propagated as-is through the topology, so the mechanism scales to any number of regions without per-pair configuration.

When should you use Last-Write-Wins versus custom merge functions for conflict resolution?

Last-Write-Wins (LWW) selects the event with the highest HLC timestamp when two regions concurrently update the same key. It is simple and correct for scalar values where any recent write is acceptable (e.g., profile picture URL). Custom merge functions are necessary when LWW would lose data: a shopping cart needs a union-merge, a counter needs addition, and a collaborative document needs an operation-based CRDT. The system should allow per-collection or per-key-type conflict resolution policies rather than a single global strategy.

What does a topology configuration API provide in a multi-region sync system?

The topology API lets operators define replication graph edges (which regions replicate to which), per-edge policies (sync vs. async, lag SLOs, bandwidth limits), and failover priorities. It abstracts routing from application code so that adding a new region, switching from hub-and-spoke to full-mesh, or temporarily pausing a link during maintenance is a configuration change rather than a code deployment. The control plane propagates topology updates to all CDC pipeline agents, which adjust their subscription targets accordingly.

Multi-Region Data Sync Low-Level Design: Active-Active Replication, Conflict Resolution, and Topology

⏱ 6 min read

Multi-Region Data Sync Service: Overview and Requirements

A multi-region data sync service replicates writes across geographically distributed datacenters in an active-active configuration, allowing each region to accept writes independently. The core challenges are conflict resolution when concurrent writes modify the same record, bounded replication lag, and transparent failover when a region becomes unreachable.

Functional Requirements

Replicate every committed write to all peer regions within a configurable lag budget.
Resolve write conflicts automatically using last-writer-wins or custom merge functions.
Expose a topology API for adding, removing, and pausing replication links between regions.
Provide lag monitoring with per-link and per-table granularity.
Support schema evolution without halting replication.

Non-Functional Requirements

Steady-state replication lag must be under 500 ms across regions in the same continent and under 2 seconds intercontinentally.
The system must tolerate a region outage of up to 30 minutes and replay the backlog on reconnection without data loss.
Conflict resolution must be deterministic: given the same two conflicting events, all regions must reach the same outcome.
Replication must not hold locks on the source database that would block foreground writes.

Data Model

ChangeEvent: event_id (UUID v7 for ordering), region_id, table_name, row_pk, operation (INSERT | UPDATE | DELETE), before_image, after_image, committed_at, hlc_timestamp.
ReplicationLink: link_id, source_region, target_region, status (active | paused | degraded), last_acked_event_id, lag_ms, updated_at.
ConflictLog: conflict_id, event_id_a, event_id_b, table_name, row_pk, resolution_strategy, winner_event_id, resolved_at.
SchemaVersion: version_id, region_id, ddl_statement, applied_at, checksum.

Hybrid logical clocks (HLC) are used to timestamp events. HLC combines physical time with a monotonic counter, ensuring that causally related events are always ordered correctly even when physical clocks drift across regions.

Active-Active Replication Architecture

Each region runs a CDC extractor that tails the database write-ahead log and publishes ChangeEvents to a regional Kafka topic. A replication agent in each peer region consumes events from all other regions and applies them to the local database.

Events are tagged with the originating region to prevent replication loops: an agent ignores events whose region tag matches the local region.
Events are applied in HLC order within each table and primary key to preserve causality.
Idempotency is enforced by recording applied event_ids in a deduplication table, checked before each apply.

Conflict Resolution

Last-Writer-Wins

For most entities, conflicts are resolved by comparing HLC timestamps. The event with the higher HLC timestamp wins and its after_image is written. The losing event is discarded and logged to the ConflictLog for audit. This strategy is safe for use cases where occasional overwrites of concurrent edits are acceptable, such as user profile updates.

Custom Merge Functions

For data types that require semantic merging — shopping carts, document text, inventory counters — the system invokes a registered merge function identified by table name:

Counter fields use an add-wins strategy: both increments are applied additively, regardless of order.
Set fields use union semantics unless explicit delete intent is recorded using a tombstone.
Structured documents can register a three-way merge function that takes the common ancestor, version A, and version B as inputs and produces a merged output.

Merge functions must be pure and deterministic. They are versioned and distributed to all regions before activation to ensure all regions apply the same logic to the same conflict.

Topology Management

The topology API manages the directed graph of replication links. Each link is independently configurable, allowing, for example, a hub-and-spoke topology for compliance requirements or a full mesh for lowest latency.

Adding a region triggers an initial bulk sync from a designated source region before the CDC stream is cut over.
Pausing a link buffers events in Kafka. On resume, the agent seeks to the last acked event_id and replays the backlog.
Removing a region marks its link as decommissioned after confirming no unconsumed events remain in the pipeline.

Lag Monitoring and Alerting

Lag is computed per link as the difference between the current wall clock and the committed_at timestamp of the oldest unacknowledged event in the pipeline. This is exported as a gauge metric with region_source and region_target labels.

Alert: lag exceeds the SLA budget for more than 60 seconds.
Alert: a region has not produced any ChangeEvents for more than 30 seconds, indicating a CDC extractor failure.
Alert: conflict rate for a table exceeds a threshold, indicating a data model or application logic issue.

API Design

GET /topology — return the full replication graph with link statuses and current lag.
POST /topology/links — create a new replication link between two regions.
PATCH /topology/links/{link_id} — pause, resume, or modify a link.
GET /conflicts — query the ConflictLog with filters on table, time range, and resolution strategy.
POST /merge-functions — register or update a merge function for a table, with a staged rollout option.