Active-Active Architecture Low-Level Design: Multi-Region Writes, Conflict Resolution, and Global Load Balancing


What Is Active-Active Architecture?

In an active-active architecture, multiple regions (or data centers) simultaneously accept writes and serve reads. Every region is fully operational at all times; there is no standby waiting to take over. This gives high availability and low latency globally, because each request is routed to the nearest region and losing a region requires no failover step. The fundamental challenge is that concurrent writes to different regions targeting the same data can conflict, and those conflicts must be detected and resolved.

Write Coordination

Each region writes locally to its own primary database or data store. Changes are replicated asynchronously to all other regions in the background. Asynchronous replication is essential for keeping write latency low — synchronous multi-region writes would add a full cross-region round-trip to every write. The tradeoff is a replication lag window during which two regions can accept conflicting writes to the same key.

Conflict Types

Write-write conflict: the same key is updated in two different regions simultaneously. Both writes succeed locally but the resulting values diverge until replication and conflict resolution run.
Delete-update conflict: one region deletes a record while another region concurrently updates it. The outcome depends on the resolution strategy — delete wins, update wins, or application-specific merge logic applies.
Insert-insert conflict: two regions insert a record with the same primary key. Common with auto-increment integers; avoided by using globally unique IDs (UUID, Snowflake).
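
As an illustration of the insert-insert case, the sketch below shows two ways a region could mint primary keys without coordinating with other regions: a random UUID, and a Snowflake-style 64-bit integer. The bit layout (41 bits of milliseconds, 10 bits of worker ID, 12 bits of sequence), the custom epoch, and the names SnowflakeGenerator and new_record_id are assumptions made for this example, not part of any particular library.

```python
import threading
import time
import uuid

# Assumed layout, modeled on the commonly described Snowflake scheme.
EPOCH_MS = 1_600_000_000_000   # hypothetical custom epoch (milliseconds)
WORKER_BITS = 10
SEQ_BITS = 12


def new_record_id() -> str:
    """Random (version 4) UUID: no coordination, collisions are negligible."""
    return str(uuid.uuid4())


class SnowflakeGenerator:
    """Roughly time-ordered 64-bit IDs, unique as long as worker IDs are unique."""

    def __init__(self, worker_id: int):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._last_ms = -1
        self._seq = 0
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            now = int(time.time() * 1000)
            if now < self._last_ms:
                # Wall clock moved backwards: keep using the last timestamp.
                now = self._last_ms
            if now == self._last_ms:
                self._seq = (self._seq + 1) & ((1 << SEQ_BITS) - 1)
                if self._seq == 0:
                    # Sequence exhausted for this millisecond: borrow the next one.
                    now = self._last_ms + 1
            else:
                self._seq = 0
            self._last_ms = now
            return (((now - EPOCH_MS) << (WORKER_BITS + SEQ_BITS))
                    | (self.worker_id << SEQ_BITS)
                    | self._seq)
```

Giving each region (or each writer process within a region) a distinct worker_id is what keeps the integer IDs disjoint; the UUID variant needs no such assignment but produces keys that are not time-ordered.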

Conflict Resolution Strategies

Last-Write-Wins (LWW)

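Each write is tagged with a timestamp. When two conflicting writes arrive, the one with the higher timestamp wins and the other is silently discarded. This is simple to implement but requires tightly synchronized clocks across regions: NTP drift or leap seconds can give an earlier write the higher timestamp, so it wins incorrectly. Hybrid logical clocks (HLC) reduce this risk by combining physical time with a logical counter, so timestamps never run backwards and a write that causally follows another always carries the higher timestamp.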
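
A minimal sketch of a hybrid logical clock follows, using the usual formulation in which a timestamp is a pair of (physical milliseconds, logical counter). The class name HybridLogicalClock, its method names, and the injectable physical_ms callable are choices made for this example. Timestamps are plain tuples, so comparing them with < gives the LWW ordering.

```python
import time
from typing import Callable, Tuple

Timestamp = Tuple[int, int]   # (wall-clock milliseconds, logical counter)


def _wall_clock_ms() -> int:
    return int(time.time() * 1000)


class HybridLogicalClock:
    def __init__(self, physical_ms: Callable[[], int] = _wall_clock_ms):
        # physical_ms is injectable so the clock can be tested deterministically.
        self._physical_ms = physical_ms
        self.l = 0   # highest physical time seen so far
        self.c = 0   # logical counter, breaks ties within the same l

    def now(self) -> Timestamp:
        """Timestamp a local write (or an outgoing replication message)."""
        pt = self._physical_ms()
        if pt > self.l:
            self.l, self.c = pt, 0
        else:
            # Physical clock has not advanced past l: bump the counter instead.
            self.c += 1
        return (self.l, self.c)

    def update(self, remote: Timestamp) -> Timestamp:
        """Merge the timestamp carried by a replicated write from another region."""
        pt = self._physical_ms()
        remote_l, remote_c = remote
        new_l = max(self.l, remote_l, pt)
        if new_l == self.l and new_l == remote_l:
            self.c = max(self.c, remote_c) + 1
        elif new_l == self.l:
            self.c += 1
        elif new_l == remote_l:
            self.c = remote_c + 1
        else:
            self.c = 0
        self.l = new_l
        return (self.l, self.c)
```

A region whose wall clock lags still issues timestamps greater than anything it has already received, which is the property plain wall-clock LWW lacks. Truly concurrent writes are still ordered arbitrarily, so one of them is still discarded.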

Application-Level Merge

The application defines a merge function for each data type. Counters use addition (commutative); sets use union; structured records use field-level merge with business rules. CRDTs (conflict-free replicated data types) formalize this by defining data structures whose merge operations are commutative, associative, and idempotent — guaranteeing convergence without coordination.
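
To make the CRDT idea concrete, here is a sketch of a grow-only counter (G-Counter), one of the simplest state-based CRDTs. Simply adding two replicas' totals would double-count when the same state is merged twice, so each region keeps its own slot and merge takes the per-slot maximum. The class and method names are chosen for this example.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: one slot per region, merged by element-wise max."""

    def __init__(self, region: str):
        self.region = region
        self.counts: Dict[str, int] = {}

    def increment(self, n: int = 1) -> None:
        if n < 0:
            raise ValueError("a G-Counter can only grow")
        # A region only ever writes to its own slot.
        self.counts[self.region] = self.counts.get(self.region, 0) + n

    def merge(self, other: "GCounter") -> None:
        # max() is commutative, associative, and idempotent, so replicas
        # converge regardless of merge order or repeated delivery.
        for region, count in other.counts.items():
            self.counts[region] = max(self.counts.get(region, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())
```

For example, if one replica records 3 increments and another records 2, both report 5 after exchanging state, and merging the same state again changes nothing. Supporting decrements would need a second counter for subtractions (a PN-Counter), which is not shown here.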

User-Home-Region Routing

Each user is assigned a home region. All writes for that user are routed to their home region, eliminating write conflicts for per-user data. Reads are served from any region (eventually consistent) or from the home region (strongly consistent). This is the most practical conflict-avoidance strategy for user-centric applications. Fallback to any region on home unavailability requires accepting a temporary conflict window.
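
The routing decision described above might look like the following sketch. The function name pick_write_region, the healthy_regions set, and the fallback_order list are hypothetical; how health is determined is outside the scope of the example. The returned flag tells the caller when a write landed outside the home region and may therefore need conflict resolution later.

```python
from typing import Dict, List, Set, Tuple


def pick_write_region(user_id: int,
                      home_region_map: Dict[int, str],
                      healthy_regions: Set[str],
                      fallback_order: List[str]) -> Tuple[str, bool]:
    """
    Return (region, is_home). is_home is False when the write had to fall back
    to a non-home region, which opens a temporary conflict window.
    Raises KeyError for a user with no assigned home region.
    """
    home = home_region_map[user_id]
    if home in healthy_regions:
        return home, True
    # Home region is down: take the first healthy region in preference order.
    for region in fallback_order:
        if region in healthy_regions:
            return region, False
    raise RuntimeError("no healthy region available")
```

Writes accepted with is_home set to False are the ones worth tagging or counting, since they are the main source of conflicts once the home region recovers.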

GeoDNS and Global Load Balancing

GeoDNS resolves a domain name to different IP addresses based on the geographic location of the DNS resolver. A user in Tokyo resolves to the Tokyo regional endpoint; a user in Frankfurt resolves to the EU endpoint. This minimizes round-trip latency for the initial connection.

Anycast assigns the same IP address to endpoints in multiple regions. BGP routing automatically directs packets to the topologically closest instance. Anycast is used by CDNs and DNS providers (Cloudflare, AWS Route 53) because the routing decision is made by the network itself, with no per-request lookup. A global load balancer (AWS Global Accelerator, GCP Cloud Load Balancing) adds health-aware routing: if a region is unhealthy, traffic is automatically routed to the next-best region without a DNS TTL wait.
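
The selection logic behind geographic, health-aware routing can be sketched as "nearest healthy region wins". The example below uses great-circle distance as a stand-in for network proximity, which is an assumption: real GeoDNS services work from IP geolocation databases, and anycast follows BGP topology rather than geography. The region names and coordinates are approximate and purely illustrative.

```python
import math
from typing import Dict, Set, Tuple

LatLon = Tuple[float, float]   # (latitude, longitude) in degrees

# Hypothetical regional endpoints with approximate coordinates.
REGIONS: Dict[str, LatLon] = {
    "tokyo": (35.68, 139.69),
    "frankfurt": (50.11, 8.68),
    "n-virginia": (38.95, -77.45),
}


def haversine_km(a: LatLon, b: LatLon) -> float:
    """Great-circle distance between two points on a sphere of radius 6371 km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_healthy_region(client: LatLon, healthy: Set[str]) -> str:
    """Pick the closest region that is currently passing health checks."""
    candidates = [name for name in REGIONS if name in healthy]
    if not candidates:
        raise RuntimeError("no healthy region available")
    return min(candidates, key=lambda name: haversine_km(client, REGIONS[name]))
```

With all regions healthy, a client near Paris is sent to frankfurt; if frankfurt is removed from the healthy set, the same client is sent to n-virginia, the next-closest endpoint.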

Data Residency and Compliance

GDPR and similar regulations restrict where certain data (for example, EU personal data) may be stored or transferred, and a common way to comply is to keep that data inside a specific geographic boundary. In an active-active system, this is enforced by per-entity region routing: a record tagged as EU-resident is only written to and replicated within EU regions. Non-EU regions are explicitly excluded from the replication topology for that data class. This requires the routing layer to be aware of data residency tags before accepting a write.
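
A sketch of how a routing layer might apply residency tags is shown below. The tag names, the RESIDENCY_ZONES table, and both function names are hypothetical. Unknown tags are rejected rather than treated as unrestricted, so a mislabeled record fails closed.

```python
from typing import Dict, List, Set

# Hypothetical mapping from residency tag to the regions allowed to hold the data.
RESIDENCY_ZONES: Dict[str, Set[str]] = {
    "eu": {"eu-west-1", "eu-central-1"},
    "global": {"eu-west-1", "eu-central-1", "us-east-1", "ap-northeast-1"},
}


def accept_write(region: str, residency_tag: str) -> bool:
    """True if a record with this tag may be written in the given region."""
    if residency_tag not in RESIDENCY_ZONES:
        raise ValueError(f"unknown residency tag: {residency_tag!r}")
    return region in RESIDENCY_ZONES[residency_tag]


def replication_targets(origin: str, residency_tag: str) -> List[str]:
    """Regions a change may be replicated to, excluding the origin itself."""
    if not accept_write(origin, residency_tag):
        raise ValueError(f"{origin} may not hold {residency_tag!r} data")
    return sorted(RESIDENCY_ZONES[residency_tag] - {origin})
```

Here an "eu" record written in eu-west-1 replicates only to eu-central-1, and an attempt to write it in us-east-1 is refused before it reaches the data store.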

SQL Schema

CREATE TABLE RegionalWrite (
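    -- UUID instead of BIGSERIAL: per-region sequences would hand out the same
    -- ids in different regions. gen_random_uuid() is built in from PostgreSQL 13.
    id              UUID         PRIMARY KEY DEFAULT gen_random_uuid(),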
    region_id       VARCHAR(64)  NOT NULL,
    key             VARCHAR(256) NOT NULL,
    value           JSONB        NOT NULL,
    write_timestamp TIMESTAMPTZ  NOT NULL DEFAULT now(),
    vector_clock    JSONB        NOT NULL DEFAULT '{}'
);

CREATE TABLE ConflictRecord (
    id             BIGSERIAL    PRIMARY KEY,
    key            VARCHAR(256) NOT NULL,
    region_a       VARCHAR(64)  NOT NULL,
    region_b       VARCHAR(64)  NOT NULL,
    a_value        JSONB        NOT NULL,
    b_value        JSONB        NOT NULL,
    resolved_value JSONB,
    strategy_used  VARCHAR(64),
    resolved_at    TIMESTAMPTZ
);

CREATE INDEX idx_rw_key_ts     ON RegionalWrite  (key, write_timestamp DESC);
CREATE INDEX idx_cr_key        ON ConflictRecord (key, resolved_at DESC);

Python Implementation

import time
import threading
from typing import Any, Dict, List, Optional

# ── Regional write store (one per region in practice) ────────────────────────

_stores: Dict[str, Dict[str, Any]] = {}
_clocks: Dict[str, Dict[str, int]] = {}
_lock = threading.Lock()

def write_local(region: str, key: str, value: Any) -> Dict:
    with _lock:
        if region not in _stores:
            _stores[region] = {}
            _clocks[region] = {}
        ts = time.time()
        vc = dict(_clocks.get(region, {}))
        vc[region] = vc.get(region, 0) + 1
        _clocks[region] = vc
        _stores[region][key] = {'value': value, 'ts': ts, 'vc': vc}
        return {'region': region, 'key': key, 'value': value, 'ts': ts, 'vc': vc}


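def replicate_async(change: Dict, target_regions: List[str]) -> List[threading.Thread]:
    """
    Push a change to target regions asynchronously.
    In production this is a message queue (Kafka, Kinesis) + consumer.
    Returns the started threads so callers can join() them. They are daemon
    threads, so replication that is still in flight is lost if the process exits.
    """
    def _apply(region, change):
        time.sleep(0.01)   # simulate network latency
        with _lock:
            if region not in _stores:
                _stores[region] = {}
            # Merge the sender's vector clock into the receiver's (element-wise
            # max) so later local writes are ordered after this change.
            merged_vc = dict(_clocks.get(region, {}))
            for r, n in change.get('vc', {}).items():
                merged_vc[r] = max(merged_vc.get(r, 0), n)
            _clocks[region] = merged_vc
            existing = _stores[region].get(change['key'])
            if existing:
                # The stored version may be older or concurrent; this sketch
                # resolves both cases with last-write-wins.
                resolved = resolve_conflict(change['key'], existing, change, 'lww')
                _stores[region][change['key']] = resolved
            else:
                _stores[region][change['key']] = change

    threads = []
    for region in target_regions:
        t = threading.Thread(target=_apply, args=(region, change), daemon=True)
        t.start()
        threads.append(t)
    return threads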


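def resolve_conflict(key: str, version_a: Dict, version_b: Dict,
                     strategy: str = 'lww') -> Dict:
    """
    Resolve a write-write conflict between two versions of a key.
    Returns the winning (or, for 'union', merged) version.
    """
    if strategy == 'lww':
        # Later timestamp wins; ties go to version_a.
        return version_a if version_a['ts'] >= version_b['ts'] else version_b
    if (strategy == 'union' and isinstance(version_a['value'], list)
            and isinstance(version_b['value'], list)):
        # Set union (elements must be hashable). Sorting by repr makes the
        # merged value independent of argument order, so replicas converge.
        merged = sorted(set(version_a['value']) | set(version_b['value']), key=repr)
        return {**version_a, 'value': merged,
                'ts': max(version_a['ts'], version_b['ts'])}
    # Fallback: compare vector clocks. A version wins outright only if its
    # clock is at least as high as the other's for every region.
    vc_a, vc_b = version_a.get('vc', {}), version_b.get('vc', {})
    regions = set(vc_a) | set(vc_b)
    a_dominates = all(vc_a.get(r, 0) >= vc_b.get(r, 0) for r in regions)
    b_dominates = all(vc_b.get(r, 0) >= vc_a.get(r, 0) for r in regions)
    if a_dominates != b_dominates:
        return version_a if a_dominates else version_b
    # Equal or concurrent clocks carry no ordering; fall back to timestamps.
    return version_a if version_a['ts'] >= version_b['ts'] else version_b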


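def route_write(user_id: int, key: str, value: Any,
                home_region_map: Dict[int, str]) -> Dict:
    """Route a write to the user's home region to minimize conflicts."""
    # Users without an assigned home region fall back to a default.
    region = home_region_map.get(user_id, 'us-east-1')
    change = write_local(region, key, value)
    # Snapshot the region list under the lock: replication threads can add
    # regions to _stores concurrently, and iterating a dict while it changes
    # size raises RuntimeError. Only regions already present in _stores
    # receive the change.
    with _lock:
        other_regions = [r for r in _stores if r != region]
    replicate_async(change, other_regions)
    return change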

Conflict Frequency and Mitigation

In practice, write-write conflicts are rare when user-home-region routing is applied — the vast majority of writes are isolated to a single region by design. Conflicts arise mainly at region boundaries (cross-user shared data, global configuration) or during region failover when traffic is temporarily rerouted to a non-home region. Conflict detection pipelines should track conflict rate as an operational metric; a sustained rise indicates either a routing misconfiguration or a hotspot that needs application-level sharding.
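
Tracking conflict rate as a metric could be done with a sliding window like the sketch below. The class name ConflictRateTracker and the five-minute default window are assumptions, and timestamps are passed in explicitly so the behavior is deterministic. In a real deployment these counts would more likely be exported to an existing metrics system than held in process memory.

```python
from collections import deque
from typing import Deque, Tuple


class ConflictRateTracker:
    """Fraction of replicated writes that needed conflict resolution, per window."""

    def __init__(self, window_seconds: float = 300.0):
        self.window = window_seconds
        self._events: Deque[Tuple[float, bool]] = deque()

    def record(self, was_conflict: bool, now: float) -> None:
        """Record one applied write; `now` is a timestamp in seconds."""
        self._events.append((now, was_conflict))
        self._evict(now)

    def rate(self, now: float) -> float:
        """Conflict rate over the last `window` seconds (0.0 if no writes)."""
        self._evict(now)
        if not self._events:
            return 0.0
        conflicts = sum(1 for _, was_conflict in self._events if was_conflict)
        return conflicts / len(self._events)

    def _evict(self, now: float) -> None:
        # Drop events that have aged out of the window.
        while self._events and self._events[0][0] <= now - self.window:
            self._events.popleft()
```

An alert on rate() staying above a chosen threshold for several consecutive windows is one way to surface the sustained rise described above; the threshold itself depends on the workload.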

Frequently Asked Questions

What is the difference between active-active and active-passive?

In active-active, all regions simultaneously accept writes and serve reads. There is no failover — every region is always live. In active-passive, one region (the primary) handles all traffic; passive standbys replicate but do not serve traffic. Active-passive eliminates conflict resolution complexity but requires a failover step (seconds to minutes) on primary failure and wastes standby capacity. Active-active provides higher availability and lower global latency at the cost of conflict handling complexity.

How frequent are write-write conflicts in active-active systems?

With user-home-region routing, conflicts are rare: as a rule of thumb less than 0.1% of writes, though the actual rate depends on the workload and should be measured. Conflicts become frequent only on shared mutable data accessed from multiple regions simultaneously (global counters, shared sessions, configuration records). These hotspots should be identified and handled with CRDTs or a single-region coordination pattern rather than multi-region active-active writes.

Why are CRDTs useful for active-active conflict resolution?

CRDTs (conflict-free replicated data types) have mathematically proven merge operations that are commutative, associative, and idempotent. Concurrent updates from any number of regions always converge to the same final state when merged, without any coordination or conflict detection step. This makes them ideal for counters, sets, and flags in active-active systems where coordination overhead must be minimized.

How quickly does GeoDNS fail over to a backup region?

GeoDNS failover speed is bounded by the DNS TTL of the record, typically 60 to 300 seconds. DNS caches across the internet must expire and re-resolve before all clients see the new region, and resolvers or clients that ignore the TTL can take longer. Anycast-based routing (BGP withdrawal) usually converges faster, often in 30 to 90 seconds depending on BGP propagation. Global load balancers such as AWS Global Accelerator expose static anycast IP addresses and make the health-aware routing decision at the provider's edge, so they can shift traffic to another region within seconds without waiting for DNS expiry.
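
As a rough worked example of the DNS bound, failover time can be estimated as the time to detect the failure plus the record's TTL. The health-check interval and failure threshold below are assumed parameters, and the estimate only holds if resolvers honor the TTL.

```python
def worst_case_dns_failover_s(check_interval_s: float,
                              failure_threshold: int,
                              ttl_s: float) -> float:
    """
    Upper-bound estimate of DNS-based failover time in seconds.
    Detection takes `failure_threshold` consecutive failed health checks;
    after that, a client may keep a cached answer for up to one full TTL.
    """
    detection_s = check_interval_s * failure_threshold
    return detection_s + ttl_s
```

With 10-second health checks, a threshold of 3 failures, and a 60-second TTL, the estimate is 90 seconds; raising the TTL to 300 seconds pushes it to 330 seconds.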