Why Multi-Region?
A single-region deployment, even one spread across several availability zones, goes down entirely during a region-wide outage. Multi-region provides: (1) High availability: survive a full region outage with little or no downtime. (2) Latency: serve users from a geographically close region (for example, roughly 50ms instead of 200ms round trip). (3) Compliance: meet data residency requirements (for example, keeping EU personal data in EU regions to simplify GDPR compliance). (4) Disaster recovery: an RTO (Recovery Time Objective) of minutes, not hours.
Architecture Patterns
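Active-Passive: one primary region handles all writes; standby regions serve reads and can be promoted on failure. Consistency is simpler because there is a single write source. Failover takes 1-5 minutes (health check detection, replica promotion, and DNS propagation). Writes need no cross-region coordination, but users far from the primary pay a cross-region round trip on every write.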
Active-Active: all regions accept reads and writes simultaneously. Lowest latency for users globally. Requires either distributed consensus or eventual consistency with conflict resolution: concurrent writes to the same record in different regions must be reconciled (last-write-wins, vector clocks, or CRDTs). Examples: DynamoDB Global Tables and Cassandra (eventually consistent, last-write-wins) and CockroachDB (Raft consensus).
Active-Active with region affinity: each user is pinned to a home region by a hash of their user_id. Writes from a user always go to that home region. Cross-region writes happen only for shared resources. This avoids most conflict scenarios while still serving reads globally.
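A minimal sketch of the region-affinity routing described above. The region list, the function names home_region and route, and the choice of SHA-256 are illustrative assumptions, not part of the original design.

```python
import hashlib

# Hypothetical region list; the article names only these two regions.
REGIONS = ["us-east-1", "eu-west-1"]


def home_region(user_id: str, regions: list[str] = REGIONS) -> str:
    """Map a user to a fixed home region using a stable hash of user_id."""
    # hashlib is used instead of the built-in hash(), which is salted per
    # process and would give different answers on different servers.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return regions[int.from_bytes(digest[:8], "big") % len(regions)]


def route(user_id: str, is_write: bool, local_region: str) -> str:
    """Send writes to the user's home region; serve reads from the local region."""
    return home_region(user_id) if is_write else local_region
```

Plain modulo hashing remaps many users whenever the region list changes, so a real deployment would more likely store the home region per user or use consistent hashing.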
Data Replication
Primary Region (us-east-1):
  PostgreSQL Primary → Kafka (CDC via Debezium)
    → Kafka MirrorMaker 2 → Secondary Region Kafka
    → PostgreSQL Read Replica (eu-west-1)
Secondary Region (eu-west-1):
  Read Replica: serves reads with ~100ms replication lag
  Promoted to primary on failover
CDC (Change Data Capture) via Debezium captures every committed DB change as a Kafka event. Replication lag is typically <1 second under normal write volume. Monitoring: track the replication_lag metric and alert if it exceeds 30 seconds.
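The lag check can be sketched as follows. PostgreSQL reports WAL positions as LSNs of the form "high/low" in hexadecimal, so byte lag is the difference between two positions. The helper names and the way the lag values are obtained are assumptions; only the 30-second threshold comes from the text.

```python
ALERT_THRESHOLD_SECONDS = 30.0  # alerting threshold stated in the text


def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to an absolute byte offset."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(primary_lsn: str, replica_lsn: str) -> int:
    """WAL bytes the replica still has to receive or replay (never negative)."""
    return max(0, lsn_to_bytes(primary_lsn) - lsn_to_bytes(replica_lsn))


def should_alert(replication_lag_seconds: float,
                 threshold: float = ALERT_THRESHOLD_SECONDS) -> bool:
    """True when the replication_lag metric exceeds the alerting threshold."""
    return replication_lag_seconds > threshold
```

Byte lag and time lag answer different questions: bytes show how much data is outstanding, while seconds map directly onto the RPO, so tracking both is a reasonable choice.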
Global Load Balancing
Route traffic to the nearest healthy region: (1) GeoDNS: DNS returns different IPs based on the client’s geographic location. With TTL=60s, clients can keep using a failed region's IP for up to 60s after the record changes, on top of health check detection time, and some resolvers cache longer than the TTL. (2) Anycast routing: announce the same IP from multiple regions; BGP routes each client to the nearest one. Used by Cloudflare. (3) Global load balancer (AWS Global Accelerator, GCP Global LB): layer-4/7 routing with health checks; failover in <30 seconds. Production recommendation: use a global LB with health checks for API traffic, and GeoDNS for static assets served via CDN.
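The routing decision itself is small: among regions that pass health checks, pick the one with the lowest latency to the client. This sketch assumes latency measurements and health status are supplied by the caller; pick_region and its inputs are hypothetical names.

```python
def pick_region(latency_ms: dict[str, float], healthy: set[str]) -> str:
    """Return the healthy region with the lowest measured latency."""
    candidates = {region: ms for region, ms in latency_ms.items() if region in healthy}
    if not candidates:
        raise RuntimeError("no healthy region available")
    return min(candidates, key=candidates.get)
```

For example, a client measuring 40ms to us-east-1 and 120ms to eu-west-1 is sent to us-east-1 while it is healthy, and to eu-west-1 once us-east-1 drops out of the healthy set.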
Failover Procedure
- Health checks detect primary region failure (3 consecutive failures over 30s)
- Global LB stops routing new traffic to the failed region
- Verify the replica's replication lag was within the acceptable threshold before promoting (on the replica, compare pg_last_wal_receive_lsn() with pg_last_wal_replay_lsn() and check the last recorded lag metric)
- Promote the read replica in the secondary region to primary (pg_promote(), available in PostgreSQL 12 and later)
- Update application config to point writes to the new primary
- Alert on-call team; begin RCA
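RTO: 1-5 minutes. RPO (Recovery Point Objective): writes not yet replicated when the primary fails are lost, so the RPO equals the replication lag at that moment (typically <1s; up to about 30s if lag was just under the alerting threshold).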
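The detection and decision steps above can be sketched as below. The 10-second probe interval is an assumption chosen so that 3 consecutive failures span 30s; halting automatic promotion when lag exceeds the threshold is also an assumption about policy, and the step strings are placeholders for real automation.

```python
from dataclasses import dataclass

PROBE_INTERVAL_SECONDS = 10  # assumed: 3 failed probes at 10s spacing = 30s


@dataclass
class HealthTracker:
    """Counts consecutive failed probes for one region."""
    failures_to_trip: int = 3
    consecutive_failures: int = 0

    def record(self, ok: bool) -> bool:
        """Record one probe result; return True once the region counts as failed."""
        self.consecutive_failures = 0 if ok else self.consecutive_failures + 1
        return self.consecutive_failures >= self.failures_to_trip


def plan_failover(last_known_lag_seconds: float,
                  max_lag_seconds: float = 30.0) -> list[str]:
    """Return the ordered failover steps, stopping early if lag is too high."""
    steps = ["stop routing traffic to the failed region"]
    if last_known_lag_seconds > max_lag_seconds:
        # Assumed policy: promoting now would lose too much data, so escalate.
        steps.append("page on-call: lag above threshold, promotion needs approval")
        return steps
    steps += [
        "promote replica: SELECT pg_promote()",
        "repoint application writes to the new primary",
        "alert on-call and begin RCA",
    ]
    return steps
```

Keeping the lag check ahead of promotion matters because promotion is hard to undo: once the replica accepts writes, its timeline diverges from the old primary's.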
Conflict Resolution for Active-Active
When two regions accept writes for the same record concurrently: (1) Last-Write-Wins (LWW): compare timestamps; the most recent update wins. Risk: clock skew between regions can cause an earlier write to win incorrectly. Use hybrid logical clocks (HLC) instead of raw wall clocks so that a write that causally follows another always gets a larger timestamp; truly concurrent writes are still resolved arbitrarily, and the losing write is discarded. (2) Application-level merging: for counters, use CRDTs (Conflict-free Replicated Data Types). A grow-only counter keeps one count per node, merges by taking the per-node maximum, and reads as the sum of all entries. (3) Conflict detection + application resolution: detect conflicts (same record modified in two regions during the same window), store both versions, and let the application resolve them on the next read. Used by Amazon’s shopping cart (Dynamo paper).
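A compact sketch of the first two strategies: an HLC that keeps timestamps causally ordered despite clock skew, an LWW merge over those timestamps, and a grow-only counter (G-Counter). Class and function names are illustrative, and the node name is used as a tie-breaker by assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(order=True, frozen=True)
class HLCTimestamp:
    wall_ms: int   # highest physical time seen so far
    counter: int   # logical counter for events within the same wall_ms
    node: str      # assumed tie-breaker so timestamps are totally ordered


class HLC:
    """Hybrid logical clock for one node; now_ms supplies the local wall clock."""

    def __init__(self, node: str, now_ms: Callable[[], int]):
        self.node, self.now_ms = node, now_ms
        self.wall, self.counter = 0, 0

    def tick(self) -> HLCTimestamp:
        """Timestamp a local write."""
        physical = self.now_ms()
        if physical > self.wall:
            self.wall, self.counter = physical, 0
        else:
            self.counter += 1
        return HLCTimestamp(self.wall, self.counter, self.node)

    def observe(self, remote: HLCTimestamp) -> HLCTimestamp:
        """Advance past a timestamp received from another region."""
        new_wall = max(self.wall, remote.wall_ms, self.now_ms())
        if new_wall == self.wall == remote.wall_ms:
            self.counter = max(self.counter, remote.counter) + 1
        elif new_wall == self.wall:
            self.counter += 1
        elif new_wall == remote.wall_ms:
            self.counter = remote.counter + 1
        else:
            self.counter = 0
        self.wall = new_wall
        return HLCTimestamp(self.wall, self.counter, self.node)


def lww_merge(a: tuple, b: tuple) -> tuple:
    """Each argument is (HLCTimestamp, value); keep the later write."""
    return a if a[0] >= b[0] else b


def gcounter_merge(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Merge two G-Counters by taking the per-node maximum."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def gcounter_value(counter: dict[str, int]) -> int:
    """The counter's value is the sum of all per-node counts."""
    return sum(counter.values())
```

If region B's wall clock runs behind region A's, a write B makes after observing A's write would lose under wall-clock LWW; with the HLC, B's timestamp is pushed past A's, so the later write wins.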
Key Design Decisions
- Active-passive for most services — simpler consistency, acceptable 1-5min failover
- Active-active only for user-specific data where region affinity eliminates most conflicts
- CDC via Kafka for cross-region replication — replayable, auditable, decoupled
- Global LB with health checks — faster failover than GeoDNS alone
- Monitor replication lag continuously: a stale replica serves outdated reads and widens the data-loss window on failover