Multi-Region Architecture Low-Level Design

Why Multi-Region?

Single-region deployments fail entirely during a full region outage (multi-AZ redundancy covers an AZ failure, but not the region itself). Multi-region provides: (1) High availability — survive a full region outage without downtime. (2) Latency — serve users from geographically close regions (e.g., 50ms instead of 200ms round trips). (3) Compliance — data residency requirements (e.g., keeping EU user data stored in the EU). (4) Disaster recovery — an RTO (Recovery Time Objective) of minutes, not hours.

Architecture Patterns

Active-Passive: one primary region handles all writes; standby regions serve reads and can be promoted on failure. Simpler consistency (single write source). Failover takes 1-5 minutes (health-check detection + DNS propagation + replica promotion). All writes go to the single primary, so distant users see no write-latency benefit over a single-region deployment.

Active-Active: all regions accept reads and writes simultaneously. Lowest latency for users globally. Requires distributed consensus or eventual consistency — concurrent writes to different regions for the same record must be resolved (last-write-wins, vector clocks, or CRDT). Used by DynamoDB Global Tables, Cassandra, CockroachDB.

Active-Active with region affinity: users are pinned to a primary region by their user_id hash. Writes from user always go to their home region. Cross-region writes happen only for shared resources. Avoids most conflict scenarios while still serving reads globally.
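The region-affinity pinning above can be sketched as a stable hash of the user_id onto a region list (the region names and function are illustrative assumptions, not part of the original design):

```python
import hashlib

# Illustrative region list -- an assumption for this sketch.
REGIONS = ["us-east-1", "eu-west-1", "ap-southeast-1"]

def home_region(user_id: str) -> str:
    """Pin a user to a home region by hashing their user_id.

    Uses a stable cryptographic hash (not Python's process-randomized
    hash()) so the mapping is identical across processes and deploys.
    """
    digest = hashlib.sha256(user_id.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(REGIONS)
    return REGIONS[index]
```

Writes for a user always route to `home_region(user_id)`; reads can be served from any region. Note that plain modulo remaps many users if a region is added or removed — a consistent-hashing ring would limit that churn.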

Data Replication

Primary Region (us-east-1):
  PostgreSQL Primary → Kafka (CDC via Debezium)
                     → Kafka MirrorMaker 2 → Secondary Region Kafka
                                           → PostgreSQL Read Replica (eu-west-1)

Secondary Region (eu-west-1):
  Read Replica: serves reads with ~100ms replication lag
  Promoted to primary on failover

CDC (Change Data Capture) via Debezium captures every DB change as a Kafka event. Replication lag is typically <1 second at normal write volume. Monitoring: track a replication_lag metric and alert if it exceeds 30 seconds.
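PostgreSQL replication lag can be measured in bytes by diffing WAL positions (LSNs) — in production you would compare `pg_current_wal_lsn()` on the primary against `pg_last_wal_receive_lsn()` on the replica; this sketch only shows the LSN arithmetic, with the 30-second-analogous byte threshold as an illustrative assumption:

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN like '0/16B3748' to an absolute byte offset.

    The LSN format is 'high/low': two hex numbers, the high part shifted
    left 32 bits.
    """
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def replication_lag_bytes(primary_lsn: str, replica_lsn: str) -> int:
    """Bytes of WAL the replica has not yet received from the primary."""
    return lsn_to_bytes(primary_lsn) - lsn_to_bytes(replica_lsn)

# Illustrative alert threshold in bytes (assumption, not from the design).
LAG_ALERT_BYTES = 64 * 1024 * 1024

def lag_alert(primary_lsn: str, replica_lsn: str) -> bool:
    return replication_lag_bytes(primary_lsn, replica_lsn) > LAG_ALERT_BYTES
```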

Global Load Balancing

Route traffic to the nearest healthy region: (1) GeoDNS: DNS returns different IPs based on the client’s geographic location. TTL=60s — failover takes up to 60s. (2) Anycast routing: announce the same IP from multiple regions; BGP routes to the nearest. Used by Cloudflare. (3) Global load balancer (AWS Global Accelerator, GCP Global LB): layer-4/7 routing with health checks; failover in <30 seconds. Production recommendation: use a global LB with health checks for API traffic; GeoDNS for static assets via CDN.
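The health-checked routing decision above reduces to: walk the client's proximity-ordered region list and pick the first healthy one. A minimal sketch, assuming health state is supplied by the load balancer's probes (the function name and inputs are illustrative):

```python
def pick_region(preferred_order: list[str], healthy: set[str]) -> str:
    """Return the nearest healthy region.

    preferred_order: regions sorted by proximity to the client.
    healthy: regions currently passing health checks.
    """
    for region in preferred_order:
        if region in healthy:
            return region
    raise RuntimeError("no healthy region available")
```

When the nearest region fails its health checks, the next one in proximity order absorbs its traffic automatically — no DNS TTL to wait out.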

Failover Procedure

  1. Health checks detect primary region failure (3 consecutive failures over 30s)
  2. Global LB stops routing new traffic to the failed region
  3. Check the replica's received WAL position (pg_last_wal_receive_lsn() in PostgreSQL) to confirm replication lag is within the acceptable threshold
  4. Promote the read replica in the secondary region to primary (pg_promote() in PostgreSQL)
  5. Update application config to point writes to the new primary
  6. Alert the on-call team; begin RCA
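The "3 consecutive failures" detection in step 1 can be sketched as a small counter that trips only after an unbroken run of failed probes and resets on any success (class and threshold names are illustrative):

```python
FAILURE_THRESHOLD = 3  # consecutive failed probes before declaring the region down

class HealthChecker:
    """Tracks consecutive health-check failures for one region."""

    def __init__(self) -> None:
        self.consecutive_failures = 0

    def record(self, ok: bool) -> bool:
        """Record one probe result; return True when the region is declared failed."""
        self.consecutive_failures = 0 if ok else self.consecutive_failures + 1
        return self.consecutive_failures >= FAILURE_THRESHOLD
```

Requiring consecutive failures (rather than any single failure) avoids triggering a multi-minute failover on one dropped probe.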

RTO: 1-5 minutes. RPO (Recovery Point Objective): data up to the replication lag at the moment of failure — typically <1 second, worst case ~30 seconds (the monitoring alert threshold).

Conflict Resolution for Active-Active

When two regions accept writes for the same record concurrently: (1) Last-Write-Wins (LWW): compare timestamps; most recent update wins. Risk: clock skew between regions can cause earlier writes to incorrectly win. Use hybrid logical clocks (HLC) instead of wall clocks. (2) Application-level merging: for counters, use CRDTs (Conflict-free Replicated Data Types) — a CRDT counter merges by taking the max of each node’s value. (3) Conflict detection + manual resolution: detect conflicts (same record modified in two regions during the same window), store both versions, application resolves on next read. Used by Amazon’s shopping cart (Dynamo paper).
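The G-Counter merge rule described in (2) — each region owns a slot, and merging takes the per-region max — fits in a few lines. A minimal sketch (class name illustrative):

```python
class GCounter:
    """Grow-only CRDT counter: one slot per region, merge by per-slot max."""

    def __init__(self) -> None:
        self.counts: dict[str, int] = {}

    def increment(self, region: str, n: int = 1) -> None:
        # Each region only ever increments its own slot.
        self.counts[region] = self.counts.get(region, 0) + n

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Taking the max per region is commutative, associative, and
        # idempotent, so replicas converge regardless of merge order.
        for region, n in other.counts.items():
            self.counts[region] = max(self.counts.get(region, 0), n)
```

Concurrent increments in different regions never conflict: each touches only its own slot, and repeated or out-of-order merges always converge to the same total.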

Key Design Decisions

  • Active-passive for most services — simpler consistency, acceptable 1-5min failover
  • Active-active only for user-specific data where region affinity eliminates most conflicts
  • CDC via Kafka for cross-region replication — replayable, auditable, decoupled
  • Global LB with health checks — faster failover than GeoDNS alone
  • Monitor replication lag continuously — stale replica is worse than no replica


