System Design Fundamentals: CAP Theorem, Consistency, and Replication

Understanding distributed-systems fundamentals is essential for any system design interview: these concepts explain the trade-offs behind every architectural decision, and they are expected knowledge at every FAANG-scale company and in any role involving distributed systems.

CAP Theorem

In a distributed system, you can guarantee at most two of the following three properties simultaneously:

  • Consistency (C): every read receives the most recent write or an error. All nodes see the same data at the same time.
  • Availability (A): every request receives a response (not an error), but the response might not be the most recent data.
  • Partition Tolerance (P): the system continues operating even when network messages are dropped between nodes.

Since network partitions are a reality in distributed systems, the real choice is between C and A during a partition:

  • CP systems: HBase, ZooKeeper, etcd, MongoDB (with majority write concern). During a partition, they return an error or time out rather than serve stale data.
  • AP systems: Cassandra, CouchDB, DynamoDB, Riak. During a partition, they return potentially stale data rather than an error.
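The CP/AP split above can be made concrete with a toy sketch (not any real database's API): a replica cut off from its leader must either refuse reads or answer from its possibly stale local copy. The `Replica` class and its method names are invented for illustration.

```python
# Toy illustration of the CP vs AP choice a replica faces during a
# network partition. "partitioned" means it cannot reach the leader.

class Replica:
    def __init__(self, value, partitioned=False):
        self.value = value              # local, possibly stale copy
        self.partitioned = partitioned  # cut off from the leader?

    def read_cp(self):
        """CP choice: refuse to answer rather than risk staleness."""
        if self.partitioned:
            raise RuntimeError("unavailable: cannot confirm latest value")
        return self.value

    def read_ap(self):
        """AP choice: always answer, even if the value may be stale."""
        return self.value

replica = Replica(value="v1", partitioned=True)
print(replica.read_ap())   # "v1" (possibly stale)
try:
    replica.read_cp()
except RuntimeError as e:
    print(e)               # unavailable: cannot confirm latest value
```

The same node, the same data: the only difference is whether the system prefers an error or a stale answer while the partition lasts.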

PACELC Model (Extension of CAP)

CAP only considers partition scenarios. PACELC also considers the normal case:

  • If Partition: choose between Availability vs Consistency (like CAP)
  • Else (normal operation): choose between Latency vs Consistency

DynamoDB: PA/EL – available on partition, low latency normally (eventual consistency). Spanner: PC/EC – consistent on partition, consistent normally (higher latency, uses TrueTime).

Consistency Models

Strong Consistency

After a write completes, all subsequent reads return the new value. Like a single-node database. Expensive: requires synchronous replication or consensus (Raft/Paxos). Used in: financial systems, configuration stores (Etcd/Zookeeper).

Eventual Consistency

If no new writes occur, all replicas will eventually converge to the same value. Reads may see stale data in the meantime. High availability and low latency. Used in: DNS, email, social media counters, shopping carts (Amazon Dynamo paper).

Causal Consistency

Operations that are causally related are seen in the same order by all nodes: if you write A and then write B (where B depends on A), every node observes A before B. Concurrent, unrelated writes may still be seen in different orders. Example: MongoDB's causally consistent sessions (3.6+).

Read Your Own Writes

After a write, the same client always reads the new value; other clients may still see the old one. Implemented by routing a client's reads to the primary for a short window after its writes, by session affinity, or by tracking versions. Important for user experience (post a comment, immediately see it).
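One common implementation is the "route to primary after your own recent write" trick. A minimal sketch, with an invented `Store` class standing in for a primary/replica pair and a fixed replication-lag window:

```python
import time

class Store:
    """Toy primary/replica pair with asynchronous replication (illustrative)."""
    def __init__(self, lag=0.5):
        self.primary = {}
        self.replica = {}
        self.lag = lag               # assumed max replication lag, seconds
        self.last_write = {}         # session_id -> time of that session's last write

    def write(self, session, key, value):
        self.primary[key] = value    # replica catches up later
        self.last_write[session] = time.monotonic()

    def replicate(self):
        self.replica.update(self.primary)  # stand-in for async replication

    def read(self, session, key):
        # Route a session to the primary shortly after its own write;
        # otherwise serve the (possibly stale) replica.
        recent = time.monotonic() - self.last_write.get(session, -1e9) < self.lag
        return self.primary[key] if recent else self.replica.get(key)

s = Store()
s.write("alice", "comment", "hello")
print(s.read("alice", "comment"))  # "hello": Alice reads her own write
print(s.read("bob", "comment"))    # None: Bob hits the lagging replica
```

Alice always sees her own comment; Bob may briefly not. Real systems replace the fixed window with replication positions (e.g. log sequence numbers), but the routing idea is the same.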

Replication Strategies

Single Leader (Master-Replica)

All writes go to the leader and are replicated to followers. Reads from followers give eventual consistency; reads from the leader give strong consistency. Failover: promote a follower. Used by: PostgreSQL, MySQL, MongoDB. Simple, but the leader is a write bottleneck and a single point of failure until failover completes.

Multi-Leader

Multiple nodes accept writes. Useful for multi-datacenter deployments (each DC has a leader). Conflict resolution is needed: last-write-wins, vector clocks, or CRDTs. Used by: CouchDB. Collaborative editors face the same concurrent-write problem and resolve it with CRDTs or operational transformation (Google Docs uses OT).
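The simplest conflict-resolution strategy, last-write-wins, fits in a few lines. A sketch with invented example data; note that LWW silently discards the "losing" write, which is exactly why vector clocks and CRDTs exist:

```python
# Last-write-wins merge of two concurrent versions of the same key.
# Each leader tags writes with (timestamp, node_id); the node_id breaks
# timestamp ties deterministically so all nodes pick the same winner.

def lww_merge(a, b):
    """a, b: (value, timestamp, node_id) tuples. Return the winning version."""
    return a if (a[1], a[2]) > (b[1], b[2]) else b

dc_east = ("cart: [book]", 1700000005, "east")
dc_west = ("cart: [book, pen]", 1700000007, "west")
print(lww_merge(dc_east, dc_west))  # west wins: newer timestamp
```

Here the east datacenter's concurrent cart update is lost entirely; a CRDT (e.g. an observed-remove set) would instead merge both carts.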

Leaderless (Dynamo-style)

Any node accepts reads and writes. Quorum: write to W nodes, read from R nodes. If W + R > N (replication factor), reads see at least one up-to-date write. DynamoDB, Cassandra, Riak use this model.

N = replication factor (e.g., 3)
W = write quorum (e.g., 2)
R = read quorum (e.g., 2)
W + R > N: read and write quorums overlap, so a read sees the latest acknowledged write
W = 1, R = 1: maximum availability and lowest latency, eventual consistency only
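The overlap argument can be shown in a few lines of Python (illustrative only; the replica list and version numbers are made up). A quorum read queries R replicas and returns the value with the highest version, and because W + R > N, at least one queried replica holds the latest acknowledged write:

```python
# Quorum read over N=3 replicas. A write is acknowledged by W=2 nodes,
# a read queries R=2 nodes, and W + R > N guarantees the two sets overlap.

N, W, R = 3, 2, 2
assert W + R > N  # overlap condition

# Each replica holds (version, value); replica index 2 is lagging.
replicas = [(7, "v7"), (7, "v7"), (6, "v6")]

def quorum_read(replicas, R):
    responses = replicas[:R]   # any R reachable replicas
    return max(responses)[1]   # highest version number wins

print(quorum_read(replicas, R))      # "v7"
print(quorum_read(replicas[1:], R))  # still "v7": the read set must
                                     # include at least one fresh replica
```

Caveat worth mentioning in an interview: quorum overlap alone is not full strong consistency; sloppy quorums and concurrent writes can still produce anomalies, which is why these systems also need read repair and anti-entropy.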

ACID vs BASE

ACID (traditional databases)

  • Atomicity: all-or-nothing transactions
  • Consistency: transactions move the database between states that satisfy integrity constraints (a different "C" than in CAP)
  • Isolation: concurrent transactions do not interfere
  • Durability: committed data survives failures

BASE (NoSQL systems)

  • Basically Available: system is available even under failure
  • Soft state: state may change without input (replication catching up)
  • Eventual consistency: will become consistent given time

Consensus Algorithms

  • Paxos: the theoretical foundation; notoriously hard to implement correctly. Used in Google Chubby. Apache ZooKeeper uses ZAB, a closely related atomic-broadcast protocol.
  • Raft: designed for understandability. Leader election + log replication. Used in: Etcd, CockroachDB, TiKV, Consul. Most common in new systems.

Consensus used for: leader election, distributed locks, configuration stores, replicated state machines.
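The heart of Raft's log replication is a single rule: an entry is committed once it is stored on a majority of the cluster. A minimal sketch of that rule (the `match_index` bookkeeping mirrors the Raft paper's terminology, but this function is an invented illustration, not a full implementation):

```python
# Raft's commit rule in one function: a log entry at entry_index is
# committed once a majority of nodes have replicated it.

def is_committed(match_index, cluster_size, entry_index):
    """match_index: highest replicated log index on each node
    (leader included). Returns True if a majority hold entry_index."""
    acks = sum(1 for i in match_index if i >= entry_index)
    return acks > cluster_size // 2

# 5-node cluster: leader at index 10, followers at 10, 9, 4, 4.
print(is_committed([10, 10, 9, 4, 4], 5, 9))   # True: 3 of 5 have entry 9
print(is_committed([10, 10, 9, 4, 4], 5, 10))  # False: only 2 of 5
```

The majority requirement is what makes consensus partition-tolerant: any two majorities intersect, so a new leader elected after a failure is guaranteed to have every committed entry.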

Interview Application

When asked about a system design, state the consistency requirement upfront:

  • Financial transactions, seat booking: strong consistency required
  • Social media likes, view counts, shopping carts: eventual consistency acceptable
  • User settings, profile reads: read-your-own-writes minimum
  • Leaderboard rankings: eventual consistency with bounded staleness (seconds)

Key Terms to Know

  • Write-ahead log (WAL): durability mechanism, changes written to log before applying
  • Hinted handoff: Dynamo-style stores buffer writes for temporarily unavailable nodes
  • Read repair: detect and fix stale replicas during read operations
  • Anti-entropy: background process (Merkle tree comparison) to sync replicas
  • Vector clock / Version vector: track causality for conflict detection in multi-leader and leaderless systems
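The version-vector comparison behind that last bullet is short enough to sketch. Two versions are ordered if one vector dominates the other component-wise; otherwise the writes were concurrent and the application must resolve a conflict (the dict-based representation here is an illustrative choice):

```python
# Version-vector comparison: decide whether two updates are causally
# ordered or concurrent. Keys are node ids, values are per-node counters.

def compare(a, b):
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "a happened-before b"
    if b_le_a:
        return "b happened-before a"
    return "concurrent (conflict)"

print(compare({"n1": 2, "n2": 1}, {"n1": 3, "n2": 1}))  # a happened-before b
print(compare({"n1": 2, "n2": 1}, {"n1": 1, "n2": 2}))  # concurrent (conflict)
```

An ordered pair means the newer version can safely overwrite the older; a concurrent pair is exactly the case where Dynamo-style stores return siblings for the client (or a CRDT) to merge.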

