System Design Fundamentals: CAP Theorem, Consistency, and Replication

Understanding distributed-systems fundamentals is essential for any system design interview: these concepts explain the trade-offs behind every architectural decision, and they are expected knowledge at every FAANG-scale company and in any role involving distributed systems.

CAP Theorem

In a distributed system, you can guarantee at most two of the following three properties simultaneously:

  • Consistency (C): every read receives the most recent write or an error. All nodes see the same data at the same time.
  • Availability (A): every request receives a response (not an error), but the response might not be the most recent data.
  • Partition Tolerance (P): the system continues operating even when network messages are dropped between nodes.

Since network partitions are a reality in distributed systems, the real choice is between C and A during a partition:

  • CP systems: HBase, ZooKeeper, etcd, MongoDB (with majority write concern). During a partition, they return an error or time out rather than serve stale data.
  • AP systems: Cassandra, CouchDB, DynamoDB, Riak. During a partition, they return potentially stale data rather than an error.
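The CP/AP split above can be made concrete with a toy sketch (not any real database's API): a replica cut off from its leader must either refuse reads or answer from its possibly stale local copy. The `Replica` class and its method names are invented for illustration.

```python
# Toy illustration of the CP vs AP choice a replica faces during a
# network partition. "partitioned" means it cannot reach the leader.

class Replica:
    def __init__(self, value, partitioned=False):
        self.value = value              # local, possibly stale copy
        self.partitioned = partitioned  # cut off from the leader?

    def read_cp(self):
        """CP choice: refuse to answer rather than risk staleness."""
        if self.partitioned:
            raise RuntimeError("unavailable: cannot confirm latest value")
        return self.value

    def read_ap(self):
        """AP choice: always answer, even if the value may be stale."""
        return self.value

replica = Replica(value="v1", partitioned=True)
print(replica.read_ap())   # "v1" (possibly stale)
try:
    replica.read_cp()
except RuntimeError as e:
    print(e)               # unavailable: cannot confirm latest value
```

The same node, the same data: the only difference is whether the system prefers an error or a stale answer while the partition lasts.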

PACELC Model (Extension of CAP)

CAP only considers partition scenarios. PACELC also considers the normal case:

  • If Partition: choose between Availability vs Consistency (like CAP)
  • Else (normal operation): choose between Latency vs Consistency

DynamoDB: PA/EL – available on partition, low latency normally (eventual consistency). Spanner: PC/EC – consistent on partition, consistent normally (higher latency, uses TrueTime).

Consistency Models

Strong Consistency

After a write completes, all subsequent reads return the new value. Like a single-node database. Expensive: requires synchronous replication or consensus (Raft/Paxos). Used in: financial systems, configuration stores (Etcd/Zookeeper).

Eventual Consistency

If no new writes occur, all replicas will eventually converge to the same value. Reads may see stale data in the meantime. High availability and low latency. Used in: DNS, email, social media counters, shopping carts (Amazon Dynamo paper).

Causal Consistency

Operations that are causally related are seen in the same order by all nodes: if you write A and then write B (where B depends on A), every node observes A before B. Concurrent, unrelated writes may still be seen in different orders. Example: MongoDB's causally consistent sessions (3.6+).

Read Your Own Writes

After a write, the same client always reads the new value; other clients may still see the old one. Implemented by routing a client's reads to the primary for a short window after its writes, by session affinity, or by tracking versions. Important for user experience (post a comment, immediately see it).
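One common implementation is the "route to primary after your own recent write" trick. A minimal sketch, with an invented `Store` class standing in for a primary/replica pair and a fixed replication-lag window:

```python
import time

class Store:
    """Toy primary/replica pair with asynchronous replication (illustrative)."""
    def __init__(self, lag=0.5):
        self.primary = {}
        self.replica = {}
        self.lag = lag               # assumed max replication lag, seconds
        self.last_write = {}         # session_id -> time of that session's last write

    def write(self, session, key, value):
        self.primary[key] = value    # replica catches up later
        self.last_write[session] = time.monotonic()

    def replicate(self):
        self.replica.update(self.primary)  # stand-in for async replication

    def read(self, session, key):
        # Route a session to the primary shortly after its own write;
        # otherwise serve the (possibly stale) replica.
        recent = time.monotonic() - self.last_write.get(session, -1e9) < self.lag
        return self.primary[key] if recent else self.replica.get(key)

s = Store()
s.write("alice", "comment", "hello")
print(s.read("alice", "comment"))  # "hello": Alice reads her own write
print(s.read("bob", "comment"))    # None: Bob hits the lagging replica
```

Alice always sees her own comment; Bob may briefly not. Real systems replace the fixed window with replication positions (e.g. log sequence numbers), but the routing idea is the same.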

Replication Strategies

Single Leader (Master-Replica)

All writes go to the leader and are replicated to followers. Reads from followers give eventual consistency; reads from the leader give strong consistency. Failover: promote a follower. Used by: PostgreSQL, MySQL, MongoDB. Simple, but the leader is a write bottleneck and a single point of failure until failover completes.

Multi-Leader

Multiple nodes accept writes. Useful for multi-datacenter deployments (each DC has a leader). Conflict resolution is needed: last-write-wins, vector clocks, or CRDTs. Used by: CouchDB. Collaborative editors face the same concurrent-write problem and resolve it with CRDTs or operational transformation (Google Docs uses OT).
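The simplest conflict-resolution strategy, last-write-wins, fits in a few lines. A sketch with invented example data; note that LWW silently discards the "losing" write, which is exactly why vector clocks and CRDTs exist:

```python
# Last-write-wins merge of two concurrent versions of the same key.
# Each leader tags writes with (timestamp, node_id); the node_id breaks
# timestamp ties deterministically so all nodes pick the same winner.

def lww_merge(a, b):
    """a, b: (value, timestamp, node_id) tuples. Return the winning version."""
    return a if (a[1], a[2]) > (b[1], b[2]) else b

dc_east = ("cart: [book]", 1700000005, "east")
dc_west = ("cart: [book, pen]", 1700000007, "west")
print(lww_merge(dc_east, dc_west))  # west wins: newer timestamp
```

Here the east datacenter's concurrent cart update is lost entirely; a CRDT (e.g. an observed-remove set) would instead merge both carts.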

Leaderless (Dynamo-style)

Any node accepts reads and writes. Quorum: write to W nodes, read from R nodes. If W + R > N (replication factor), reads see at least one up-to-date write. DynamoDB, Cassandra, Riak use this model.

N = replication factor (e.g., 3)
W = write quorum (e.g., 2)
R = read quorum (e.g., 2)
W + R > N: read and write quorums overlap, so a read sees the latest acknowledged write
W = 1, R = 1: maximum availability and lowest latency, eventual consistency only
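The overlap argument can be shown in a few lines of Python (illustrative only; the replica list and version numbers are made up). A quorum read queries R replicas and returns the value with the highest version, and because W + R > N, at least one queried replica holds the latest acknowledged write:

```python
# Quorum read over N=3 replicas. A write is acknowledged by W=2 nodes,
# a read queries R=2 nodes, and W + R > N guarantees the two sets overlap.

N, W, R = 3, 2, 2
assert W + R > N  # overlap condition

# Each replica holds (version, value); replica index 2 is lagging.
replicas = [(7, "v7"), (7, "v7"), (6, "v6")]

def quorum_read(replicas, R):
    responses = replicas[:R]   # any R reachable replicas
    return max(responses)[1]   # highest version number wins

print(quorum_read(replicas, R))      # "v7"
print(quorum_read(replicas[1:], R))  # still "v7": the read set must
                                     # include at least one fresh replica
```

Caveat worth mentioning in an interview: quorum overlap alone is not full strong consistency; sloppy quorums and concurrent writes can still produce anomalies, which is why these systems also need read repair and anti-entropy.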

ACID vs BASE

ACID (traditional databases)

  • Atomicity: all-or-nothing transactions
  • Consistency: transactions move the database between states that satisfy integrity constraints (a different "C" than in CAP)
  • Isolation: concurrent transactions do not interfere
  • Durability: committed data survives failures

BASE (NoSQL systems)

  • Basically Available: system is available even under failure
  • Soft state: state may change without input (replication catching up)
  • Eventual consistency: will become consistent given time

Consensus Algorithms

  • Paxos: the theoretical foundation; notoriously hard to implement correctly. Used in Google Chubby. Apache ZooKeeper uses ZAB, a closely related atomic-broadcast protocol.
  • Raft: designed for understandability. Leader election + log replication. Used in: Etcd, CockroachDB, TiKV, Consul. Most common in new systems.

Consensus used for: leader election, distributed locks, configuration stores, replicated state machines.
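The heart of Raft's log replication is a single rule: an entry is committed once it is stored on a majority of the cluster. A minimal sketch of that rule (the `match_index` bookkeeping mirrors the Raft paper's terminology, but this function is an invented illustration, not a full implementation):

```python
# Raft's commit rule in one function: a log entry at entry_index is
# committed once a majority of nodes have replicated it.

def is_committed(match_index, cluster_size, entry_index):
    """match_index: highest replicated log index on each node
    (leader included). Returns True if a majority hold entry_index."""
    acks = sum(1 for i in match_index if i >= entry_index)
    return acks > cluster_size // 2

# 5-node cluster: leader at index 10, followers at 10, 9, 4, 4.
print(is_committed([10, 10, 9, 4, 4], 5, 9))   # True: 3 of 5 have entry 9
print(is_committed([10, 10, 9, 4, 4], 5, 10))  # False: only 2 of 5
```

The majority requirement is what makes consensus partition-tolerant: any two majorities intersect, so a new leader elected after a failure is guaranteed to have every committed entry.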

Interview Application

When asked about a system design, state the consistency requirement upfront:

  • Financial transactions, seat booking: strong consistency required
  • Social media likes, view counts, shopping carts: eventual consistency acceptable
  • User settings, profile reads: read-your-own-writes minimum
  • Leaderboard rankings: eventual consistency with bounded staleness (seconds)

Key Terms to Know

  • Write-ahead log (WAL): durability mechanism, changes written to log before applying
  • Hinted handoff: Dynamo-style stores buffer writes for temporarily unavailable nodes
  • Read repair: detect and fix stale replicas during read operations
  • Anti-entropy: background process (Merkle tree comparison) to sync replicas
  • Vector clock / Version vector: track causality for conflict detection in multi-leader and leaderless systems
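The version-vector comparison behind that last bullet is short enough to sketch. Two versions are ordered if one vector dominates the other component-wise; otherwise the writes were concurrent and the application must resolve a conflict (the dict-based representation here is an illustrative choice):

```python
# Version-vector comparison: decide whether two updates are causally
# ordered or concurrent. Keys are node ids, values are per-node counters.

def compare(a, b):
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "a happened-before b"
    if b_le_a:
        return "b happened-before a"
    return "concurrent (conflict)"

print(compare({"n1": 2, "n2": 1}, {"n1": 3, "n2": 1}))  # a happened-before b
print(compare({"n1": 2, "n2": 1}, {"n1": 1, "n2": 2}))  # concurrent (conflict)
```

An ordered pair means the newer version can safely overwrite the older; a concurrent pair is exactly the case where Dynamo-style stores return siblings for the client (or a CRDT) to merge.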

