Raft is a distributed consensus algorithm designed to be understandable — a deliberate contrast to Paxos, which is notoriously difficult to reason about and implement correctly. Raft achieves distributed consensus by electing a leader that manages all writes, replicating entries to followers, and committing once a majority has confirmed. Systems like etcd, CockroachDB, TiKV, and Consul use Raft as their consensus layer.
Three Subproblems
Raft decomposes consensus into three largely independent subproblems: leader election (at most one leader can be elected per term), log replication (the leader accepts entries from clients and replicates them to followers), and safety (committed entries survive leader changes). This decomposition is what makes Raft understandable: each subproblem has a clear, self-contained solution.
Leader Election
Time is divided into terms (monotonically increasing integers), and each term begins with a leader election. A follower that hasn't received a heartbeat from the leader within its election timeout (typically randomized in the 150-300 ms range) becomes a candidate: it increments its current term, votes for itself, and sends RequestVote RPCs to all other nodes. A node grants its vote if the candidate's term is at least its own current term, it hasn't already voted for a different candidate in this term, and the candidate's log is at least as up-to-date as its own (this prevents a node missing committed entries from becoming leader). The first candidate to collect votes from a majority becomes leader for that term. Split votes (several candidates dividing the vote so nobody reaches a majority) are resolved by the randomized timeouts: each node waits a random duration before starting the next election, so one candidate usually gets a head start.
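The vote-granting rules above can be sketched as a RequestVote handler. This is a minimal illustration, not a production implementation; the names (`NodeState`, `handle_request_vote`) are invented for this example:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class NodeState:
    current_term: int = 0
    voted_for: Optional[str] = None
    last_log_term: int = 0
    last_log_index: int = 0

def handle_request_vote(node: NodeState, candidate_id: str, term: int,
                        last_log_term: int, last_log_index: int) -> bool:
    """Return True if this node grants its vote to the candidate."""
    if term > node.current_term:
        # Newer term observed: adopt it and forget any vote cast in the old term.
        node.current_term = term
        node.voted_for = None
    if term < node.current_term:
        return False  # stale candidate
    # "Up-to-date" check: compare (last log term, last log index) lexicographically.
    log_ok = (last_log_term, last_log_index) >= (node.last_log_term, node.last_log_index)
    if node.voted_for in (None, candidate_id) and log_ok:
        node.voted_for = candidate_id
        return True
    return False
```

Note that a node votes for at most one candidate per term (first come, first served), which is what makes a majority of votes imply a unique winner.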
Log Replication
The leader is the sole source of log entries, and clients send all writes to it. The leader appends each entry to its own log and sends AppendEntries RPCs to all followers, who append the entry to their local logs and acknowledge. Once a majority (including the leader) has appended the entry, the leader marks it committed and applies it to its state machine. Committed entries are safe: they will not be lost even if the leader crashes. The leader includes its commit index in subsequent AppendEntries RPCs (and heartbeats), and followers apply entries up to that index to their own state machines.
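The leader's commit decision can be sketched as follows. The sketch assumes the leader tracks, per server, the highest index known to be replicated there (the paper calls this matchIndex), and it includes Raft's extra rule that the leader only counts replicas toward commitment for entries from its own term; the function name is invented for this example:

```python
def advance_commit_index(match_index: list[int], commit_index: int,
                         log_term_at: dict[int, int], current_term: int) -> int:
    """Return the leader's new commit index.

    match_index: for every server (leader included), the highest log index
    known to be replicated on that server. An index N becomes committed once
    a majority of servers store it AND the entry at N is from the leader's
    own term; earlier-term entries become committed indirectly.
    """
    for n in sorted(set(match_index), reverse=True):
        if n <= commit_index:
            break  # nothing new to commit
        replicated = sum(1 for m in match_index if m >= n)
        if 2 * replicated > len(match_index) and log_term_at.get(n) == current_term:
            return n
    return commit_index
```

The own-term restriction is Raft's guard against a subtle failure mode where a quorum on an old-term entry can still be overwritten by a later leader.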
Safety: Log Matching Property
Raft's safety guarantee: if a log entry is committed in a given term, it is present in the log of every future leader. This follows from the election rule: a candidate can only win if its log is at least as up-to-date as the logs of a majority of nodes, and any committed entry is, by definition, stored on a majority, so the two majorities must overlap. "Up-to-date" is determined by comparing (last log term, last log index): the higher last term wins; with equal terms, the higher index wins. Any new leader therefore holds every committed entry, so leader transitions cannot lose committed data.
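The "up-to-date" comparison is small enough to show in full (the function name is invented for this sketch):

```python
def log_up_to_date(cand_last_term: int, cand_last_index: int,
                   my_last_term: int, my_last_index: int) -> bool:
    """True if the candidate's log is at least as up-to-date as mine."""
    if cand_last_term != my_last_term:
        return cand_last_term > my_last_term  # higher last term wins
    return cand_last_index >= my_last_index   # same term: longer log wins
```

Note that a longer log does not beat a higher last term: term takes priority because a higher term implies the entry was written by a more recent leader.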
Leader Failover
When a leader fails, followers stop receiving heartbeats and start an election after their election timeout. The new leader is guaranteed by the up-to-date election rule to hold all previously committed entries. Followers, however, may hold uncommitted entries that the old leader replicated to only some of them; where these conflict with the new leader's log, the new leader overwrites them. This is correct: uncommitted entries were never acknowledged to clients or applied to the state machine, so overwriting them loses no acknowledged write.
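The overwrite happens inside the follower's AppendEntries handling. A minimal sketch, assuming the log is a list of (term, command) pairs and `prev_index`/`prev_term` describe the entry immediately preceding the new ones (the function name is invented for this example):

```python
def follower_append_entries(log: list, prev_index: int, prev_term: int,
                            entries: list) -> bool:
    """Apply an AppendEntries RPC to a follower's log.

    The log is a Python list of (term, command) pairs; Raft's 1-based log
    index i lives at list position i - 1. Returns False if the consistency
    check fails, in which case the leader retries with an earlier prev_index
    until the two logs agree.
    """
    # Consistency check: the follower must already hold the preceding entry.
    if prev_index > 0 and (prev_index > len(log) or log[prev_index - 1][0] != prev_term):
        return False
    i = prev_index  # 0-based position of the first new entry
    for term, cmd in entries:
        if i < len(log) and log[i][0] != term:
            del log[i:]  # conflicting (necessarily uncommitted) suffix is discarded
        if i >= len(log):
            log.append((term, cmd))
        # else: entry already present (duplicate RPC); keep it as-is
        i += 1
    return True
```

Because committed entries are on a majority and the new leader's log contains them, the truncated suffix can only ever contain uncommitted entries.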
Performance Characteristics
Raft is designed for strong consistency, not peak throughput. All writes go through a single leader, so write throughput is bounded by the leader's capacity. Read performance can be improved with: (1) lease-based leader reads, where the leader serves reads locally while holding a time-bounded lease, avoiding extra messages at the cost of depending on bounded clock drift; (2) ReadIndex, where the leader records its commit index and confirms it is still leader with one heartbeat round before serving the read; (3) follower reads, where a follower obtains a ReadIndex from the leader and serves the read once its state machine has applied up to it. Write latency is dominated by the majority-replication round trip: typically 1-10 ms within a datacenter, and the inter-site network RTT when the cluster spans datacenters.
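The ReadIndex flow can be sketched as three steps: pin the commit index, prove leadership with one heartbeat round, then wait for the state machine to catch up. This is a schematic, not a real API; `Leader`, `confirm_leadership`, and `apply_next` are placeholders invented for the example:

```python
from dataclasses import dataclass

@dataclass
class Leader:
    commit_index: int   # highest committed log index
    last_applied: int   # highest index applied to the state machine

def serve_read(leader: Leader, confirm_leadership, apply_next):
    """ReadIndex protocol sketch: returns the index the read is safe at,
    or None if leadership could not be confirmed."""
    read_index = leader.commit_index          # 1. pin the current commit index
    if not confirm_leadership():              # 2. heartbeat round acked by a majority?
        return None                           #    stepped down; client retries elsewhere
    while leader.last_applied < read_index:   # 3. catch the state machine up
        apply_next(leader)
    return read_index
```

The heartbeat round is what makes the read linearizable without writing to the log: it rules out the case where a newer leader has already committed entries this node hasn't seen.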