Exactly-once delivery means that every message is processed precisely once: never lost (the failure mode of at-most-once) and never delivered multiple times (the failure mode of at-least-once). True exactly-once is difficult and expensive in distributed systems: the producer, broker, and consumer must all coordinate to prevent duplicates while also preventing message loss. Understanding the spectrum of delivery guarantees is critical for designing reliable data pipelines.
The Three Delivery Semantics
At-most-once: messages may be lost but are never duplicated. Simplest model: fire and forget. Use when message loss is acceptable (UDP metrics, non-critical notifications).
At-least-once: messages are never lost but may be duplicated. The producer retries until acknowledged, and the broker redelivers until the consumer commits, so duplicates must be handled by idempotent consumers. Most messaging systems default to at-least-once.
Exactly-once: no loss, no duplicates. The hardest guarantee, achieved by combining idempotent producers, transactional APIs, and idempotent consumers.
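The difference between the first two semantics can be seen in a toy lossy-channel simulation (a sketch; the `deliver` function and its parameters are invented for illustration): with no retries you get zero or one copies, while retry-until-acked guarantees at least one copy but can produce duplicates when an acknowledgment is lost.

```python
import random

def deliver(drop_rate, retries):
    """Toy lossy channel: each send (and each ack) succeeds with
    probability 1 - drop_rate. Returns how many copies the receiver gets."""
    received = 0
    for _ in range(retries):
        if random.random() >= drop_rate:      # message arrived
            received += 1
            if random.random() >= drop_rate:  # ack arrived: sender stops
                break
            # ack lost: an at-least-once sender retries, duplicating the message
    return received

random.seed(0)
# At-most-once: one attempt, never more than one copy, possibly zero.
assert deliver(drop_rate=0.5, retries=1) in (0, 1)
# At-least-once: enough retries make loss vanishingly unlikely,
# but lost acks cause duplicate deliveries.
copies = [deliver(drop_rate=0.5, retries=50) for _ in range(200)]
assert all(c >= 1 for c in copies)
assert any(c > 1 for c in copies)
```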
Kafka Exactly-Once
Kafka achieves exactly-once through two mechanisms: idempotent producers and transactional APIs.
Idempotent producers: each producer is assigned a producer ID (PID). Each message sent by the producer has a monotonically increasing sequence number per partition. The broker deduplicates based on (PID, partition, sequence_number): if it receives the same sequence number twice (due to producer retry), it acknowledges without writing again. This eliminates duplicate writes from producer retries, providing exactly-once producer semantics without application changes.
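The broker-side dedup check can be sketched as a toy model (class and method names are illustrative, not Kafka internals; the real broker tracks a small window of recent sequence numbers per producer):

```python
class DedupBroker:
    """Toy model of idempotent-producer dedup by (PID, partition, sequence)."""

    def __init__(self):
        self.log = {}        # partition -> appended messages
        self.last_seq = {}   # (pid, partition) -> highest sequence accepted

    def produce(self, pid, partition, seq, msg):
        key = (pid, partition)
        if seq <= self.last_seq.get(key, -1):
            return "ack-duplicate"   # already written: ack without appending
        self.last_seq[key] = seq
        self.log.setdefault(partition, []).append(msg)
        return "ack"

broker = DedupBroker()
broker.produce(pid=7, partition=0, seq=0, msg="order-created")
# The producer times out waiting for the ack and retries the same batch:
status = broker.produce(pid=7, partition=0, seq=0, msg="order-created")
assert status == "ack-duplicate"
assert broker.log[0] == ["order-created"]   # written exactly once
```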
Transactional API: a producer can write atomically to multiple partitions: producer.beginTransaction(); producer.send(topicA, …); producer.send(topicB, …); producer.commitTransaction(). Either all writes commit or none do. The transaction coordinator (a module running on a Kafka broker) manages the two-phase commit across partitions. Consumers configured with isolation.level=read_committed see only messages from committed transactions.
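The visibility rule can be illustrated with a toy model (a sketch only: the real protocol uses a transaction coordinator, transaction markers in the log, and aborted-transaction indexes, none of which are modeled here). Writes are tagged with a transaction ID, and a read_committed reader filters out anything whose transaction has not committed:

```python
import itertools

class TxnBroker:
    """Toy model of atomic multi-partition writes with read_committed reads."""

    def __init__(self):
        self.partitions = {}    # partition name -> list of (txn_id, msg)
        self.committed = set()  # txn ids whose writes are visible
        self._ids = itertools.count()

    def begin(self):
        return next(self._ids)

    def send(self, txn_id, partition, msg):
        self.partitions.setdefault(partition, []).append((txn_id, msg))

    def commit(self, txn_id):
        self.committed.add(txn_id)  # makes all of the txn's writes visible at once

    def read_committed(self, partition):
        return [m for t, m in self.partitions.get(partition, [])
                if t in self.committed]

broker = TxnBroker()
t1 = broker.begin()
broker.send(t1, "topicA", "debit")
broker.send(t1, "topicB", "credit")
# Before commit, a read_committed consumer sees neither write:
assert broker.read_committed("topicA") == []
broker.commit(t1)
# After commit, both writes become visible together:
assert broker.read_committed("topicA") == ["debit"]
assert broker.read_committed("topicB") == ["credit"]
```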
End-to-End Exactly-Once in Stream Processing
For stream processing (Kafka Streams, Flink), exactly-once requires: (1) exactly-once reads from the input topic (consumer commits offset only after processing completes), (2) exactly-once writes to the output topic (transactional producer), and (3) atomic commit of the consumer offset and the producer transaction. Kafka Streams achieves this by committing consumer offsets as part of the producer transaction — both succeed or both fail. Flink checkpoints save state and consumer offsets together; on failure, both are restored to the last checkpoint, replaying messages from that offset without duplicating output if the sink is idempotent or transactional.
Exactly-Once vs. Idempotent Processing
Exactly-once delivery at the infrastructure layer is expensive. An alternative: use at-least-once delivery with idempotent consumers. If processing a message twice produces the same result as processing it once, duplicates are harmless. Design consumers to be naturally idempotent: use UPSERT instead of INSERT (INSERT … ON CONFLICT DO NOTHING), use event_id as a deduplication key, or make processing a pure function with no side effects beyond a database write. In most practical systems, idempotent consumers + at-least-once delivery is simpler and more cost-effective than true exactly-once infrastructure.
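An idempotent handler along these lines, keyed on event_id, might look like the following sketch (the schema is invented; the point is that a redelivered message conflicts on the primary key and writes nothing):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE ledger (event_id TEXT PRIMARY KEY, account TEXT, delta INTEGER)"
)

def handle(event_id, account, delta):
    # INSERT ... ON CONFLICT DO NOTHING makes the handler idempotent:
    # a duplicate delivery hits the event_id primary key and is a no-op.
    db.execute(
        "INSERT INTO ledger VALUES (?, ?, ?) ON CONFLICT(event_id) DO NOTHING",
        (event_id, account, delta),
    )
    db.commit()

handle("evt-42", "alice", 100)
handle("evt-42", "alice", 100)   # at-least-once redelivery of the same event
rows = db.execute("SELECT count(*), sum(delta) FROM ledger").fetchone()
assert rows == (1, 100)          # applied once despite two deliveries
```

With this in place, plain at-least-once delivery gives end-to-end exactly-once *effects*, which is usually what the business logic actually needs.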
Cost of Exactly-Once
Kafka’s exactly-once has real performance costs: transactional produce has ~2x latency vs. non-transactional (extra coordination round trips), throughput decreases because the transaction coordinator is a bottleneck, and idempotent producer sequence tracking uses memory proportional to the number of active partitions. Benchmark: Confluent reports ~20-30% throughput reduction for exactly-once vs. at-least-once at similar latency targets. Evaluate whether the use case truly requires exactly-once or whether idempotent processing at the consumer layer is sufficient — the latter is almost always cheaper.