Event Deduplication System — Low-Level Design
An event deduplication system ensures that duplicate events from at-least-once delivery queues are processed exactly once. It is critical for payment processing, inventory updates, and any operation that must not be applied twice. This design applies to Kafka, SQS, and any message queue used at Amazon, Stripe, and Uber.
The Problem: At-Least-Once Delivery
Most message queues guarantee at-least-once delivery:
- Kafka: a consumer crash before committing offset replays messages
- SQS: if a consumer doesn't delete the message within visibility timeout, it redelivers
- Webhooks: provider retries on timeout or non-2xx response
Result: consumer code may receive the same event_id multiple times.
Any non-idempotent operation (increment counter, debit account, send email)
applied twice causes bugs.
Core Data Model
ProcessedEvent
event_id TEXT NOT NULL -- event's unique identifier
event_type TEXT NOT NULL -- 'order.created', 'payment.charged'
processor_id TEXT NOT NULL -- which consumer group processed it
processed_at TIMESTAMPTZ NOT NULL
result JSONB -- optional: store result for replay
PRIMARY KEY (event_id, processor_id)
-- Partition by month to enable fast cleanup of old records
CREATE TABLE processed_events_2024_01
PARTITION OF ProcessedEvent
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
Basic Deduplication with INSERT ON CONFLICT
def process_event_idempotent(event):
# Attempt to mark the event as processed
rows_inserted = db.execute("""
INSERT INTO ProcessedEvent (event_id, event_type, processor_id, processed_at)
VALUES (%(eid)s, %(etype)s, %(pid)s, NOW())
ON CONFLICT (event_id, processor_id) DO NOTHING
""", {'eid': event.id, 'etype': event.type, 'pid': PROCESSOR_ID}).rowcount
if rows_inserted == 0:
# Already processed — skip
log.info(f'Duplicate event {event.id}, skipping')
return
# Process exactly once
handle_event(event)
Deduplication with Result Caching
# Some consumers need to return the same result on duplicate delivery
# (e.g., payment charge: second call should return the original charge ID)
def process_with_result(event):
# Check if already processed and return cached result
existing = db.execute("""
SELECT result FROM ProcessedEvent
WHERE event_id=%(eid)s AND processor_id=%(pid)s
""", {'eid': event.id, 'pid': PROCESSOR_ID}).first()
if existing:
return existing.result # Replay the original result
# Execute the operation
result = handle_event(event)
# Store result atomically
db.execute("""
INSERT INTO ProcessedEvent (event_id, event_type, processor_id, processed_at, result)
VALUES (%(eid)s, %(etype)s, %(pid)s, NOW(), %(result)s)
ON CONFLICT (event_id, processor_id) DO NOTHING
""", {'eid': event.id, 'etype': event.type, 'pid': PROCESSOR_ID,
'result': json.dumps(result)})
return result
Redis-Based Deduplication (High Throughput)
# For very high event rates, DB writes per event are expensive
# Redis SET NX (set if not exists) is atomic and ~10x faster
def deduplicate_with_redis(event_id, processor_id, ttl_seconds=86400):
"""
Returns True if this is the first time we've seen this event.
Returns False if it's a duplicate.
"""
key = f'processed:{processor_id}:{event_id}'
# SET key value NX EX ttl — atomic, returns True on first set
is_new = redis.set(key, '1', nx=True, ex=ttl_seconds)
return is_new is not None # None = key already existed
def process_event(event):
if not deduplicate_with_redis(event.id, PROCESSOR_ID):
return # Duplicate
handle_event(event)
# For critical events: also write to ProcessedEvent DB table for durability
# Redis key may be evicted under memory pressure
Window-Based Deduplication
# For events with event_id derived from content (not a UUID),
# deduplicate within a time window to handle near-duplicate events
def deduplicate_by_fingerprint(event):
# Fingerprint: hash of event content + time bucket
time_bucket = int(event.timestamp / 300) * 300 # 5-minute buckets
fingerprint = sha256(f'{event.user_id}:{event.action}:{time_bucket}')
key = f'dedup:fp:{fingerprint}'
is_new = redis.set(key, event.id, nx=True, ex=600)
if not is_new:
log.info(f'Deduped event: fingerprint {fingerprint} seen in this window')
return False
return True
Key Interview Points
- INSERT ON CONFLICT DO NOTHING is the canonical pattern: Atomic at the DB level. No race condition between two concurrent workers processing the same event — one wins the INSERT, the other gets rowcount=0.
- Event ID must come from the producer: The consumer should not generate the deduplication key. The producer assigns a stable event_id before publishing so retries use the same ID.
- TTL for cleanup: ProcessedEvent records can be dropped after N days (events older than N days will never be redelivered). Use table partitioning by month and DROP PARTITION for instant cleanup without slow DELETE operations.
- Redis is fast but not durable: Redis keys can be evicted under memory pressure. For financial events, use DB-backed deduplication. For high-throughput analytics events where duplicates are acceptable, Redis alone is sufficient.
Event deduplication and message queue design is discussed in Amazon system design interview questions.
Event deduplication and payment processing design is covered in Stripe system design interview preparation.
Event deduplication and distributed system design is discussed in Uber system design interview guide.