Notification System: Low-Level Design

A notification system delivers messages to users through one or more channels: push notifications, email, SMS, and in-app messages. At scale (billions of notifications per day), the system must fan out to millions of subscribers, handle delivery failures and retries, respect user preferences, and avoid losing or duplicating notifications. Most large consumer products depend on a component like this.

Architecture Overview

A notification system has three layers:

(1) Ingestion API: services publish notification requests (send this message to these users via these channels). The API accepts and validates each request, then writes it to a durable queue.

(2) Router/Fan-out: reads from the queue, expands recipient lists (group notifications to individual users), fetches user channel preferences, and routes each task to a channel-specific worker.

(3) Channel Workers: per-channel processors (push worker, email worker, SMS worker) that call third-party APIs (APNs/FCM for push, SendGrid for email, Twilio for SMS), handle failures, and update delivery status.
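The three layers can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: `collections.deque` stands in for the durable queue, and the names `NotificationRequest`, `IngestionAPI`, `route`, and `VALID_CHANNELS` are assumptions made for the example. Preference lookup and group expansion are left out to keep the flow visible.

```python
from collections import deque
from dataclasses import dataclass, field
import uuid

# Assumption: the set of channels this sketch recognizes.
VALID_CHANNELS = {"push", "email", "sms", "in_app"}


@dataclass
class NotificationRequest:
    recipients: list
    channels: list
    message: str
    notification_id: str = field(default_factory=lambda: str(uuid.uuid4()))


class IngestionAPI:
    """Layer 1: validate the request, then append it to the queue."""

    def __init__(self, queue):
        self.queue = queue  # stand-in for a durable queue

    def publish(self, request):
        if not request.recipients:
            raise ValueError("at least one recipient is required")
        unknown = set(request.channels) - VALID_CHANNELS
        if unknown:
            raise ValueError(f"unknown channels: {sorted(unknown)}")
        self.queue.append(request)
        return request.notification_id


def route(queue, channel_queues):
    """Layer 2: drain the queue and emit one task per (recipient, channel).

    Layer 3 workers would each consume one entry of channel_queues.
    Returns the number of tasks produced.
    """
    produced = 0
    while queue:
        request = queue.popleft()
        for user_id in request.recipients:
            for channel in request.channels:
                task = (request.notification_id, user_id, request.message)
                channel_queues[channel].append(task)
                produced += 1
    return produced
```

Publishing one request for two recipients over two channels produces four channel tasks, which shows why the router, not the ingestion API, is where fan-out cost accumulates.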

Fan-Out Strategies

When sending a notification to all followers of a user (e.g., “User A posted a photo”), the system must fan out to potentially millions of recipients. There are two basic approaches. Fan-out on write (push model): when the event occurs, immediately enqueue a notification task for each recipient. Delivery is fast, but the cost grows with audience size: an account with 1M followers generates 1M writes per event. Fan-out on read (pull model): store the notification once; when each user opens the app, fetch their notifications with a query. Storage is efficient, but delivery is slow because nothing arrives until the user is active. Hybrid: fan out immediately to active users (online, recent session) and defer to read time for inactive users. Twitter and Facebook are commonly cited as using hybrid approaches of this kind.
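The hybrid model can be illustrated with a small in-memory sketch. The class name `HybridFanout` and its fields are hypothetical, and the sketch assumes the set of active users is maintained elsewhere (for example from session data). It also ignores users who switch between active and inactive, which a real system would have to reconcile.

```python
from dataclasses import dataclass, field


@dataclass
class HybridFanout:
    # Assumption: populated elsewhere from presence or session data.
    active_users: set
    # Write path: user_id -> events pushed into that user's inbox.
    inboxes: dict = field(default_factory=dict)
    # Read path: author_id -> events stored once per author.
    author_events: dict = field(default_factory=dict)

    def publish(self, author_id, followers, event):
        """Store the event once, then push only to active followers.

        Returns the number of inbox writes performed.
        """
        self.author_events.setdefault(author_id, []).append(event)
        pushed = 0
        for user_id in followers:
            if user_id in self.active_users:
                self.inboxes.setdefault(user_id, []).append(event)
                pushed += 1
        return pushed

    def read(self, user_id, following):
        """Active users read their inbox; others pull from authors they follow."""
        if user_id in self.active_users:
            return list(self.inboxes.get(user_id, []))
        events = []
        for author_id in following:
            events.extend(self.author_events.get(author_id, []))
        return events
```

With one active and one inactive follower, `publish` performs a single inbox write, yet both users see the event when they call `read`.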

User Preference Management

Users configure which notifications they receive on which channels. Preference schema: user_id, notification_type (new_follower, comment, message), channel (push, email, sms), enabled (boolean), frequency (immediate, daily_digest). The router fetches preferences before routing: if the user disabled email for comments, it skips the email worker. Preferences change infrequently but are read on every notification, so cache them in Redis (user_id → preferences map) with a short TTL (5 minutes). Respect opt-out immediately: a user who unsubscribes must stop receiving that notification type right away, not only once the cached entry expires. To achieve this, invalidate or overwrite the cached entry whenever preferences are written, and treat the TTL as a safety net rather than the primary update path.
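One way to combine a TTL cache with immediate opt-out is to evict the cached entry whenever preferences are written. The sketch below uses an in-process dictionary in place of Redis; `PreferenceCache`, `should_send`, and the `load_from_db` callback are hypothetical names, and preferences are modeled as a map from (notification_type, channel) to a boolean. Treating a missing entry as disabled is an assumption of this example, not something the schema dictates.

```python
import time


class PreferenceCache:
    """TTL cache of user preferences with explicit invalidation."""

    def __init__(self, load_from_db, ttl_seconds=300, clock=time.monotonic):
        self._load = load_from_db      # hypothetical loader: user_id -> dict
        self._ttl = ttl_seconds
        self._clock = clock            # injectable for testing
        self._entries = {}             # user_id -> (expires_at, preferences)

    def get(self, user_id):
        now = self._clock()
        entry = self._entries.get(user_id)
        if entry is not None and entry[0] > now:
            return entry[1]
        prefs = self._load(user_id)
        self._entries[user_id] = (now + self._ttl, prefs)
        return prefs

    def invalidate(self, user_id):
        """Call from the preference write path so opt-outs apply at once."""
        self._entries.pop(user_id, None)


def should_send(cache, user_id, notification_type, channel):
    prefs = cache.get(user_id)
    # Assumption: anything not explicitly enabled is treated as disabled.
    return prefs.get((notification_type, channel), False)
```

Without the `invalidate` call, a user who opts out could keep receiving notifications for up to one TTL; with it, the next lookup reloads the fresh preference.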

Delivery Failure and Retry

Third-party channels fail in different ways, and each calls for a different response: APNs reports that a device is not registered (HTTP 410 with reason Unregistered; the device token is stale, so remove it), SendGrid returns a 429 (rate limited; retry after a delay), Twilio returns a 500 (transient error; retry with backoff). Retry strategy: exponential backoff with jitter, a maximum of 3-5 retries, then route to a dead letter queue for investigation. Track delivery status per notification: pending → sent → delivered, with failed as a terminal state that can be reached from pending or sent. On a failed device token: remove it from the user’s token list to avoid sending to invalid devices (which APNs penalizes). On permanent failure: mark the notification as undeliverable and consider surfacing it as an in-app notification as a fallback.
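The retry policy described above can be sketched as follows. The exception classes, the status strings, and the `send` callback are hypothetical; a real worker would map provider responses (stale token, 429, 5xx) onto the permanent and transient cases. The delay uses the "full jitter" variant, which picks a random value between zero and the exponential bound; other jitter schemes are equally valid.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical: provider said try again later (e.g. 429 or 5xx)."""


class PermanentError(Exception):
    """Hypothetical: retrying cannot help (e.g. stale device token)."""


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random.random):
    """Full jitter: random delay in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def deliver_with_retry(send, task, dead_letters, max_retries=4, sleep=time.sleep):
    """Try once, then retry transient failures up to max_retries times.

    Returns "sent", "undeliverable" (permanent failure), or
    "dead_lettered" (retries exhausted; task appended to dead_letters).
    """
    for attempt in range(max_retries + 1):
        try:
            send(task)
            return "sent"
        except PermanentError:
            return "undeliverable"
        except TransientError:
            if attempt == max_retries:
                break
            sleep(backoff_delay(attempt))
    dead_letters.append(task)
    return "dead_lettered"
```

Passing `sleep` and `rng` as parameters keeps the function testable without real waiting; in production the defaults apply.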

Deduplication

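At-least-once delivery causes duplicate notifications: a user receives “New follower: Alice” twice. Deduplicate with a notification ID: assign a UUID to each notification at creation. Before sending, check a deduplication store (one Redis key per notification ID, with a TTL matching the delivery window): if the notification_id exists, skip. After a successful send, the notification_id must be present in the store. A separate check followed by a later write leaves a window in which two workers can both pass the check, so make the check and the write a single atomic operation (for example Redis SET with the NX and EX options, claiming the ID before sending and deleting the key if the send fails). The TTL should be longer than the maximum retry window (e.g., 24 hours) to prevent duplicates from retry loops. Idempotency keys on third-party API calls, where the provider supports them, prevent duplicates from provider-level retries.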
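A minimal sketch of the deduplication check, using an in-process dictionary guarded by a lock in place of Redis. `DedupStore` and `send_once` are hypothetical names. Note one deliberate variation: instead of checking first and recording after the send, the sketch claims the ID atomically before sending and releases it if the send raises, which closes the gap where two workers both pass the check.

```python
import threading
import time


class DedupStore:
    """Remembers notification IDs for ttl_seconds."""

    def __init__(self, ttl_seconds=24 * 3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable for testing
        self._seen = {}              # notification_id -> expires_at
        self._lock = threading.Lock()

    def claim(self, notification_id):
        """Atomically record the ID; return False if it is already live."""
        with self._lock:
            now = self._clock()
            expires_at = self._seen.get(notification_id)
            if expires_at is not None and expires_at > now:
                return False
            self._seen[notification_id] = now + self._ttl
            return True

    def release(self, notification_id):
        """Forget the ID so a failed send can be retried."""
        with self._lock:
            self._seen.pop(notification_id, None)


def send_once(store, notification_id, send):
    """Send unless this ID was already handled; return True if sent."""
    if not store.claim(notification_id):
        return False
    try:
        send(notification_id)
    except Exception:
        store.release(notification_id)
        raise
    return True
```

The trade-off of claiming first is that a worker crash between the claim and the send suppresses the notification until the TTL expires, so the choice depends on whether a duplicate or a missed notification is the worse outcome.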

Throttling and Rate Limiting

Per-user throttling: limit notifications per user per channel per hour to prevent notification fatigue. Implement with a per-user token bucket: if the bucket is exhausted, drop or aggregate the notification (send a digest “You have 5 new comments” instead of 5 individual notifications). Per-channel global rate limits: push providers such as APNs and FCM may throttle or enforce quotas per app, and email providers such as SendGrid apply limits per account. Track the outbound rate with a sliding window counter and back off when approaching a limit. Notification priority: marketing/promotional notifications have lower priority than transactional ones (order confirmation, password reset), so throttle promotional traffic first when under rate pressure.
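The per-user token bucket with digest aggregation might look like the sketch below. `TokenBucket`, `UserThrottle`, the return values "send" and "aggregated", and the digest wording are all assumptions for illustration. Buckets are kept in memory per (user, channel); a shared deployment would need them in a store that all workers can reach.

```python
import time


class TokenBucket:
    """Holds up to `capacity` tokens, refilled continuously over time."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def try_consume(self):
        now = self._clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class UserThrottle:
    """Allows `capacity` notifications per user and channel per window."""

    def __init__(self, capacity=5, per_seconds=3600, clock=time.monotonic):
        self._capacity = capacity
        self._rate = capacity / per_seconds
        self._clock = clock
        self._buckets = {}   # (user_id, channel) -> TokenBucket
        self._held = {}      # (user_id, channel) -> suppressed messages

    def submit(self, user_id, channel, message):
        key = (user_id, channel)
        if key not in self._buckets:
            self._buckets[key] = TokenBucket(self._capacity, self._rate, self._clock)
        if self._buckets[key].try_consume():
            return "send"
        self._held.setdefault(key, []).append(message)
        return "aggregated"

    def flush_digest(self, user_id, channel):
        """Return a digest line for held messages, or None if there are none."""
        held = self._held.pop((user_id, channel), [])
        if not held:
            return None
        return f"You have {len(held)} new notifications"
```

A scheduler would call `flush_digest` periodically; whether the digest itself should consume a token is a policy choice this sketch leaves open.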
