Low Level Design: Do-Not-Disturb Service

What Is a Do-Not-Disturb Service?

A Do-Not-Disturb (DND) Service suppresses or delays outbound notifications during time windows that a user declares off-limits. Unlike a simple channel mute, DND is time-aware: it must evaluate the user’s local time zone, compare it against one or more configured windows, and decide in real time whether a pending notification should fire, be held in a queue, or be discarded. It sits between the notification router and the actual delivery adapters (email gateway, APNs, FCM, SMS carrier).

Data Model

-- User-defined quiet windows (multiple windows allowed per user)
CREATE TABLE dnd_schedules (
    id          BIGINT PRIMARY KEY AUTO_INCREMENT,
    user_id     BIGINT      NOT NULL,
    label       VARCHAR(64),                        -- e.g. 'Nights', 'Weekends'
    start_time  TIME        NOT NULL,               -- e.g. 22:00:00
    end_time    TIME        NOT NULL,               -- e.g. 07:00:00 (next day)
    days_mask   TINYINT UNSIGNED NOT NULL DEFAULT 127, -- 7-bit weekday mask Mon-Sun; 127 = every day
    timezone    VARCHAR(64) NOT NULL,               -- IANA tz, e.g. America/New_York
    enabled     BOOLEAN     NOT NULL DEFAULT TRUE,
    created_at  TIMESTAMP   NOT NULL DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_user (user_id)
);

-- Holds notifications that arrived during a DND window
CREATE TABLE dnd_held_notifications (
    id              BIGINT PRIMARY KEY AUTO_INCREMENT,
    user_id         BIGINT       NOT NULL,
    notification_id VARCHAR(128) NOT NULL,  -- idempotency key from sender
    payload         JSON         NOT NULL,
    channel         VARCHAR(16)  NOT NULL,
    held_at         TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    release_after   TIMESTAMP    NOT NULL,  -- computed end of DND window
    status          ENUM('held','released','discarded') NOT NULL DEFAULT 'held',
    INDEX idx_release (status, release_after)  -- equality column first for the release poll
);
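
The days_mask column can be illustrated with a small helper. This is a minimal sketch, not part of the schema itself: it assumes bit 0 is Monday and bit 6 is Sunday (which lines up with Python's date.weekday()), and the names make_days_mask and day_enabled are hypothetical.

```python
from datetime import date

# Assumed bit order: bit 0 = Monday ... bit 6 = Sunday.
DAY_NAMES = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")


def make_days_mask(*days: str) -> int:
    """Build a days_mask from day abbreviations, e.g. make_days_mask("Sat", "Sun")."""
    mask = 0
    for day in days:
        mask |= 1 << DAY_NAMES.index(day)
    return mask


def day_enabled(days_mask: int, d: date) -> bool:
    """True if the bit for d's weekday is set (date.weekday() returns 0 for Monday)."""
    return bool(days_mask & (1 << d.weekday()))
```

Under this bit order the schema default of 127 sets all seven bits, so a new schedule applies every day unless the user narrows it.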

Core Algorithm: Evaluating DND

When a notification arrives at the router, the DND service runs dndCheck(userId, channel, notificationPriority):

Load schedules. Fetch all enabled dnd_schedules for the user from cache (TTL 5 min) or DB.
Convert to user local time. Resolve the notification arrival timestamp to the user’s IANA time zone stored on each schedule row.
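Check each window. For each schedule, check whether local time falls within [start_time, end_time) and whether the relevant day bit is set in days_mask. Handle overnight windows (start_time > end_time) as two half-ranges straddling midnight: [start_time, 24:00) on the day the window starts and [00:00, end_time) on the following day. One consistent convention is to test the bit of the day on which the window started, so the after-midnight half checks the previous day’s bit.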
Priority override. If notification priority is CRITICAL (e.g., security alerts, account lockouts), bypass DND entirely and deliver immediately.
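Hold or discard. If a non-critical notification falls inside a window, write a row to dnd_held_notifications with release_after set to the computed window end time (converted to UTC) and return a HELD result to the router. A notification that matches no window passes through for immediate delivery. A held row that is no longer actionable is later marked discarded (see Held notification TTL under Failure Handling).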
Release job. A scheduled worker polls dnd_held_notifications WHERE status='held' AND release_after <= NOW() every minute, re-submits the payload to the delivery adapters, and flips the status to released.
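
The window check in the steps above can be sketched as follows. This is an illustrative sketch under stated assumptions, not a definitive implementation: the function names are hypothetical, bit 0 of days_mask is taken to be Monday, the day bit is taken to refer to the day on which a window starts, and a window with start_time equal to end_time is treated as covering 24 hours. The priority check runs first here simply because it is the cheapest test.

```python
from datetime import date, datetime, time, timedelta
from typing import Optional
from zoneinfo import ZoneInfo


def window_end_if_active(local_now: datetime, start: time, end: time,
                         days_mask: int) -> Optional[datetime]:
    """Return the local datetime at which the window closes if local_now
    falls inside [start, end), otherwise None."""
    today = local_now.date()
    now_t = local_now.time()  # naive wall-clock time in the user's zone

    def bit_set(d: date) -> bool:
        # Assumption: bit 0 = Monday, matching date.weekday().
        return bool(days_mask & (1 << d.weekday()))

    def at(d: date, t: time) -> datetime:
        return datetime.combine(d, t, tzinfo=local_now.tzinfo)

    if start < end:  # same-day window, e.g. 13:00-15:00
        if bit_set(today) and start <= now_t < end:
            return at(today, end)
        return None

    # Overnight window, e.g. 22:00-07:00, split into two half-ranges.
    if bit_set(today) and now_t >= start:        # evening half
        return at(today + timedelta(days=1), end)
    yesterday = today - timedelta(days=1)
    if bit_set(yesterday) and now_t < end:       # after-midnight half
        return at(today, end)
    return None


def dnd_check(now_utc: datetime, priority: str, tz_name: str,
              start: time, end: time, days_mask: int) -> Optional[datetime]:
    """Return the release time for a held notification, or None to deliver now."""
    if priority == "CRITICAL":  # critical notifications bypass DND
        return None
    local_now = now_utc.astimezone(ZoneInfo(tz_name))
    return window_end_if_active(local_now, start, end, days_mask)
```

The returned datetime is in the user's zone; a caller would convert it with astimezone(timezone.utc) before storing it in release_after.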

Failure Handling

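Missing or invalid time zone: The timezone column is NOT NULL, but a stored value can still be empty or not a recognized IANA name. Default to UTC and log a warning. It is better to deliver in an unexpected window than to hold silently forever.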
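
A minimal sketch of the UTC fallback, using the standard-library zoneinfo module. The function name resolve_zone and the logger name are hypothetical.

```python
import logging
from datetime import timezone, tzinfo
from typing import Optional
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

log = logging.getLogger("dnd")


def resolve_zone(name: Optional[str]) -> tzinfo:
    """Return the tzinfo for an IANA zone name, falling back to UTC."""
    if name:
        try:
            return ZoneInfo(name)
        except (ZoneInfoNotFoundError, ValueError):
            # Unknown key, or a malformed key such as a relative path.
            pass
    log.warning("unusable time zone %r; falling back to UTC", name)
    return timezone.utc
```

Falling back to datetime.timezone.utc rather than ZoneInfo("UTC") keeps the fallback working even on hosts where no tz database is installed.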

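Release worker crash: Because release_after is persisted, a restarted worker simply re-queries and picks up any unprocessed rows. Use a row-level SELECT ... FOR UPDATE SKIP LOCKED so that concurrent worker instances never claim the same row. Locking alone does not make delivery exactly-once: a worker that crashes after sending but before committing the status change will re-send on restart, so delivery adapters should deduplicate on notification_id.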
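
One way a worker could claim and release a batch is sketched below. It assumes a DB-API 2.0 connection with autocommit off, a driver that uses the %s placeholder style, and a MySQL version that supports SKIP LOCKED (8.0 or later). The names release_batch and deliver are hypothetical.

```python
from typing import Any, Callable

CLAIM_SQL = (
    "SELECT id, channel, payload FROM dnd_held_notifications "
    "WHERE status = 'held' AND release_after <= NOW() "
    "ORDER BY release_after LIMIT %s FOR UPDATE SKIP LOCKED"
)
MARK_SQL = "UPDATE dnd_held_notifications SET status = 'released' WHERE id = %s"


def release_batch(conn: Any, deliver: Callable[[str, Any], None],
                  batch_size: int = 100) -> int:
    """Claim up to batch_size due rows, deliver them, and mark them released.

    Rows locked by another worker are skipped rather than waited on.
    Returns the number of rows processed in this transaction.
    """
    cur = conn.cursor()
    try:
        cur.execute(CLAIM_SQL, (batch_size,))
        rows = cur.fetchall()
        for row_id, channel, payload in rows:
            deliver(channel, payload)        # hand off to the delivery adapter
            cur.execute(MARK_SQL, (row_id,))
        conn.commit()                        # releases the row locks
        return len(rows)
    except Exception:
        conn.rollback()                      # rows stay 'held' for the next poll
        raise
```

If deliver succeeds and the commit then fails, the rows remain held and are sent again on the next poll, which is why this pattern is at-least-once rather than exactly-once.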

Held notification TTL: Add a discard_after column (e.g., held_at + 24h). A separate cleanup job marks rows discarded once the notification is no longer actionable, preventing the hold table from growing unboundedly.

Schedule cache invalidation: Invalidate the cache key immediately on any schedule write so the next check sees fresh data. If an invalidation is lost, the stale schedule can cause misfires only until the 5-minute TTL expires.
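
The TTL-plus-invalidation behaviour can be sketched with an in-process cache. This is a stand-in for whatever cache the service actually uses; the class name ScheduleCache, the load_from_db callback, and the injectable clock are assumptions made for illustration.

```python
import time
from typing import Callable, Dict, List, Tuple


class ScheduleCache:
    """Per-user schedule cache with a TTL and explicit invalidation."""

    def __init__(self, load_from_db: Callable[[int], List[dict]],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        # user_id -> (expiry timestamp, cached schedules)
        self._entries: Dict[int, Tuple[float, List[dict]]] = {}

    def get(self, user_id: int) -> List[dict]:
        """Return cached schedules, reloading on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(user_id)
        if entry is not None and now < entry[0]:
            return entry[1]
        schedules = self._load(user_id)
        self._entries[user_id] = (now + self._ttl, schedules)
        return schedules

    def invalidate(self, user_id: int) -> None:
        """Call on every schedule write so the next get() reloads."""
        self._entries.pop(user_id, None)
```

The TTL acts as a safety net: a write path that forgets to call invalidate still converges once the entry expires.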

Scalability Considerations

Read path: Every notification incurs a schedule lookup for its recipient. Caching each user’s serialized schedule list in Redis under a key such as dnd:{userId} avoids a database round-trip in the common case and typically keeps lookup latency under 1 ms.

Release worker throughput: Partition the hold table by user_id % N and run N worker shards, each responsible for its partition. Alternatively, use a priority queue (e.g., Redis sorted set scored by release_after) to avoid polling the DB entirely.
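
Both options can be sketched in a few lines. The heap below is only an in-process stand-in for a sorted set scored by release_after, used here so the example runs without a Redis server; shard_for and ReleaseQueue are hypothetical names.

```python
import heapq
from typing import List, Tuple


def shard_for(user_id: int, num_shards: int) -> int:
    """Map a user to the worker shard that owns their held rows."""
    return user_id % num_shards


class ReleaseQueue:
    """Min-heap ordered by release time (epoch seconds)."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, str]] = []

    def hold(self, notification_id: str, release_after: float) -> None:
        heapq.heappush(self._heap, (release_after, notification_id))

    def pop_due(self, now: float) -> List[str]:
        """Remove and return ids whose release time has passed, earliest first."""
        due: List[str] = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[1])
        return due
```

Because the earliest release time is always at the head, a worker can sleep until that time instead of polling on a fixed interval.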

Fan-out: If a marketing blast hits millions of users simultaneously and many are in DND windows, the hold table can see spikes. Pre-compute release buckets during blast scheduling to smooth the write load.

Multi-schedule overlap: Users may configure overlapping windows. The algorithm takes the union: any matching window triggers a hold, and release_after is the maximum end time across all matching windows.
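
The union rule reduces to taking the latest end time among the windows that matched. A minimal sketch, assuming the caller has already computed an end datetime for each matching window and that all of them share the same time zone awareness; the function name is hypothetical.

```python
from datetime import datetime
from typing import Iterable, Optional


def combined_release_after(matching_window_ends: Iterable[datetime]) -> Optional[datetime]:
    """Return the latest end time among matching windows, or None if none matched."""
    return max(matching_window_ends, default=None)
```

For example, if a 22:00-07:00 window and a 06:00-09:00 window both match at 06:30, the notification is held until 09:00.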

Summary

The Do-Not-Disturb Service uses a bitmask-plus-time-range schema to represent flexible quiet windows in user-local time, intercepts non-critical notifications during those windows by persisting them to a held-notifications table, and releases them via a poll-or-queue worker once the window closes. Priority overrides and TTL-based discard rules keep the system safe and bounded under the failure modes described above.

Frequently Asked Questions

What is a Do Not Disturb (DND) service in system design?

A Do Not Disturb service is a component that suppresses or defers notifications during user-defined time windows or device states. It sits in the notification dispatch pipeline and, before delivering any message, checks whether the recipient is currently in a DND period. Suppressed notifications may be dropped, logged, or held in a queue for delivery once DND ends.

How do you implement time-zone-aware DND windows?

Store DND windows in UTC or store the user’s IANA time zone alongside local start/end times, then convert to UTC at query time. When checking whether a notification should be suppressed, convert the current UTC timestamp to the user’s local time and compare against the stored window. Recompute UTC boundaries whenever the user updates their time zone or DND schedule to avoid stale conversions.

How should a DND service handle high-priority or emergency notifications?

Define a priority or urgency level on every notification type. The DND service should expose a configurable bypass policy: critical alerts (security alerts, emergency broadcasts) skip DND checks entirely, while standard and marketing notifications are suppressed. Users may also be given the option to whitelist specific senders or categories that can always break through DND.

What data store is appropriate for DND schedules and how are they queried efficiently?

DND schedules are small per-user records and fit naturally in a key-value store keyed by user ID, allowing O(1) lookups. For large-scale systems, cache the schedule in Redis or a local in-process cache (with TTL-based invalidation) to avoid a database round-trip on every notification. When a user updates their DND settings, publish a cache invalidation event so dispatch workers pick up the change without waiting for TTL expiry.