What Is a Do-Not-Disturb Service?
A Do-Not-Disturb (DND) Service suppresses or delays outbound notifications during time windows that a user declares off-limits. Unlike a simple channel mute, DND is time-aware: it must evaluate the user’s local time zone, compare it against one or more configured windows, and decide in real time whether a pending notification should fire, be held in a queue, or be discarded. It sits between the notification router and the actual delivery adapters (email gateway, APNs, FCM, SMS carrier).
Data Model
-- User-defined quiet windows (multiple windows allowed per user)
CREATE TABLE dnd_schedules (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    user_id BIGINT NOT NULL,
    label VARCHAR(64), -- e.g. 'Nights', 'Weekends'
    start_time TIME NOT NULL, -- e.g. 22:00:00
    end_time TIME NOT NULL, -- e.g. 07:00:00 (next day)
    days_mask TINYINT UNSIGNED NOT NULL DEFAULT 127, -- bitmask Mon-Sun
    timezone VARCHAR(64) NOT NULL, -- IANA tz, e.g. America/New_York
    enabled BOOLEAN NOT NULL DEFAULT TRUE,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_user (user_id)
);
-- Holds notifications that arrived during a DND window
CREATE TABLE dnd_held_notifications (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    user_id BIGINT NOT NULL,
    notification_id VARCHAR(128) NOT NULL, -- idempotency key from sender
    payload JSON NOT NULL,
    channel VARCHAR(16) NOT NULL,
    held_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    release_after TIMESTAMP NOT NULL, -- computed end of DND window
    status ENUM('held','released','discarded') NOT NULL DEFAULT 'held',
    INDEX idx_release (release_after, status)
);
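The schema describes days_mask only as a "bitmask Mon-Sun" and does not fix the bit order. As a minimal sketch, assuming bit 0 is Monday and bit 6 is Sunday (which lines up with Python's date.weekday()), the column default of 127 and a weekend-only mask could be built and tested like this. The helper names make_mask and day_enabled are hypothetical.

```python
from datetime import date

# Assumed convention (not stated in the schema): bit 0 = Monday ... bit 6 = Sunday.
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def make_mask(*days: str) -> int:
    """Build a days_mask value from day abbreviations such as 'Sat', 'Sun'."""
    mask = 0
    for d in days:
        mask |= 1 << DAYS.index(d)
    return mask


def day_enabled(mask: int, d: date) -> bool:
    """True if the weekday of d is set in mask (date.weekday() is 0 for Monday)."""
    return bool((mask >> d.weekday()) & 1)


EVERY_DAY = make_mask(*DAYS)         # all seven bits set, the column default
WEEKENDS = make_mask("Sat", "Sun")   # only bits 5 and 6 set
```

Whatever bit order is chosen, it should be documented next to the column, because the evaluation code and any UI that writes the mask must agree on it.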
Core Algorithm: Evaluating DND
When a notification arrives at the router, the DND service runs dndCheck(userId, channel, notificationPriority):
- Load schedules. Fetch all enabled dnd_schedules rows for the user from cache (TTL 5 min) or DB.
- Convert to user local time. Resolve the notification arrival timestamp to the user’s IANA time zone stored on each schedule row.
- Check each window. For each schedule, check whether the current day bit is set in days_mask and whether local time falls within [start_time, end_time). Handle overnight windows (start > end) by treating them as two half-ranges straddling midnight.
- Priority override. If notification priority is CRITICAL (e.g., security alerts, account lockouts), bypass DND entirely and deliver immediately.
- Hold or discard. For non-critical hits, write a row to dnd_held_notifications with release_after set to the computed window end time. Return a HELD result to the router.
- Release job. A scheduled worker polls dnd_held_notifications WHERE status='held' AND release_after <= NOW() every minute, re-submits the payload to the delivery adapters, and flips the status to released.
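The evaluation steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the service's actual code: Schedule, window_end, and dnd_check are hypothetical names, bit 0 of days_mask is taken to be Monday, and for overnight windows the day bit is taken to refer to the day on which the window starts. The schema specifies neither convention. The sketch also checks priority first, so that a CRITICAL notification never needs a schedule lookup.

```python
from dataclasses import dataclass
from datetime import date, datetime, time, timedelta, timezone
from typing import List, Optional, Tuple
from zoneinfo import ZoneInfo


@dataclass
class Schedule:
    start_time: time
    end_time: time
    days_mask: int      # assumed: bit 0 = Monday ... bit 6 = Sunday
    timezone: str       # IANA name, e.g. "America/New_York"
    enabled: bool = True


def _day_on(mask: int, d: date) -> bool:
    return bool((mask >> d.weekday()) & 1)


def window_end(local_now: datetime, s: Schedule) -> Optional[datetime]:
    """Return the local wall-clock end of the window containing local_now,
    or None if local_now is outside the window. local_now is naive local time."""
    t, today = local_now.time(), local_now.date()
    if s.start_time < s.end_time:
        # Same-day window: [start, end) on a single enabled day.
        if _day_on(s.days_mask, today) and s.start_time <= t < s.end_time:
            return datetime.combine(today, s.end_time)
        return None
    # Overnight window (start >= end): two half-ranges around midnight.
    if t >= s.start_time and _day_on(s.days_mask, today):
        # Evening half: the window started today and ends tomorrow.
        return datetime.combine(today + timedelta(days=1), s.end_time)
    if t < s.end_time and _day_on(s.days_mask, today - timedelta(days=1)):
        # Morning half: the window started yesterday and ends today.
        return datetime.combine(today, s.end_time)
    return None


def dnd_check(now_utc: datetime, schedules: List[Schedule],
              priority: str) -> Tuple[str, Optional[datetime]]:
    """Return ("DELIVER", None) or ("HELD", release_after in UTC)."""
    if priority == "CRITICAL":
        return ("DELIVER", None)
    ends = []
    for s in schedules:
        if not s.enabled:
            continue
        zone = ZoneInfo(s.timezone)
        local = now_utc.astimezone(zone).replace(tzinfo=None)
        end = window_end(local, s)
        if end is not None:
            ends.append(end.replace(tzinfo=zone).astimezone(timezone.utc))
    if ends:
        # Union of overlapping windows: hold until the latest matching end.
        return ("HELD", max(ends))
    return ("DELIVER", None)
```

Keeping window_end as a pure function over naive local time makes the midnight and day-bit edge cases easy to unit test without a time zone database. dnd_check then handles only the conversion to and from UTC.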
Failure Handling
Missing time zone: Default to UTC and log a warning. Better to deliver in an unexpected window than to silently hold forever.
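A minimal sketch of that fallback, using Python's standard zoneinfo module. The function name resolve_zone and the logger name are hypothetical. Treating an empty or unparseable value the same as an unknown zone is an assumption of this sketch.

```python
import logging
from datetime import timezone, tzinfo
from typing import Optional
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

log = logging.getLogger("dnd")  # hypothetical logger name


def resolve_zone(name: Optional[str]) -> tzinfo:
    """Return the tzinfo for an IANA zone name, or UTC if it cannot be resolved."""
    if name:
        try:
            return ZoneInfo(name)
        except (ZoneInfoNotFoundError, ValueError):
            # Unknown key, or a malformed key such as a non-normalized path.
            pass
    log.warning("missing or unknown time zone %r; falling back to UTC", name)
    return timezone.utc
```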
Release worker crash: Because release_after is persisted, a restarted worker simply re-queries and picks up any unprocessed rows. Use a row-level SELECT ... FOR UPDATE SKIP LOCKED to prevent double-delivery when multiple worker instances run.
Held notification TTL: Add a discard_after column (e.g., held_at + 24h). A separate cleanup job marks rows discarded once the notification is no longer actionable, preventing the hold table from growing unboundedly.
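The interaction between release_after and the proposed discard_after can be expressed as a small decision function. This is a sketch only: next_status is a hypothetical helper, the 24-hour TTL comes from the example above, and letting discard win when both deadlines have passed is an assumption.

```python
from datetime import datetime, timedelta

HOLD_TTL = timedelta(hours=24)  # from the "held_at + 24h" example


def discard_deadline(held_at: datetime) -> datetime:
    """Value to store in the proposed discard_after column."""
    return held_at + HOLD_TTL


def next_status(status: str, release_after: datetime,
                discard_after: datetime, now: datetime) -> str:
    """Decide what a worker pass should do with one held row."""
    if status != "held":
        return status          # already released or discarded
    if now >= discard_after:
        return "discarded"     # too old to be actionable (assumed to win)
    if now >= release_after:
        return "released"
    return "held"
```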
Schedule cache invalidation: Invalidate the cache key immediately on any schedule write so the next check sees fresh data. If an invalidation is lost, the stale schedule survives only until the 5-minute TTL expires, so any misfires are confined to that interval.
Scalability Considerations
Read path: Schedule lookups are a per-user, per-notification overhead. A Redis hash per user (keyed dnd:{userId}) holding a serialized schedule list keeps latency under 1 ms for the common case.
Release worker throughput: Partition the hold table by user_id % N and run N worker shards, each responsible for its partition. Alternatively, use a priority queue (e.g., Redis sorted set scored by release_after) to avoid polling the DB entirely.
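The queue-based alternative can be illustrated with an in-memory stand-in. The sketch below uses Python's heapq in place of a Redis sorted set: hold plays the role of ZADD with release_after as the score, and pop_due plays the role of a ZRANGEBYSCORE up to the current time followed by removal. ReleaseQueue and shard_for are hypothetical names, and a real deployment would still need the hold table (or another durable store) as the source of truth.

```python
import heapq
from typing import List, Tuple


class ReleaseQueue:
    """In-memory stand-in for a sorted set scored by release_after (epoch seconds)."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int]] = []

    def hold(self, release_after: float, held_id: int) -> None:
        heapq.heappush(self._heap, (release_after, held_id))

    def pop_due(self, now: float) -> List[int]:
        """Remove and return ids whose release_after <= now, earliest first."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[1])
        return due


def shard_for(user_id: int, n_shards: int) -> int:
    """Worker shard responsible for a user under user_id % N partitioning."""
    return user_id % n_shards
```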
Fan-out: If a marketing blast hits millions of users simultaneously and many are in DND windows, the hold table can see spikes. Pre-compute release buckets during blast scheduling to smooth the write load.
Multi-schedule overlap: Users may configure overlapping windows. The algorithm takes the union: any matching window triggers a hold, and release_after is the maximum end time across all matching windows.
Summary
The Do-Not-Disturb Service uses a bitmask-plus-time-range schema to represent flexible quiet windows in user-local time, intercepts non-critical notifications during those windows by persisting them to a held-notifications table, and releases them via a poll-or-queue worker once the window closes. Priority overrides keep critical alerts flowing, and TTL-based discard rules keep the hold table bounded under the failure conditions described above.