Scheduled Notification Low-Level Design: Future Delivery, Timezone Handling, and Recurring Schedules

A scheduled notification system delivers messages at a future time specified by the user or application logic — appointment reminders, subscription renewal alerts, drip campaign emails. Core challenges: efficient polling or time-based indexing for due notifications, timezone-correct scheduling, handling missed fire times during downtime, and supporting recurring schedules without drift.

Core Data Model

CREATE TYPE notification_status AS ENUM ('pending','processing','sent','failed','cancelled');
CREATE TYPE notification_channel AS ENUM ('push','email','sms','in_app');

CREATE TABLE ScheduledNotification (
    notification_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id         UUID NOT NULL,
    channel         notification_channel NOT NULL,
    template_id     TEXT NOT NULL,
    template_vars   JSONB NOT NULL DEFAULT '{}',
    send_at_utc     TIMESTAMPTZ NOT NULL,           -- always stored in UTC
    status          notification_status NOT NULL DEFAULT 'pending',
    -- Recurrence
    recurrence_rule TEXT,                            -- iCal RRULE, e.g. "FREQ=WEEKLY;BYDAY=MO"
    recurrence_end  TIMESTAMPTZ,
    -- Delivery tracking
    locked_by       TEXT,                            -- worker ID, for lease-based locking
    locked_until    TIMESTAMPTZ,
    sent_at         TIMESTAMPTZ,
    failure_reason  TEXT,
    retry_count     INT NOT NULL DEFAULT 0,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Index that the polling worker uses — partial index on pending only
CREATE INDEX idx_notification_due
    ON ScheduledNotification (send_at_utc ASC)
    WHERE status = 'pending';

-- Track next fire time for recurring notifications separately
CREATE TABLE RecurringSchedule (
    schedule_id     UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id         UUID NOT NULL,
    template_id     TEXT NOT NULL,
    template_vars   JSONB NOT NULL DEFAULT '{}',
    channel         notification_channel NOT NULL,
    rrule           TEXT NOT NULL,          -- FREQ=DAILY;BYHOUR=9;BYMINUTE=0
    user_timezone   TEXT NOT NULL,          -- IANA tz, e.g. "America/New_York"
    next_fire_at    TIMESTAMPTZ NOT NULL,
    is_active       BOOLEAN NOT NULL DEFAULT TRUE,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_recurring_next ON RecurringSchedule (next_fire_at) WHERE is_active = TRUE;
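Because next_fire_at is NOT NULL, an initial fire time must be computed when a RecurringSchedule row is created. A minimal stdlib sketch for the simple "daily at HH:MM local" case (`initial_fire_utc` is an illustrative helper, not part of the schema; a full implementation would evaluate the RRULE):

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

def initial_fire_utc(local_hour: int, local_minute: int,
                     user_tz_name: str, after: datetime) -> datetime:
    """First occurrence of HH:MM wall-clock time in the user's timezone
    at or after `after`, returned in UTC (suitable for next_fire_at)."""
    tz = ZoneInfo(user_tz_name)
    local_now = after.astimezone(tz)
    candidate = local_now.replace(hour=local_hour, minute=local_minute,
                                  second=0, microsecond=0)
    if candidate <= local_now:
        # Wall-clock +1 day; ZoneInfo re-resolves the UTC offset, so this
        # stays at HH:MM local even across a DST transition.
        candidate += timedelta(days=1)
    return candidate.astimezone(timezone.utc)
```

Converting at the last moment like this is what keeps "9am local" correct when the UTC offset changes between now and the fire date.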

Scheduling a Notification

from datetime import datetime
from zoneinfo import ZoneInfo
from uuid import uuid4
import psycopg2.extras   # provides psycopg2.extras.Json for JSONB parameters

def schedule_notification(
    conn,
    user_id: str,
    channel: str,
    template_id: str,
    template_vars: dict,
    send_at_local: datetime,   # naive or timezone-aware datetime
    user_timezone: str = "UTC"  # IANA timezone string
) -> str:
    """
    Accept local time + timezone, convert to UTC before storing.
    This correctly handles DST transitions.
    """
    if send_at_local.tzinfo is None:
        # Localize naive datetime to user's timezone
        tz = ZoneInfo(user_timezone)
        send_at_aware = send_at_local.replace(tzinfo=tz)
    else:
        send_at_aware = send_at_local

    # Convert to UTC for storage
    send_at_utc = send_at_aware.astimezone(ZoneInfo("UTC"))

    notification_id = str(uuid4())
    with conn.cursor() as cur:
        cur.execute(
            """INSERT INTO ScheduledNotification
               (notification_id, user_id, channel, template_id, template_vars, send_at_utc)
               VALUES (%s, %s, %s, %s, %s, %s)""",
            (notification_id, user_id, channel, template_id,
             psycopg2.extras.Json(template_vars), send_at_utc)
        )
    conn.commit()
    return notification_id
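A quick check of the DST behavior this conversion gives: the same wall-clock time in America/New_York maps to different UTC instants on either side of the spring-forward transition (dates below assume the 2026 US transition on March 8):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

ny = ZoneInfo("America/New_York")

# 9am local on the Saturday before spring-forward (EST, UTC-5)
before = datetime(2026, 3, 7, 9, 0, tzinfo=ny).astimezone(timezone.utc)
# 9am local on the Sunday of the transition (EDT, UTC-4)
after = datetime(2026, 3, 8, 9, 0, tzinfo=ny).astimezone(timezone.utc)

print(before.hour, after.hour)  # 14 13: a fixed UTC offset would get one of these wrong
```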

Worker: Polling and Lease-Based Locking

import time, socket, logging
from datetime import datetime, timezone, timedelta

WORKER_ID = socket.gethostname()
LOCK_DURATION_SEC = 60       # worker has 60s to process before lock expires
BATCH_SIZE = 100
POLL_INTERVAL_SEC = 5

def run_notification_worker(conn):
    """
    Polling worker with lease-based locking:
    - Atomically claims a batch of due notifications using SELECT ... FOR UPDATE SKIP LOCKED
    - Sets locked_by and locked_until to prevent concurrent workers from picking the same job
    - Processes each notification; updates status to 'sent' or 'failed'
    """
    while True:
        claimed = claim_due_notifications(conn)
        for notification in claimed:
            process_notification(conn, notification)
        time.sleep(POLL_INTERVAL_SEC)

def claim_due_notifications(conn) -> list[dict]:
    now = datetime.now(timezone.utc)
    lock_until = now + timedelta(seconds=LOCK_DURATION_SEC)

    with conn.cursor() as cur:
        cur.execute("""
            UPDATE ScheduledNotification
            SET status = 'processing',
                locked_by = %s,
                locked_until = %s
            WHERE notification_id IN (
                SELECT notification_id FROM ScheduledNotification
                WHERE status = 'pending'
                  AND send_at_utc <= %s
                ORDER BY send_at_utc ASC
                LIMIT %s
                FOR UPDATE SKIP LOCKED
            )
            RETURNING notification_id, user_id, channel, template_id, template_vars
        """, (WORKER_ID, lock_until, now, BATCH_SIZE))
        rows = cur.fetchall()
    conn.commit()
    cols = ("notification_id", "user_id", "channel", "template_id", "template_vars")
    return [dict(zip(cols, r)) for r in rows]

def process_notification(conn, notification: dict):
    nid = notification["notification_id"]
    try:
        deliver_via_channel(notification)   # provider dispatch (push/email/sms/in-app), not shown
        with conn.cursor() as cur:
            cur.execute("""
                UPDATE ScheduledNotification
                SET status = 'sent', sent_at = NOW(),
                    locked_by = NULL, locked_until = NULL
                WHERE notification_id = %s
            """, (nid,))
        conn.commit()
    except Exception as e:
        logging.exception("delivery failed for %s", nid)
        with conn.cursor() as cur:
            # Retry up to 3 times with a 5-minute backoff, then mark failed
            cur.execute("""
                UPDATE ScheduledNotification
                SET status = CASE WHEN retry_count >= 3 THEN 'failed' ELSE 'pending' END,
                    retry_count = retry_count + 1,
                    failure_reason = %s,
                    locked_by = NULL,
                    locked_until = NULL,
                    send_at_utc = CASE WHEN retry_count < 3 THEN NOW() + interval '5 minutes' ELSE send_at_utc END
                WHERE notification_id = %s
            """, (str(e), nid))
        conn.commit()
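Lease-based locking only works if something sweeps expired leases: a worker that crashes mid-batch leaves its rows stuck in 'processing' forever. A sketch of the sweep, assuming the schema above (run it periodically, e.g. once per poll cycle):

```python
def reclaim_expired_leases(conn) -> None:
    """Return crashed workers' jobs to the queue: any row still 'processing'
    after its lease (locked_until) expired is reset to 'pending' so another
    worker can claim it on the next poll."""
    with conn.cursor() as cur:
        cur.execute("""
            UPDATE ScheduledNotification
            SET status = 'pending', locked_by = NULL, locked_until = NULL
            WHERE status = 'processing' AND locked_until < NOW()
        """)
    conn.commit()
```

The lease duration trades off recovery latency (a crashed worker's jobs wait up to LOCK_DURATION_SEC) against the risk of double-sending if a slow worker is still alive when its lease expires.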

Recurring Notifications

from dateutil.rrule import rrulestr
from zoneinfo import ZoneInfo

def advance_recurring_schedule(conn, schedule_id: str):
    """
    After firing a recurring notification, compute the next fire time
    using the RRULE and update next_fire_at.
    """
    with conn.cursor() as cur:
        cur.execute(
            "SELECT rrule, user_timezone, next_fire_at, recurrence_end FROM RecurringSchedule WHERE schedule_id = %s",
            (schedule_id,)
        )
        row = cur.fetchone()
    if not row:
        return
    rrule_str, user_tz_name, last_fire, end_at = row
    user_tz = ZoneInfo(user_tz_name)

    # Compute occurrences in the user's local timezone so DST shifts are
    # respected (dateutil evaluates the rule against local wall-clock time)
    rule = rrulestr(rrule_str, dtstart=last_fire.astimezone(user_tz))
    next_fire_local = rule.after(last_fire.astimezone(user_tz))

    if next_fire_local is None or (end_at and next_fire_local > end_at):
        # No more occurrences — deactivate
        with conn.cursor() as cur:
            cur.execute("UPDATE RecurringSchedule SET is_active = FALSE WHERE schedule_id = %s", (schedule_id,))
    else:
        next_fire_utc = next_fire_local.astimezone(ZoneInfo("UTC"))
        with conn.cursor() as cur:
            cur.execute(
                "UPDATE RecurringSchedule SET next_fire_at = %s WHERE schedule_id = %s",
                (next_fire_utc, schedule_id)
            )
    conn.commit()
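The piece connecting the two tables is a materializer: when next_fire_at comes due, insert a concrete ScheduledNotification row and advance the schedule. A sketch under the schema above; `advance_fn` is taken as a parameter to keep the example self-contained, and in the real worker it would be advance_recurring_schedule:

```python
def fire_due_recurring(conn, now_utc, advance_fn) -> int:
    """Materialize one ScheduledNotification per due recurring schedule, then
    advance each schedule's next_fire_at via advance_fn(conn, schedule_id).
    Returns how many schedules fired. SKIP LOCKED keeps concurrent workers
    from materializing the same schedule twice."""
    with conn.cursor() as cur:
        cur.execute("""
            SELECT schedule_id, user_id, channel, template_id, template_vars
            FROM RecurringSchedule
            WHERE is_active AND next_fire_at <= %s
            FOR UPDATE SKIP LOCKED
        """, (now_utc,))
        due = cur.fetchall()
    for schedule_id, user_id, channel, template_id, template_vars in due:
        with conn.cursor() as cur:
            cur.execute("""
                INSERT INTO ScheduledNotification
                    (user_id, channel, template_id, template_vars, send_at_utc)
                VALUES (%s, %s, %s, %s, %s)
            """, (user_id, channel, template_id, template_vars, now_utc))
        advance_fn(conn, schedule_id)
    conn.commit()
    return len(due)
```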

Key Interview Points

  • Polling vs push: Polling with SELECT … WHERE send_at_utc <= NOW() is simple and reliable. With the partial index on pending rows, this query is fast even with millions of notifications. Alternative: use pg_notify or Redis sorted sets scored by Unix timestamp — ZRANGEBYSCORE key 0 now_ts to pop due notifications without a polling loop. Redis approach reduces DB load but adds infrastructure complexity.
  • FOR UPDATE SKIP LOCKED: This is the standard pattern for job queues in Postgres. It atomically locks rows being processed and skips rows already locked by other workers, preventing duplicate processing without a separate lock table. Multiple workers can run in parallel safely.
  • Timezone handling: “Send at 9am every Monday” must fire at 9am in the user’s local time — which shifts with DST. Always store in UTC but compute local fire time using the IANA timezone. A naive UTC-offset (+5:30) breaks during DST transitions because offsets change. dateutil.rrule respects DST when given a timezone-aware dtstart.
  • Missed fire recovery: If workers were down for 2 hours, there may be 1,000 overdue notifications. The worker picks them up in send_at_utc order (FIFO), which is typically correct. For time-sensitive notifications (appointment reminders), add a staleness check: if send_at_utc < NOW() – interval ‘1 hour’, skip and mark as ‘failed’ with reason ‘expired’. Configure per-template.
  • Fan-out at scale: For a campaign sending 10M notifications at 8am, insert all 10M rows and let the worker fleet process them. Scale workers horizontally — each worker claims batches of 100 with SKIP LOCKED. If each worker sustains ~100 sends/sec, 100 workers deliver ~10,000 notifications/sec, so 10M notifications take ~1,000 seconds (~17 minutes). Pre-insert the batch days in advance and scale the worker fleet up for campaign windows.
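The Redis alternative from the first bullet can be modeled in a few lines. This in-memory stand-in shows the sorted-set contract (members scored by Unix fire time; pop everything scored at or below now); a real deployment would use ZADD plus ZRANGEBYSCORE and ZREM, ideally combined atomically in a Lua script:

```python
import heapq

class DueQueue:
    """In-memory model of a Redis sorted set used as a delay queue:
    schedule() plays the role of ZADD key fire_ts member, and pop_due()
    plays ZRANGEBYSCORE 0..now followed by ZREM, returning members in
    fire-time order."""
    def __init__(self) -> None:
        self._heap: list[tuple[float, str]] = []

    def schedule(self, notification_id: str, fire_ts: float) -> None:
        heapq.heappush(self._heap, (fire_ts, notification_id))

    def pop_due(self, now_ts: float) -> list[str]:
        due = []
        while self._heap and self._heap[0][0] <= now_ts:
            due.append(heapq.heappop(self._heap)[1])
        return due
```

pop_due returns due IDs oldest-first, mirroring the send_at_utc ASC ordering of the SQL claim query.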

