Overview
A content scheduler lets authors queue posts, stories, social updates, or any content entity for automatic publishing or unpublishing at a future time. The system must handle timezone-correct scheduling, survive worker restarts without double-publishing, support recurring patterns (every Monday, first of month), and catch up on missed schedules when workers come back online after downtime. This LLD covers the data model, scheduler worker mechanics, timezone handling, recurrence rule processing, missed-schedule catch-up, and the key design decisions that keep it reliable in production.
Core Data Model
-- One row per scheduled action on a content entity
CREATE TABLE ScheduledContent (
    schedule_id        BIGSERIAL PRIMARY KEY,
    content_id         UUID NOT NULL,           -- the ID of the content entity
    entity_type        VARCHAR(64) NOT NULL,    -- 'post', 'story', 'product', 'banner'
    entity_id          BIGINT NOT NULL,         -- FK into the relevant table
    action             VARCHAR(32) NOT NULL,    -- 'publish', 'unpublish', 'archive'
    scheduled_at_utc   TIMESTAMPTZ NOT NULL,    -- when to execute (always UTC)
    timezone           VARCHAR(64) NOT NULL,    -- user's timezone at time of scheduling
    local_time_display VARCHAR(32),             -- e.g. "2025-06-01 09:00" for display
    status             VARCHAR(16) NOT NULL DEFAULT 'pending',
    -- pending -> processing -> done | failed | skipped
    recurrence_rule    TEXT,                    -- RRULE string, NULL for one-off
    next_schedule_id   BIGINT REFERENCES ScheduledContent(schedule_id),
    retry_count        SMALLINT NOT NULL DEFAULT 0,
    max_retries        SMALLINT NOT NULL DEFAULT 3,
    last_error         TEXT,
    created_by         BIGINT NOT NULL,
    created_at         TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    executed_at        TIMESTAMPTZ
);

CREATE INDEX idx_sc_pending ON ScheduledContent(scheduled_at_utc)
    WHERE status = 'pending';
CREATE INDEX idx_sc_entity ON ScheduledContent(entity_type, entity_id, status);
-- Execution audit log — every run of a scheduled action
CREATE TABLE ScheduledContentLog (
    log_id             BIGSERIAL PRIMARY KEY,
    schedule_id        BIGINT NOT NULL REFERENCES ScheduledContent(schedule_id),
    entity_type        VARCHAR(64) NOT NULL,
    entity_id          BIGINT NOT NULL,
    action             VARCHAR(32) NOT NULL,
    status             VARCHAR(16) NOT NULL,    -- 'running', 'success', 'failure', 'skipped'
    execution_started  TIMESTAMPTZ NOT NULL,
    execution_finished TIMESTAMPTZ,
    worker_id          VARCHAR(64),             -- hostname:pid of executing worker
    error_message      TEXT,
    created_at         TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
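The status column walks a small state machine: pending when created, processing once a worker claims the row, then done, failed, or skipped, with a retriable failure looping back to pending. A minimal sketch of the allowed transitions as an application-level guard; the dict and helper names are illustrative, not part of the schema:

```python
# Allowed ScheduledContent.status transitions, mirroring the schema comment:
#   'pending' -> 'processing'  when a worker claims the row
#   'pending' -> 'skipped'     by the catch-up maintenance job
#   'processing' -> 'done' | 'failed' | 'pending' (retry with back-off)
TRANSITIONS = {
    "pending": {"processing", "skipped"},
    "processing": {"done", "failed", "pending"},
    "done": set(),
    "failed": set(),
    "skipped": set(),
}

def can_transition(old: str, new: str) -> bool:
    """Return True if a status change is legal under the state machine."""
    return new in TRANSITIONS.get(old, set())

print(can_transition("pending", "processing"))  # True
print(can_transition("done", "pending"))        # False
```

Terminal states (done, failed, skipped) have no outgoing edges, so a bug that tries to revive a finished schedule is caught before it reaches the database.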
Scheduler Worker
Workers poll the ScheduledContent table for due rows. SKIP LOCKED is critical — it lets multiple workers run in parallel without competing for the same rows.
import os
import time
import socket
import psycopg2
from datetime import datetime, timezone, timedelta

WORKER_ID = f"{socket.gethostname()}:{os.getpid()}"
POLL_INTERVAL = 5                      # seconds between polls
BATCH_SIZE = 20                        # rows per poll cycle
CATCH_UP_WINDOW = timedelta(hours=24)  # how far back to catch up missed schedules

def scheduler_worker_loop(db_conn):
    print(f"Scheduler worker started: {WORKER_ID}")
    while True:
        try:
            process_due_schedules(db_conn)
        except Exception as e:
            print(f"Worker error: {e}")
            db_conn.rollback()
        time.sleep(POLL_INTERVAL)
def process_due_schedules(db_conn):
    now = datetime.now(timezone.utc)
    catchup_cutoff = now - CATCH_UP_WINDOW
    with db_conn.cursor() as cur:
        # SKIP LOCKED: skip rows being processed by other workers
        cur.execute("""
            SELECT schedule_id, entity_type, entity_id, action,
                   scheduled_at_utc, recurrence_rule, retry_count, max_retries
            FROM ScheduledContent
            WHERE status = 'pending'
              AND scheduled_at_utc <= %s  -- due now
              AND scheduled_at_utc >= %s  -- don't catch up schedules older than window
            ORDER BY scheduled_at_utc ASC
            LIMIT %s
            FOR UPDATE SKIP LOCKED
        """, (now, catchup_cutoff, BATCH_SIZE))
        rows = cur.fetchall()
        if not rows:
            db_conn.rollback()  # close the transaction; nothing was claimed
            return
        # Mark all fetched rows as 'processing' atomically
        schedule_ids = [r[0] for r in rows]
        cur.execute("""
            UPDATE ScheduledContent
            SET status = 'processing'
            WHERE schedule_id = ANY(%s)
        """, (schedule_ids,))
    db_conn.commit()
    for row in rows:
        execute_schedule(db_conn, *row)
def execute_schedule(db_conn, schedule_id, entity_type, entity_id, action,
                     scheduled_at_utc, recurrence_rule, retry_count, max_retries):
    started_at = datetime.now(timezone.utc)
    log_id = None
    with db_conn.cursor() as cur:
        cur.execute("""
            INSERT INTO ScheduledContentLog
                (schedule_id, entity_type, entity_id, action, status,
                 execution_started, worker_id)
            VALUES (%s, %s, %s, %s, 'running', %s, %s)
            RETURNING log_id
        """, (schedule_id, entity_type, entity_id, action, started_at, WORKER_ID))
        log_id = cur.fetchone()[0]
    db_conn.commit()
    try:
        perform_action(entity_type, entity_id, action)
        with db_conn.cursor() as cur:
            cur.execute("""
                UPDATE ScheduledContent
                SET status='done', executed_at=NOW()
                WHERE schedule_id=%s
            """, (schedule_id,))
            cur.execute("""
                UPDATE ScheduledContentLog
                SET status='success', execution_finished=NOW()
                WHERE log_id=%s
            """, (log_id,))
        db_conn.commit()
        # Schedule next recurrence if applicable
        if recurrence_rule:
            create_next_recurrence(db_conn, schedule_id, recurrence_rule, scheduled_at_utc)
    except Exception as e:
        handle_failure(db_conn, schedule_id, log_id, str(e), retry_count, max_retries)
def perform_action(entity_type, entity_id, action):
    """
    Idempotent content action. Safe to call multiple times — publish on
    an already-published entity is a no-op at the application layer.
    """
    if action == 'publish':
        content_service.publish(entity_type, entity_id)
    elif action == 'unpublish':
        content_service.unpublish(entity_type, entity_id)
    elif action == 'archive':
        content_service.archive(entity_type, entity_id)
    else:
        raise ValueError(f"Unknown action: {action}")
def handle_failure(db_conn, schedule_id, log_id, error_msg, retry_count, max_retries):
    new_retry_count = retry_count + 1
    if new_retry_count <= max_retries:
        # Exponential back-off: retry after 2^retry_count minutes
        retry_delay = timedelta(minutes=2 ** retry_count)
        retry_at = datetime.now(timezone.utc) + retry_delay
        new_status = 'pending'
    else:
        retry_at = None
        new_status = 'failed'
    with db_conn.cursor() as cur:
        cur.execute("""
            UPDATE ScheduledContent
            SET status=%s,
                retry_count=%s,
                last_error=%s,
                scheduled_at_utc=COALESCE(%s, scheduled_at_utc)
            WHERE schedule_id=%s
        """, (new_status, new_retry_count, error_msg, retry_at, schedule_id))
        cur.execute("""
            UPDATE ScheduledContentLog
            SET status='failure', execution_finished=NOW(), error_message=%s
            WHERE log_id=%s
        """, (error_msg, log_id))
    db_conn.commit()
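The back-off arithmetic in handle_failure is easy to sanity-check in isolation. This sketch mirrors the 2^retry_count formula for the default max_retries = 3:

```python
from datetime import timedelta

def retry_delay(retry_count: int) -> timedelta:
    # Mirrors handle_failure: the delay uses the retry count *before*
    # it is incremented, so the first failure (retry_count=0) waits 1 minute.
    return timedelta(minutes=2 ** retry_count)

# Delays for the three retries allowed by max_retries=3, in minutes.
delays = [retry_delay(n).total_seconds() / 60 for n in range(3)]
print(delays)  # [1.0, 2.0, 4.0]
```

After the third failed retry, new_retry_count exceeds max_retries and the row is marked failed with last_error preserved for inspection.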
Timezone Handling
All scheduled_at_utc values are stored in UTC. Timezone conversion happens at schedule creation time — never during execution.
import pytz
from datetime import datetime

def create_scheduled_action(db_conn, entity_type, entity_id, action,
                            local_datetime_str, user_timezone,
                            recurrence_rule=None, created_by=None):
    """
    Accept a naive local datetime string from the UI (e.g. '2025-06-01 09:00')
    and convert to UTC for storage. Display string preserved for UI accuracy.
    """
    tz = pytz.timezone(user_timezone)
    # Parse the naive datetime and localize to the user's timezone
    naive_dt = datetime.strptime(local_datetime_str, '%Y-%m-%d %H:%M')
    local_dt = tz.localize(naive_dt, is_dst=None)  # is_dst=None raises on ambiguous DST times
    utc_dt = local_dt.astimezone(pytz.utc)
    with db_conn.cursor() as cur:
        cur.execute("""
            INSERT INTO ScheduledContent
                (content_id, entity_type, entity_id, action,
                 scheduled_at_utc, timezone, local_time_display,
                 recurrence_rule, status, created_by)
            VALUES (gen_random_uuid(), %s, %s, %s, %s, %s, %s, %s, 'pending', %s)
            RETURNING schedule_id
        """, (entity_type, entity_id, action, utc_dt, user_timezone,
              local_datetime_str, recurrence_rule, created_by))
    db_conn.commit()

# Note on DST: is_dst=None causes pytz to raise pytz.exceptions.AmbiguousTimeError
# for clocks-fall-back ambiguous times (and NonExistentTimeError for times skipped
# by spring-forward). Surface this error to the user and ask them to select the
# pre- or post-DST interpretation explicitly.
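The DST note above can be exercised directly. A small demo, assuming pytz is installed, using the 2025 US fall-back date in America/New_York:

```python
import pytz
from datetime import datetime

tz = pytz.timezone("America/New_York")

# 2025-11-02 01:30 occurs twice: clocks fall back from 02:00 EDT to 01:00 EST.
ambiguous = datetime(2025, 11, 2, 1, 30)

raised = False
try:
    tz.localize(ambiguous, is_dst=None)  # ambiguous wall-clock time: raises
except pytz.exceptions.AmbiguousTimeError:
    raised = True

# After surfacing the error, the user picks an interpretation explicitly:
edt = tz.localize(ambiguous, is_dst=True)   # pre-transition reading, UTC-4
est = tz.localize(ambiguous, is_dst=False)  # post-transition reading, UTC-5

print(raised, edt.utcoffset(), est.utcoffset())
```

The two interpretations differ by a full hour of UTC offset, which is exactly the silent bug that is_dst=None is there to prevent.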
Recurring Rules (RRULE)
Recurring schedules use the RFC 5545 RRULE format, processed with the rrule module from the Python dateutil library. After each successful execution, the worker creates the next occurrence.
from dateutil.rrule import rrulestr
import pytz

def create_next_recurrence(db_conn, parent_schedule_id, rrule_string, last_scheduled_utc):
    """
    Compute the next occurrence after last_scheduled_utc using the RRULE.
    Insert a new ScheduledContent row for the next firing.

    Example RRULE strings:
        FREQ=DAILY                       -- every day
        FREQ=WEEKLY;BYDAY=MO,WE,FR       -- Mon/Wed/Fri
        FREQ=MONTHLY;BYDAY=1MO           -- first Monday of month
        FREQ=WEEKLY;INTERVAL=2;BYDAY=TU  -- every other Tuesday
    """
    with db_conn.cursor() as cur:
        cur.execute("""
            SELECT entity_type, entity_id, action, timezone,
                   local_time_display, recurrence_rule, created_by
            FROM ScheduledContent WHERE schedule_id=%s
        """, (parent_schedule_id,))
        parent = cur.fetchone()
    entity_type, entity_id, action, user_tz, local_time_display, recurrence_rule, created_by = parent

    # Evaluate the rule in UTC to find the next occurrence after the last firing.
    # Caveat: a fixed UTC interval shifts local wall-clock time across DST
    # transitions; if the local hour must be preserved, evaluate the rule in the
    # user's timezone and convert each occurrence to UTC instead.
    rule = rrulestr(rrule_string, dtstart=last_scheduled_utc)
    next_utc_dt = rule.after(last_scheduled_utc)
    if next_utc_dt is None:
        return  # RRULE has no more occurrences (e.g. COUNT or UNTIL exhausted)

    # Convert to the user's timezone for the display string
    tz = pytz.timezone(user_tz)
    next_local = next_utc_dt.astimezone(tz)
    next_local_display = next_local.strftime('%Y-%m-%d %H:%M')

    with db_conn.cursor() as cur:
        cur.execute("""
            INSERT INTO ScheduledContent
                (content_id, entity_type, entity_id, action,
                 scheduled_at_utc, timezone, local_time_display,
                 recurrence_rule, status, created_by)
            VALUES (gen_random_uuid(), %s, %s, %s, %s, %s, %s, %s, 'pending', %s)
            RETURNING schedule_id
        """, (entity_type, entity_id, action, next_utc_dt, user_tz,
              next_local_display, recurrence_rule, created_by))
        new_schedule_id = cur.fetchone()[0]
        # Link parent to child for audit trail
        cur.execute("""
            UPDATE ScheduledContent SET next_schedule_id=%s WHERE schedule_id=%s
        """, (new_schedule_id, parent_schedule_id))
    db_conn.commit()
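A quick check of the RRULE evaluation the worker relies on; the rule string and anchor date here are illustrative (a Mon/Wed/Fri weekly rule evaluated in UTC):

```python
from datetime import datetime, timezone
from dateutil.rrule import rrulestr

# 2025-06-02 13:00 UTC is a Monday; the rule fires Mon/Wed/Fri each week.
last_fired = datetime(2025, 6, 2, 13, 0, tzinfo=timezone.utc)
rule = rrulestr("FREQ=WEEKLY;BYDAY=MO,WE,FR", dtstart=last_fired)

# after() returns the first occurrence strictly later than its argument,
# which is exactly what create_next_recurrence needs.
next_fire = rule.after(last_fired)
print(next_fire)  # 2025-06-04 13:00:00+00:00 (the following Wednesday)
```

Because after() is strict, the occurrence that just executed is never re-inserted, so a recurring schedule advances by exactly one row per successful run.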
Catch-Up for Missed Schedules
Workers use a catch-up window (CATCH_UP_WINDOW = 24h) to process schedules that fired while the system was down. Schedules older than the window are marked skipped to avoid publishing stale content.
-- Mark very old pending schedules as skipped (run as a maintenance job)
UPDATE ScheduledContent
SET status = 'skipped',
    last_error = 'Schedule missed and outside catch-up window'
WHERE status = 'pending'
  AND scheduled_at_utc < NOW() - INTERVAL '24 hours';
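In production this statement typically runs from a small periodic job. A sketch of such a wrapper, assuming the same psycopg2-style connection used by the workers; the function name and row-count metric are illustrative:

```python
# SQL matching the maintenance statement above: skip pending schedules
# that fell outside the 24-hour catch-up window.
SKIP_SQL = """
    UPDATE ScheduledContent
    SET status = 'skipped',
        last_error = 'Schedule missed and outside catch-up window'
    WHERE status = 'pending'
      AND scheduled_at_utc < NOW() - INTERVAL '24 hours'
"""

def skip_stale_schedules(db_conn) -> int:
    """Run the skip statement once; return rows affected for metrics/alerting."""
    with db_conn.cursor() as cur:
        cur.execute(SKIP_SQL)
        skipped = cur.rowcount  # psycopg2 exposes affected-row count here
    db_conn.commit()
    return skipped
```

Emitting the skipped count as a metric is useful: a sudden spike usually means workers were down longer than the catch-up window.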
Key Design Decisions
- UTC storage prevents DST bugs: Storing scheduled_at_utc in UTC means a job scheduled for "9 AM New York" on the day clocks spring forward still fires at the correct wall-clock time. The pytz.localize(is_dst=None) call catches ambiguous fall-back times at creation and forces the user to resolve the ambiguity explicitly, rather than silently choosing the wrong UTC offset.
- RRULE standard for recurrence: RFC 5545 RRULE syntax (used by iCalendar, Google Calendar, and Outlook) covers every recurrence pattern needed — daily, weekly on specific days, nth weekday of month, with COUNT/UNTIL termination — without inventing a custom DSL. The dateutil.rrule library provides battle-tested evaluation. Storing the raw RRULE string means the pattern is portable and auditable.
- SKIP LOCKED for parallel workers: Multiple scheduler worker processes can run simultaneously without coordination overhead. SELECT ... FOR UPDATE SKIP LOCKED makes each worker atomically claim its own set of rows and skip rows already claimed by another worker. This is more efficient and simpler than a Redis-based distributed lock per schedule row.
- Idempotent publish action: The perform_action function is designed so that calling publish on an already-published entity is a safe no-op. This means if a worker crashes after executing the action but before writing status='done', and another worker retries the row, the content is not double-published or corrupted. Idempotency is the cheapest form of exactly-once semantics.
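The idempotency decision can be sketched at the application layer. The Entity class and publish function below are illustrative stand-ins for content_service internals, not the real API:

```python
class Entity:
    """Hypothetical content entity with a publication status."""
    def __init__(self, status: str = "draft"):
        self.status = status

def publish(entity: Entity) -> bool:
    """Idempotent publish: returns True only when state actually changed."""
    if entity.status == "published":
        # Safe no-op: a retried worker run after a crash changes nothing.
        return False
    entity.status = "published"
    return True

e = Entity()
print(publish(e))  # True  (draft -> published)
print(publish(e))  # False (second call is a no-op)
```

The check-then-set here is the whole trick: a worker that re-executes a claimed row after a crash hits the no-op branch, so retries cost nothing and never corrupt state.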
FAQ
Q: How does SKIP LOCKED enable concurrent scheduler workers?
A: Multiple workers query for due rows with SELECT ... FOR UPDATE SKIP LOCKED; each worker atomically claims unlocked rows and skips rows already locked by another worker, preventing double-processing.
Q: How are recurring schedules represented?
A: RRULE strings (RFC 5545) are stored per scheduled item; dateutil.rrule parses them to compute the next occurrence after each successful execution.
Q: How is DST ambiguity handled during scheduling?
A: pytz's localize() with is_dst=None raises AmbiguousTimeError for ambiguous wall-clock times (e.g. 1:30 AM during fall-back); the error is surfaced so the user explicitly picks the pre- or post-DST interpretation.
Q: What is the catch-up window?
A: If workers were down, they process schedules up to 24 hours overdue; older missed schedules are marked skipped by a maintenance job to avoid publishing stale content.