Digest Scheduler Low-Level Design: Aggregation Window, Priority Scoring, and Delivery Timing

A digest scheduler aggregates individual notifications that occurred within a time window into a single summary delivery. It reduces notification noise for users who would otherwise receive dozens of individual emails per day, while ensuring high-priority items still surface prominently and timezone-aware delivery hits inboxes at the right local time.

Requirements

Functional

  • Buffer notifications for users who have opted into digest mode.
  • Aggregate buffered items within a configurable window (hourly, daily).
  • Score and rank items by priority so the most important appear first.
  • Deliver digests at a timezone-appropriate time (e.g., 8 AM local).
  • Respect global channel unsubscribes and skip empty digests.

Non-Functional

  • Digest delivery within 5 minutes of the scheduled window close.
  • Support 10 million digest-enabled users across 500+ timezones.
  • At-least-once delivery with deduplication on the rendering side.

Data Model

digest_buffers(
  id               BIGSERIAL PRIMARY KEY,
  user_id          BIGINT,
  channel          ENUM(email, push),
  notification_type VARCHAR(100),
  payload          JSONB,
  priority_score   FLOAT,
  buffered_at      TIMESTAMPTZ,
  digest_window    VARCHAR(50),  -- e.g. daily, hourly
  delivered        BOOLEAN DEFAULT FALSE
)

digest_schedules(
  user_id          BIGINT,
  channel          ENUM(email, push),
  window           VARCHAR(50),
  delivery_time    TIME,         -- local time
  timezone         VARCHAR(60),
  PRIMARY KEY (user_id, channel)
)

digest_deliveries(
  id               BIGSERIAL PRIMARY KEY,
  user_id          BIGINT,
  channel          ENUM(email, push),
  window_start     TIMESTAMPTZ,
  window_end       TIMESTAMPTZ,
  item_count       INT,
  delivered_at     TIMESTAMPTZ,
  idempotency_key  VARCHAR(200) UNIQUE
)

Core Algorithms

Priority Scoring

Each buffered notification receives a priority score at ingestion time. The score combines a type weight (e.g., direct mention = 1.0, system alert = 0.9, marketing = 0.2) with a recency decay factor: score = type_weight * exp(-lambda * hours_since_event). Lambda controls decay speed. At digest assembly, items are sorted descending by score. The top N items are rendered in full; remaining items are collapsed into a count summary. This ensures the digest always leads with the most relevant content regardless of arrival order.
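The scoring formula above can be sketched as follows. The type weights, the default lambda, and the fallback weight for unknown types are illustrative assumptions, not values from a real configuration:

```python
import math
from datetime import datetime, timezone

# Illustrative type weights (assumed values, matching the examples in the text).
TYPE_WEIGHTS = {"direct_mention": 1.0, "system_alert": 0.9, "marketing": 0.2}

def priority_score(notification_type: str, event_time: datetime,
                   now: datetime, decay_lambda: float = 0.05) -> float:
    """score = type_weight * exp(-lambda * hours_since_event)."""
    hours = max((now - event_time).total_seconds() / 3600.0, 0.0)
    weight = TYPE_WEIGHTS.get(notification_type, 0.5)  # assumed fallback weight
    return weight * math.exp(-decay_lambda * hours)
```

Because the score is computed at ingestion and again decays with time, a variant that stores only `(type_weight, event_time)` and computes the score at assembly time avoids stale scores for long windows.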

Timezone-Aware Scheduling

The delivery scheduler runs every minute and queries for users whose local delivery time falls within the current UTC minute. Rather than scanning millions of rows per tick, the service precomputes a delivery UTC timestamp for every user at midnight UTC and stores it in a delivery index table. The query becomes a simple range scan: WHERE next_delivery_utc BETWEEN NOW() AND NOW() + INTERVAL '1 minute'. After delivery, the next UTC timestamp is computed by converting the user's local delivery time for the next window to UTC and updating the index row.
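A minimal sketch of the next-delivery computation, assuming Python's zoneinfo and a stored (delivery_time, timezone) pair as in digest_schedules. Converting via the IANA zone lets the library resolve DST offsets instead of applying a fixed one; behavior inside a spring-forward gap is left to zoneinfo's defaults:

```python
from datetime import datetime, time, timedelta
from zoneinfo import ZoneInfo

def next_delivery_utc(local_delivery: time, tz_name: str,
                      after_utc: datetime) -> datetime:
    """Next UTC instant at which the user's local clock reads local_delivery."""
    tz = ZoneInfo(tz_name)
    local_now = after_utc.astimezone(tz)
    candidate = local_now.replace(hour=local_delivery.hour,
                                  minute=local_delivery.minute,
                                  second=0, microsecond=0)
    if candidate <= local_now:
        candidate += timedelta(days=1)  # today's slot already passed; use tomorrow
    # astimezone re-derives the UTC offset for the candidate's date,
    # so a DST boundary between now and then is handled correctly.
    return candidate.astimezone(ZoneInfo("UTC"))
```

For example, a user in America/New_York with an 8 AM delivery time gets 13:00 UTC in winter (EST) and 12:00 UTC in summer (EDT) from the same stored preference.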

Aggregation and Assembly

When a user's digest fires, the service fetches all undelivered buffer rows for that user/channel, scores and sorts them, and renders a template from the top items. It writes a digest_deliveries row with an idempotency key of {user_id}:{channel}:{window_start} before dispatching. If the dispatch fails and is retried, the unique idempotency key prevents a second delivery row and thus a duplicate send. After successful dispatch, buffer rows are marked delivered = TRUE in a single batch UPDATE.
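The assembly step can be sketched as below; the item shape, the top_n default, and the function name are hypothetical stand-ins for the fetched buffer rows and rendering input:

```python
def assemble_digest(user_id: int, channel: str, window_start: str,
                    items: list[dict], top_n: int = 10) -> dict:
    """Score-sorted assembly with an idempotency key.

    `items` stands in for undelivered buffer rows; each dict is assumed
    to carry at least a `priority_score` field.
    """
    ranked = sorted(items, key=lambda it: it["priority_score"], reverse=True)
    return {
        "idempotency_key": f"{user_id}:{channel}:{window_start}",
        "rendered": ranked[:top_n],                       # shown in full
        "collapsed_count": max(len(ranked) - top_n, 0),   # "and N more" summary
    }
```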

API Design

  • POST /v1/digest/buffer — internal; add a notification to the digest buffer for a user.
  • GET /v1/digest/preview/{userId} — returns what the next digest would contain; used in settings UI.
  • PUT /v1/digest/schedule/{userId}/{channel} — set delivery time and window for a user.
  • POST /v1/digest/deliver/{userId}/{channel} — force immediate delivery; used for testing or on-demand flush.
  • GET /v1/digest/history/{userId} — paginated list of past digest deliveries with item counts.

Scalability and Fault Tolerance

The delivery index table shards by user ID. The scheduler runs as multiple parallel workers, each owning a shard range. A distributed lock (Redis SETNX with TTL) prevents two workers from firing the same user digest simultaneously. The buffer table is partitioned by buffered_at date; old partitions are dropped after 30 days, keeping the table size bounded even for users who never read their digests.

For users with very large buffers (thousands of notifications per day), the assembly step caps the fetch at 500 rows ordered by priority_score DESC before rendering, bounding memory usage and rendering time.

Interview Tips

  • Clarify whether a user who unsubscribes mid-window should receive the digest for already-buffered items — safest answer is no, check opt-out at delivery time, not buffer time.
  • Discuss empty digest suppression: skip delivery if the buffer has zero items to avoid confusing blank emails.
  • Mention that the preview API is valuable for product trust — users can see what they will receive before committing to digest mode.
  • The priority decay function lambda should be tunable per notification type; a breaking-news alert decays slower than a social like.

Frequently Asked Questions

How is aggregation window configuration handled in a digest scheduler?

Each digest job carries a configurable window (e.g., last 24 hours, last 7 days) stored as an (anchor_time, duration) pair. The scheduler resolves the window at execution time relative to the job's scheduled_at timestamp so late runs don't silently drop or double-count events. Window boundaries are stored in UTC and converted to the user's timezone only for display, keeping aggregation logic timezone-neutral.

How is content priority scoring implemented for digest emails?

A scoring function assigns each candidate item a numeric score based on signals such as recency (exponential decay), engagement rate of similar items, explicit user interest tags, and notification importance tier. Items are ranked by score and the top N are selected up to a byte budget. Scores can be precomputed and stored during ingestion so the digest assembly step is a simple sorted read rather than an online ML inference call.

How does a digest scheduler handle timezone-aware delivery timing?

Each user record stores a preferred delivery time (HH:MM) and an IANA timezone string. The scheduler converts that to a UTC epoch for the next occurrence at job-creation time and stores it as next_run_utc. A global cron wakes every minute, queries for jobs where next_run_utc <= now(), enqueues them, and advances next_run_utc by the recurrence interval using the user's timezone to handle DST transitions correctly rather than adding a fixed offset.

How should a digest scheduler handle unsubscribe requests?

Unsubscribe must be honored promptly per the RFC 8058 one-click mechanism and must survive race conditions with in-flight jobs. The preferred pattern is a suppression list keyed by (user_id, channel): the scheduler checks the list immediately before enqueuing and skips the job if the user appears. Jobs already in the queue check the list again at send time. The suppression entry is created synchronously on unsubscribe, and the List-Unsubscribe-Post header in every email points to the one-click endpoint.

