Low Level Design: Digest Email Service

What Is a Digest Email Service?

A Digest Email Service batches individual notification events that would otherwise generate separate emails into a single, well-formatted summary delivered on a schedule the user controls (daily, weekly, or custom cadence). It trades immediacy for readability: instead of thirty individual comment-notification emails from a busy day, the user gets one email with a ranked, deduplicated list. The service must collect events, group and rank them, render a template, and deliver exactly once even under retries and worker crashes.

Data Model

-- One row per event that should appear in a future digest
CREATE TABLE digest_events (
    id              BIGINT PRIMARY KEY AUTO_INCREMENT,
    user_id         BIGINT       NOT NULL,
    category        VARCHAR(64)  NOT NULL,   -- e.g. 'comment', 'like', 'mention'
    entity_type     VARCHAR(64)  NOT NULL,   -- e.g. 'post', 'thread'
    entity_id       BIGINT       NOT NULL,
    actor_id        BIGINT,                  -- who triggered the event
    payload         JSON         NOT NULL,   -- arbitrary metadata for template rendering
    created_at      TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    digest_run_id   BIGINT,                  -- set when included in a sent digest
    INDEX idx_user_unsent (user_id, digest_run_id),
    INDEX idx_created (created_at)
);

-- One row per digest delivery attempt
CREATE TABLE digest_runs (
    id              BIGINT PRIMARY KEY AUTO_INCREMENT,
    user_id         BIGINT      NOT NULL,
    frequency       ENUM('daily','weekly') NOT NULL,
    scheduled_for   TIMESTAMP   NOT NULL,
    started_at      TIMESTAMP,
    completed_at    TIMESTAMP,
    status          ENUM('pending','processing','sent','failed','skipped') NOT NULL DEFAULT 'pending',
    event_count     INT,
    INDEX idx_scheduled (scheduled_for, status)
);

Core Algorithm: Building and Sending a Digest

A scheduler (cron or event-driven clock) creates digest_runs rows in bulk at the start of each frequency period. A pool of workers then processes them:

  1. Claim a run. A worker issues SELECT ... WHERE status='pending' AND scheduled_for <= NOW() FOR UPDATE SKIP LOCKED LIMIT 1, then, in the same transaction, sets status='processing' and started_at=NOW().
  2. Load unsent events. Query digest_events WHERE user_id=? AND digest_run_id IS NULL ORDER BY created_at ASC. This captures everything since the last digest.
  3. Skip if empty. If no events, set status='skipped'. No email is sent; the user is not bothered.
  4. Group and rank. Group by (category, entity_type, entity_id) to deduplicate (e.g., ten likes on the same post become one line). Sort groups by event count descending so the most active items appear first.
  5. Render template. Pass the ranked groups to a Mustache/Handlebars template that produces an HTML email. Inline CSS for email-client compatibility.
  6. Send. Call the transactional email gateway (SES, SendGrid). On HTTP 200, proceed; on error, throw and let the worker retry.
  7. Commit. In a single transaction: set digest_run_id on all included events, mark the run status='sent', record completed_at and event_count.
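Step 4 (group and rank) is the heart of the run and is easy to isolate as a pure function. A minimal sketch in Python, assuming each digest_events row arrives as a dict whose keys mirror the schema above (the function name is illustrative):

```python
from collections import defaultdict

def group_and_rank(events):
    """Deduplicate events into (category, entity_type, entity_id) groups
    and rank groups by event count, most active first (step 4)."""
    groups = defaultdict(list)
    for ev in events:
        groups[(ev["category"], ev["entity_type"], ev["entity_id"])].append(ev)
    # Sort groups by descending event count so the busiest items lead.
    ranked = sorted(groups.items(), key=lambda kv: -len(kv[1]))
    return [
        {"category": k[0], "entity_type": k[1], "entity_id": k[2],
         "count": len(evs), "events": evs}
        for k, evs in ranked
    ]

# Ten likes on post 7 collapse to one line and outrank a single comment.
events = ([{"category": "like", "entity_type": "post", "entity_id": 7}] * 10
          + [{"category": "comment", "entity_type": "post", "entity_id": 9}])
digest = group_and_rank(events)
```

The same grouping could be pushed into SQL with a GROUP BY/COUNT(*) query; doing it in the worker keeps the raw events available for template rendering without a second fetch.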

Failure Handling

Worker crash mid-run: The run stays in processing. A watchdog job queries for runs that have been in processing for more than N minutes and resets them to pending. Because events are only marked with digest_run_id after a confirmed send, a re-run picks up the same events and produces the same email, making the operation effectively idempotent.
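The watchdog reduces to a predicate over digest_runs rows. A sketch, assuming rows are loaded as dicts with datetime fields; the N-minute timeout is a parameter here, and the caller is expected to issue the UPDATE back to 'pending':

```python
from datetime import datetime, timedelta

def stuck_run_ids(runs, now, timeout_minutes=30):
    """Return ids of runs stuck in 'processing' longer than the timeout.
    The watchdog resets these to 'pending' so another worker can reclaim them."""
    cutoff = now - timedelta(minutes=timeout_minutes)
    return [r["id"] for r in runs
            if r["status"] == "processing"
            and r["started_at"] is not None
            and r["started_at"] < cutoff]

now = datetime(2024, 6, 1, 12, 0)
runs = [
    {"id": 1, "status": "processing", "started_at": now - timedelta(minutes=45)},
    {"id": 2, "status": "processing", "started_at": now - timedelta(minutes=5)},
    {"id": 3, "status": "sent",       "started_at": now - timedelta(hours=2)},
]
stuck = stuck_run_ids(runs, now)
```

Only run 1 qualifies: run 2 is within the timeout and run 3 already completed. The timeout should comfortably exceed the worst-case legitimate run duration, or the watchdog will steal runs that are merely slow.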

Duplicate send risk: The gateway call and the DB commit are not atomic. If the send succeeds but the commit fails, the next retry will re-send. Mitigate by passing an idempotency key (the digest_run_id) to the gateway; providers that support idempotency keys typically deduplicate on them within a bounded window (often 24 hours), so confirm the exact behavior for your gateway.

Event volume spikes: If a user accumulates tens of thousands of events between digests, paginate the load in step 2 and cap the rendered count (e.g., top 50 items) with a footer link to the full activity feed.
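Capping the rendered list is a small pure step on top of the ranked groups. A sketch, assuming the groups are already ranked and the cap of 50 is a tunable parameter:

```python
def cap_groups(ranked_groups, limit=50):
    """Keep the top `limit` groups for rendering and report how many were
    cut, so the template can show a 'see N more in your activity feed' footer."""
    shown = ranked_groups[:limit]
    overflow = max(0, len(ranked_groups) - limit)
    return shown, overflow

# 120 ranked groups: render the top 50, link out to the remaining 70.
groups = [{"entity_id": i, "count": 100 - i} for i in range(120)]
shown, overflow = cap_groups(groups, limit=50)
```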

User preference change mid-window: If a user switches from daily to weekly after a run is already scheduled, cancel the pending daily run and create a weekly one. A grace period (e.g., 15 minutes) avoids cancelling a run that is already processing.

Scalability Considerations

Thundering herd: All daily-frequency users share the same scheduled time. Stagger scheduled_for by offsetting each user's send time by user_id % 3600 seconds within the hour to spread the load.
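The stagger is a deterministic per-user offset, so a user's digest lands at the same minute every day. A sketch (the function name is illustrative; the 3600-second window matches the text):

```python
from datetime import datetime, timedelta

def staggered_schedule(user_id, window_start):
    """Spread users evenly across the hour: each user gets a fixed
    offset of (user_id % 3600) seconds from the window start."""
    return window_start + timedelta(seconds=user_id % 3600)

base = datetime(2024, 6, 1, 9, 0, 0)
t = staggered_schedule(user_id=7250, window_start=base)  # offset 7250 % 3600 = 50s
```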

Worker parallelism: SKIP LOCKED allows any number of workers to pull independent runs without contention. Scale workers horizontally to keep the queue drain time well under the scheduling interval.

Template rendering cost: Rendering HTML for large event sets is CPU-bound. Offload to a dedicated render service or cache rendered fragments by group signature when the same entity appears in many users' digests (common for popular posts).

Database growth: Archive or delete digest_events rows older than 90 days. The digest_run_id column makes it easy to identify rows that have already been included in a sent digest and are safe to purge.

Summary

The Digest Email Service decouples event ingestion from delivery by writing raw events to an append-only table and processing them in scheduled, idempotent batch runs. Grouping and ranking logic keeps emails scannable even for high-volume users; SKIP LOCKED enables horizontal worker scaling; and the send-then-mark pattern combined with gateway idempotency keys prevents both missed and duplicate deliveries.

Frequently Asked Questions

What is a digest email service and why is it used?

A digest email service batches individual events or notifications into a single periodic email (e.g., daily or weekly) rather than sending one email per event. This reduces inbox noise for users and lowers send volume and cost for the platform. It is commonly used for activity summaries, content recommendations, and social-network update roll-ups.

How do you aggregate events for digest emails at scale?

Events are written to a time-series or append-only store (e.g., a Kafka topic or a partitioned database table) keyed by user ID and event type. A scheduled job or stream processor reads all events for each user within the digest window, ranks or deduplicates them according to business rules, and assembles the digest payload. For very large user bases, the aggregation job is partitioned by user ID range and run in parallel.

How do you handle user preferences and unsubscribes in a digest email system?

Before rendering and sending each digest, the service checks the user's notification preferences to confirm they are opted into digest emails and to determine the desired frequency (daily, weekly, never). Unsubscribe links in the email must trigger a synchronous update to the preferences store and should take effect before the next scheduled digest run. Honor List-Unsubscribe headers so email clients and ISPs can surface one-click unsubscribe.

What are common failure modes in a digest email service and how do you handle them?

Common failure modes include: the aggregation job timing out for users with very large event histories (mitigate with per-user caps and pagination), the email provider being unavailable at send time (mitigate with a retry queue and exponential back-off), and duplicate sends caused by job restarts (mitigate with idempotency keys stored per user per digest window). Monitor bounce and complaint rates and suppress sending to addresses that hard-bounce.

