Email Delivery Service Low-Level Design: SMTP Routing, Bounce Handling, and Reputation Management

Email Sending Flow

The end-to-end path from application event to recipient inbox:

  1. Application calls the email service API with recipient, template ID, and template variables
  2. Email service renders the HTML template with the provided variables
  3. Rendered email is validated (valid address format, content passes spam scoring)
  4. Email is routed to the appropriate ESP (Email Service Provider) based on email type and recipient domain
  5. ESP delivers via SMTP to the recipient's MX server
  6. Delivery status (accepted, bounced, deferred) returned asynchronously via webhook
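
Steps 1 and 3 above can be sketched as a request object plus a pre-send validation pass. This is only an illustration: `SendRequest`, `validate`, and the field names are hypothetical, and the address pattern is a deliberately loose format check, not a full RFC 5322 parser. Spam scoring is left out.

```python
import re
from dataclasses import dataclass, field

# Loose syntax check (assumption): one "@", no whitespace, a dot in the domain.
_ADDRESS_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


@dataclass
class SendRequest:
    """What the calling application hands to the email service (step 1)."""
    recipient: str
    template_id: str
    variables: dict = field(default_factory=dict)
    email_type: str = "transactional"  # or "marketing"


def validate(request: SendRequest) -> list[str]:
    """Return a list of problems; an empty list means the request may proceed."""
    problems = []
    if not _ADDRESS_RE.match(request.recipient):
        problems.append("invalid recipient address format")
    if request.email_type not in ("transactional", "marketing"):
        problems.append("unknown email type")
    if not request.template_id:
        problems.append("missing template ID")
    return problems
```

Returning a list of problems instead of raising on the first one lets the API report every issue with a request in a single response.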

ESP Selection and Routing

Using a single ESP for all email creates a single point of failure and mixes reputations. Route by email type:

  • Transactional email (password reset, order confirmation, 2FA codes): route through the primary ESP (SES or SendGrid). These must arrive promptly and reliably because the user is actively waiting.
  • Marketing email (newsletters, promotions): route through a secondary ESP (Mailgun, Brevo). The reputation of this stream is inherently lower, so keep it from contaminating transactional deliverability.
  • Failover: if the primary ESP's API returns a 5xx server error (an HTTP error from the provider, not an SMTP 5xx rejection of the recipient), retry via the secondary. Implement a circuit breaker: if the primary fails 10 consecutive sends, route to the secondary until the primary recovers.
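
The failover rule can be sketched as a small circuit breaker wrapped around two send functions. Everything named here (`CircuitBreaker`, `EspError`, `send_with_failover`) is hypothetical, and the cooldown after which the primary is probed again is an assumption: the text only says "until primary recovers" and does not specify how recovery is detected.

```python
import time


class EspError(Exception):
    """Raised by a hypothetical ESP client on a server-side (5xx) failure."""


class CircuitBreaker:
    def __init__(self, failure_threshold=10, cooldown_seconds=300.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds  # assumed recovery probe interval
        self._clock = clock
        self._consecutive_failures = 0
        self._opened_at = None  # None means the breaker is closed

    def allow(self) -> bool:
        """True if the primary may be tried (closed, or cooldown has elapsed)."""
        if self._opened_at is None:
            return True
        return self._clock() - self._opened_at >= self.cooldown_seconds

    def record_success(self) -> None:
        self._consecutive_failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._consecutive_failures += 1
        if self._consecutive_failures >= self.failure_threshold:
            # Open (or re-open after a failed probe) and restart the cooldown.
            self._opened_at = self._clock()


def send_with_failover(message, primary_send, secondary_send, breaker):
    """Try the primary unless the breaker is open; fall back to the secondary."""
    if breaker.allow():
        try:
            result = primary_send(message)
            breaker.record_success()
            return ("primary", result)
        except EspError:
            breaker.record_failure()
    return ("secondary", secondary_send(message))
```

Counting consecutive failures, not a failure rate, matches the rule as stated; a single successful send resets the count.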

SMTP Reputation Management

Inbox placement depends heavily on the reputation of the sending IP address and domain:

  • Dedicated IPs for transactional email: shared IPs mean your deliverability is affected by other senders on the same IP. Dedicated IPs are under your control.
  • Separate IP pools: use different IP pools for transactional vs. marketing. A complaint spike from a marketing campaign does not penalize transactional IPs.
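  • Complaint rate: Gmail's Postmaster Tools reports the spam complaint rate. Google's sender guidelines ask senders to stay below 0.1% and never reach 0.3%; above 0.1% expect more mail to be filtered to spam, and at 0.3% or higher expect bulk blocking. Monitor continuously.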
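
As a small worked example of the thresholds above, the following sketch turns raw counts into a rate and a status. The function names and the "ok" / "warning" / "critical" labels are illustrative assumptions, not part of any provider's API.

```python
def complaint_rate(complaints: int, delivered: int) -> float:
    """Complaints as a fraction of delivered messages (0.0 if nothing was delivered)."""
    return complaints / delivered if delivered else 0.0


def classify_complaint_rate(rate: float,
                            warn_above: float = 0.001,      # 0.1%
                            critical_above: float = 0.003,  # 0.3%
                            ) -> str:
    """Map a complaint rate onto the two thresholds discussed in the text."""
    if rate > critical_above:
        return "critical"
    if rate > warn_above:
        return "warning"
    return "ok"
```

For instance, 20 complaints on 10,000 delivered messages is a rate of 0.2%, which falls between the two thresholds.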

Bounce Handling

Not every sent email reaches an inbox. Bounces fall into two categories:

  • Hard bounce: permanent delivery failure — address does not exist, domain does not exist, or the receiving server permanently rejected the address. Action: immediately set email_valid = false in the user record and add to the suppression list. Never retry a hard-bounced address.
  • Soft bounce: temporary failure (mailbox full, receiving server temporarily unavailable, message too large). Action: retry with increasing backoff (1 hour, 4 hours, 24 hours); after 72 hours without delivery, treat as a hard bounce.

ESPs deliver bounce notifications via webhook (for example, on SES: SNS → Lambda → DB update); if you run your own MTA, bounces arrive as SMTP delivery status notifications (DSNs). Process these in near-real-time: sending to invalid addresses at high volume is the fastest way to destroy IP reputation.
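
The hard/soft rules can be sketched as a single decision function that a webhook handler might call. `BounceDecision` and `handle_bounce` are hypothetical names. One detail is an assumption: the text does not say what happens between the 24-hour retry and the 72-hour cutoff, so this sketch keeps reusing the last delay.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional

RETRY_DELAYS = [timedelta(hours=1), timedelta(hours=4), timedelta(hours=24)]
GIVE_UP_AFTER = timedelta(hours=72)


@dataclass
class BounceDecision:
    action: str                           # "suppress" or "retry"
    retry_in: Optional[timedelta] = None  # set only when action == "retry"


def handle_bounce(bounce_type: str, retries_done: int,
                  since_first_attempt: timedelta) -> BounceDecision:
    """Decide what to do with one bounce notification."""
    if bounce_type == "hard":
        # Permanent failure: suppress immediately, never retry.
        return BounceDecision("suppress")
    if since_first_attempt >= GIVE_UP_AFTER:
        # Soft bounces that persist for 72 hours are treated as hard.
        return BounceDecision("suppress")
    # Assumption: once the schedule is exhausted, keep using the last delay.
    delay = RETRY_DELAYS[min(retries_done, len(RETRY_DELAYS) - 1)]
    return BounceDecision("retry", delay)
```

A caller would act on "suppress" by setting email_valid = false and adding the address to the suppression list, and on "retry" by re-enqueueing the message with the returned delay.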

Complaint Handling

When a recipient marks a message as spam in their email client, the provider forwards a complaint notification via a Feedback Loop (FBL). For each notification:

  • Parse the complaint notification to extract the recipient's email address
  • Add the address to the suppression list with reason complaint
  • Stop all marketing emails to this address immediately — transactional emails (security alerts, receipts) may continue if the user has a valid account

Suppression List

The suppression list is the most critical data structure in the email service. Before every send, regardless of caller or email type, check the suppression list. Each entry records one of these reasons:

  • hard_bounce — address permanently invalid
  • complaint — user marked as spam
  • unsubscribe — user clicked unsubscribe link (CAN-SPAM / GDPR legal requirement)
  • manual — support team suppressed for other reasons

Store the suppression list in a Redis set for O(1) lookup, and persist it in the DB as the source of truth. Sending to a suppressed address is a compliance violation and a deliverability risk, so this check is non-negotiable.
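
A minimal sketch of the pre-send check follows, with an in-memory dict standing in for Redis and the DB. `SuppressionList`, `Reason`, and `may_send` are hypothetical names. Two policy choices are assumptions: lowercasing addresses before lookup, and letting unsubscribe behave like complaint (marketing blocked, transactional allowed) while hard_bounce and manual block everything. Only the complaint behavior comes from the text.

```python
from enum import Enum


class Reason(str, Enum):
    HARD_BOUNCE = "hard_bounce"
    COMPLAINT = "complaint"
    UNSUBSCRIBE = "unsubscribe"
    MANUAL = "manual"


# Assumption: these reasons stop transactional mail too; the rest stop marketing only.
BLOCKS_TRANSACTIONAL = frozenset({Reason.HARD_BOUNCE, Reason.MANUAL})


class SuppressionList:
    def __init__(self):
        # normalized address -> set of reasons (stand-in for Redis plus the DB)
        self._entries = {}

    @staticmethod
    def _normalize(address: str) -> str:
        return address.strip().lower()

    def add(self, address: str, reason: Reason) -> None:
        self._entries.setdefault(self._normalize(address), set()).add(reason)

    def may_send(self, address: str, email_type: str) -> bool:
        """Check that must run before every send, for every email type."""
        reasons = self._entries.get(self._normalize(address), set())
        if not reasons:
            return True
        if email_type == "marketing":
            return False  # any suppression reason stops marketing mail
        return not (reasons & BLOCKS_TRANSACTIONAL)
```

Keeping the reason with each entry is what allows a complaint to stop marketing mail while security alerts and receipts continue.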

SPF, DKIM, and DMARC

Three DNS-based mechanisms authenticate outgoing email and protect against spoofing:

  • SPF (Sender Policy Framework): a DNS TXT record lists the IP addresses authorized to send email from your domain. Receiving servers check whether the sending IP is in the list. Mismatches are flagged as suspicious. Example: v=spf1 include:amazonses.com ~all
  • DKIM (DomainKeys Identified Mail): the ESP signs selected headers and a hash of the body of each outgoing message with a private key. The corresponding public key is published in DNS. The receiving server verifies the signature, which shows that the signed content was not modified in transit and that the signing domain authorized the message.
  • DMARC (Domain-based Message Authentication, Reporting, and Conformance): a policy published in DNS that tells receiving servers what to do with emails that fail authentication, meaning neither SPF nor DKIM passes for a domain aligned with the From address: p=none (monitor only), p=quarantine (send to spam folder), p=reject (reject outright). DMARC also enables aggregate reporting: typically daily XML reports showing authentication pass/fail rates across all receiving domains.

All three must be configured correctly before sending at volume. Gmail and Yahoo now require DMARC alignment for bulk senders.
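
To make the DMARC policy tags concrete, here is a rough sketch that parses a DMARC TXT record and applies it to one message. The function names and the example record are illustrative. It encodes the rule that a message passes DMARC when at least one of SPF or DKIM passes with alignment, and it ignores optional tags such as `pct` and `sp`, so it is not a complete evaluator.

```python
def parse_dmarc_record(txt: str) -> dict:
    """Split 'v=DMARC1; p=quarantine; ...' into a tag -> value dict."""
    tags = {}
    for part in txt.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def dmarc_disposition(record_txt: str, spf_aligned_pass: bool,
                      dkim_aligned_pass: bool) -> str:
    """Return 'pass', or the policy the receiver is asked to apply on failure."""
    tags = parse_dmarc_record(record_txt)
    if tags.get("v") != "DMARC1":
        return "no-policy"  # not a DMARC record
    if spf_aligned_pass or dkim_aligned_pass:
        return "pass"       # one aligned pass is enough
    return tags.get("p", "none")


# Hypothetical record for example.com (the report address is an assumption).
EXAMPLE_RECORD = "v=DMARC1; p=quarantine; rua=mailto:dmarc-reports@example.com"
```

With `EXAMPLE_RECORD`, a message that passes aligned DKIM but fails SPF still passes, while one that fails both is reported as "quarantine".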

Email Template Rendering

Templates are stored in a template service with versioning. Rendering pipeline:

  • Template engine: Jinja2 (Python), Handlebars (Node.js), or Twig (PHP) renders HTML with provided variables
  • CSS inlining: email clients do not support <style> tags reliably. Tools like Premailer convert CSS rules to inline style attributes before sending.
  • Plain text fallback: always include a text/plain MIME part — improves deliverability and serves users who prefer plain text email clients
  • Cross-client testing: use Litmus or Email on Acid to render previews across 90+ email clients (Gmail, Outlook 2019, Apple Mail, Yahoo) before deploying a new template
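
The rendering and plain-text fallback steps can be sketched with the Python standard library alone. Here `string.Template` is a stand-in for a real engine such as Jinja2, `build_message` is a hypothetical helper, and CSS inlining is omitted.

```python
import html
from email.message import EmailMessage
from string import Template


def build_message(sender: str, recipient: str, subject: str,
                  text_template: str, html_template: str,
                  variables: dict) -> EmailMessage:
    """Render both bodies and return a multipart/alternative message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject

    # text/plain part first: the fallback for clients that do not render HTML.
    msg.set_content(Template(text_template).substitute(variables))

    # Escape variables before placing them into HTML.
    safe = {key: html.escape(str(value)) for key, value in variables.items()}
    msg.add_alternative(Template(html_template).substitute(safe), subtype="html")
    return msg
```

Calling `set_content` and then `add_alternative` yields a message whose parts are ordered text/plain, then text/html, so clients that understand HTML pick the last and richest alternative.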

IP Warm-Up and Deliverability Monitoring

IP warm-up: a new sending IP has no reputation. Inbox providers treat mail from unknown IPs with suspicion. Warm up gradually:

  • Day 1: 100 emails to your most engaged users (low unsubscribe/complaint rate)
  • Double volume each week: 200 → 400 → 800 → …
  • Full volume achieved after 4–6 weeks

Sending large volume immediately from a new IP results in bulk filtering or blacklisting that can take weeks to reverse.
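
The ramp can be written out as a short schedule generator. `warmup_schedule` and its parameters are illustrative assumptions; the step length (a week in the list above) is left to the caller.

```python
def warmup_schedule(start: int = 100, factor: int = 2,
                    target: int = 10_000) -> list[int]:
    """Daily send volume for each warm-up step, capped at the target volume."""
    volumes = []
    volume = start
    while volume < target:
        volumes.append(volume)
        volume *= factor
    volumes.append(target)  # final step: full volume
    return volumes


print(warmup_schedule(100, 2, 1000))  # [100, 200, 400, 800, 1000]
```

After n doublings the volume is start × 2^n, so starting from 100 a day, six doublings reach 6,400 a day. A much larger target within 4–6 weeks therefore needs a shorter step than one week or a larger starting volume.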

Deliverability monitoring:

  • Gmail Postmaster Tools: complaint rate, domain reputation, delivery errors — free, requires DNS verification
  • MxToolbox: check the sending IP against 100+ blacklists
  • Seed testing: send to seed inboxes at major providers (Gmail, Outlook, Yahoo) and measure inbox vs. spam placement rate before major campaigns

