A bulk email system sends millions of marketing or transactional emails to user segments, tracking delivery, opens, clicks, and unsubscribes. The core engineering challenges are not about sending email — any SMTP client can do that — but about throughput (sending 10M emails without overwhelming the sending infrastructure), deliverability (not getting blacklisted), and compliance (honoring unsubscribes immediately and accurately).
Core Data Model
CREATE TABLE Campaign (
campaign_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name VARCHAR(255) NOT NULL,
subject VARCHAR(500) NOT NULL,
html_body TEXT NOT NULL,
text_body TEXT,
from_email VARCHAR(255) NOT NULL,
status VARCHAR(20) NOT NULL DEFAULT 'draft',
-- draft, scheduled, sending, paused, completed, cancelled
scheduled_at TIMESTAMPTZ,
segment_query JSONB NOT NULL, -- criteria for selecting recipients
total_recipients INT,
sent_count INT NOT NULL DEFAULT 0,
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE CampaignRecipient (
id BIGSERIAL PRIMARY KEY,
campaign_id UUID NOT NULL REFERENCES Campaign(campaign_id),
user_id BIGINT NOT NULL,
email VARCHAR(255) NOT NULL,
status VARCHAR(20) NOT NULL DEFAULT 'pending',
-- pending, sent, bounced, failed
sent_at TIMESTAMPTZ,
tracking_id UUID NOT NULL DEFAULT gen_random_uuid() -- for open/click tracking
);
CREATE INDEX idx_cr_campaign_status ON CampaignRecipient(campaign_id, status)
WHERE status = 'pending';
CREATE TABLE EmailEvent (
event_id BIGSERIAL PRIMARY KEY,
tracking_id UUID NOT NULL,
event_type VARCHAR(20) NOT NULL, -- delivered, opened, clicked, bounced, unsubscribed
metadata JSONB,
occurred_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE Unsubscribe (
email VARCHAR(255) PRIMARY KEY,
user_id BIGINT,
reason VARCHAR(100),
created_at TIMESTAMPTZ DEFAULT NOW()
);
Recipient Materialization and Sending Pipeline
def start_campaign(campaign_id: str):
campaign = db.fetchone("SELECT * FROM Campaign WHERE campaign_id=%s", [campaign_id])
# Step 1: Materialize recipients (runs once at campaign start)
# Evaluate the segment_query to get the list of recipients
users = evaluate_segment(campaign['segment_query'])
# Exclude unsubscribes and bounced addresses
unsubscribed = set(db.fetchall_scalar(
"SELECT email FROM Unsubscribe"
))
hard_bounced = set(db.fetchall_scalar(
"SELECT DISTINCT email FROM EmailEvent WHERE event_type='bounced'"
))
eligible = [u for u in users if u['email'] not in unsubscribed
and u['email'] not in hard_bounced]
# Insert recipient rows in batches
for batch in chunk(eligible, 1000):
db.executemany("""
INSERT INTO CampaignRecipient (campaign_id, user_id, email)
VALUES (%s, %s, %s)
""", [(campaign_id, u['id'], u['email']) for u in batch])
db.execute("""
UPDATE Campaign SET status='sending', total_recipients=%s
WHERE campaign_id=%s
""", [len(eligible), campaign_id])
# Enqueue the send job
send_queue.enqueue('send_campaign_batch', campaign_id=campaign_id)
Throttled Sending Worker
def send_campaign_batch(campaign_id: str, batch_size: int = 100):
"""
Called repeatedly until all recipients are sent.
Throttled to respect ESP rate limits and avoid spam triggers.
"""
MAX_SENDS_PER_SECOND = 50 # respect ESP (SendGrid/SES) rate limits
while True:
recipients = db.fetchall("""
SELECT id, email, tracking_id FROM CampaignRecipient
WHERE campaign_id = %s AND status = 'pending'
ORDER BY id ASC
LIMIT %s
FOR UPDATE SKIP LOCKED
""", [campaign_id, batch_size])
if not recipients:
db.execute("UPDATE Campaign SET status='completed' WHERE campaign_id=%s",
[campaign_id])
break
# Re-check unsubscribes for late opt-outs
emails = [r['email'] for r in recipients]
unsubscribed = set(db.fetchall_scalar(
"SELECT email FROM Unsubscribe WHERE email = ANY(%s)", [emails]
))
for recipient in recipients:
if recipient['email'] in unsubscribed:
db.execute("UPDATE CampaignRecipient SET status='skipped' WHERE id=%s",
[recipient['id']])
continue
send_email(
to=recipient['email'],
campaign_id=campaign_id,
tracking_id=recipient['tracking_id']
)
db.execute("UPDATE CampaignRecipient SET status='sent', sent_at=NOW() WHERE id=%s",
[recipient['id']])
time.sleep(batch_size / MAX_SENDS_PER_SECOND) # throttle
Open and Click Tracking
# Open tracking: embed a 1x1 pixel image in the HTML email
#
# Click tracking: rewrite all links through a redirect service
# Original: https://example.com/product
# Rewritten: https://track.example.com/click/{tracking_id}?url=https://example.com/product
def handle_tracking_event(event_type: str, tracking_id: str, metadata: dict = None):
db.execute("""
INSERT INTO EmailEvent (tracking_id, event_type, metadata)
VALUES (%s, %s, %s)
""", [tracking_id, event_type, json.dumps(metadata or {})])
# For unsubscribe clicks: add to Unsubscribe table immediately
if event_type == 'unsubscribed':
recipient = db.fetchone(
"SELECT email, user_id FROM CampaignRecipient WHERE tracking_id=%s",
[tracking_id]
)
if recipient:
db.execute("""
INSERT INTO Unsubscribe (email, user_id, reason)
VALUES (%s, %s, 'user_request')
ON CONFLICT (email) DO NOTHING
""", [recipient['email'], recipient['user_id']])
Key Interview Points
- Materialize recipients at campaign start, not at send time — the segment query result may change during a multi-hour send. The recipient list must be a snapshot.
- Check unsubscribes again at send time (not just at materialization) — users may unsubscribe after the campaign starts. Sending to a recently-unsubscribed user is a CAN-SPAM/GDPR violation.
- Hard bounces must be permanently suppressed — re-sending to bounced addresses damages your sender reputation and can get your domain blacklisted by ISPs.
- Throttle sending to stay within ESP limits (SES: 14/second on sandbox, up to thousands/second on production; SendGrid: configurable). Exceeding limits triggers 429 errors and delays.
- Deduplicate recipients: a user with two email addresses who matches the segment twice should receive the email once. Deduplicate on email address before materialization.
- The tracking_id links events back to a specific campaign send — never use user_id or email directly in tracking URLs (PII exposure in web server logs).
{“@context”:”https://schema.org”,”@type”:”FAQPage”,”mainEntity”:[{“@type”:”Question”,”name”:”Why must campaign recipients be materialized at send time rather than evaluated continuously?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”The segment query (e.g., "all users who signed up more than 30 days ago") returns a different result set every minute. If the recipient list is evaluated dynamically during a multi-hour send, users who join the segment mid-send will receive the email, and users who were in the segment at campaign start but left it mid-send may be skipped or double-counted. Materializing recipients at campaign start creates a snapshot: a fixed list of user_id + email pairs that does not change regardless of how long sending takes. This makes the send deterministic, auditable, and resumable — if the send pauses, the pending recipient rows are still there to pick up from.”}},{“@type”:”Question”,”name”:”How do you handle late unsubscribes during a multi-hour email send?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Materialize the unsubscribe list at campaign start (excluding known unsubscribers from the CampaignRecipient table). But also re-check the Unsubscribe table for each recipient batch at send time: SELECT email FROM Unsubscribe WHERE email = ANY(:batch_emails). This catches users who unsubscribed after materialization but before their batch was sent. Mark those recipients as status=’skipped’ rather than ‘sent’. This two-check approach satisfies CAN-SPAM and GDPR requirements: unsubscribes must be honored within 10 business days under CAN-SPAM, but best practice is to honor within minutes. Sending to a post-unsubscribe email is a legal violation regardless of when the campaign was scheduled.”}},{“@type”:”Question”,”name”:”How do you track email opens without compromising user privacy?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Open tracking works by embedding a 1×1 transparent pixel image in the HTML email body: <img src="https://track.example.com/open/{tracking_id}">. When the email client loads images, it fetches the pixel, hitting your tracking endpoint which logs the open event. Privacy considerations: (1) iOS Mail Privacy Protection (iOS 15+) pre-fetches all images in emails, making every email appear opened regardless of user action — open rates from iOS users are inflated and unreliable. (2) Many email clients block image loading by default — a significant percentage of opens go untracked. Use click tracking (more reliable, requires user action) as the primary engagement signal, and treat open rates as directional rather than exact metrics.”}},{“@type”:”Question”,”name”:”How do you prevent getting blacklisted while sending millions of emails?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Deliverability best practices: (1) Warm up IP addresses gradually — new sending IPs should start at 100 emails/day, doubling weekly until reaching full volume. ISPs flag sudden high volume from new IPs as spam. (2) Maintain sender reputation — monitor bounce rates (<2%), spam complaint rates (<0.1%). High bounce rates indicate dirty lists; high complaint rates trigger ISP blacklisting. (3) Hard bounce suppression — permanently remove bounced addresses and never send to them again. (4) SPF, DKIM, DMARC authentication — authenticate your sending domain so ISPs can verify you are the legitimate sender. (5) List hygiene — remove inactive addresses (no opens or clicks in 6 months) before large sends. (6) Unsubscribe link in every email — required by CAN-SPAM, and missing it increases complaint rates.”}},{“@type”:”Question”,”name”:”How do you resume a paused or failed campaign send without duplicate sends?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”The CampaignRecipient table tracks per-recipient status: pending, sent, skipped, failed. The sending worker queries WHERE status = ‘pending’ and processes only those rows. If the campaign is paused mid-send, pending rows remain in the table. On resume, the worker picks up exactly where it left off — already-sent recipients have status=’sent’ and are excluded. If a batch worker crashes mid-batch after sending some emails but before updating status to ‘sent’, those recipients may receive a duplicate on retry. Mitigate: update status to ‘sent’ atomically using FOR UPDATE SKIP LOCKED within the same transaction as the send (mark as ‘sending’ before the API call, then ‘sent’ after). Idempotency keys with the ESP API (SendGrid/SES) prevent duplicate delivery even on retry.”}}]}
Bulk email campaign system and marketing automation design is discussed in Shopify system design interview questions.
Bulk email campaign and user engagement system design is covered in LinkedIn system design interview preparation.
Bulk email campaign and host communication system design is discussed in Airbnb system design interview guide.