Low Level Design: Email Template Engine

Overview

An email template engine is a system for defining, versioning, rendering, and testing email content at scale. It must support variable substitution, multi-locale content, a rendering pipeline that produces valid HTML and plain-text alternatives, and A/B testing of template variants. The design must work for both transactional emails (order confirmation, password reset) sent individually and marketing emails sent to millions of recipients with personalized content.

Data Model

CREATE TABLE template_groups (
    id          INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    name        VARCHAR(128) NOT NULL,
    category    ENUM('transactional','marketing','notification') NOT NULL,
    owner_team  VARCHAR(64),
    created_at  DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    UNIQUE KEY uq_name (name)
) ENGINE=InnoDB;

CREATE TABLE templates (
    id              INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    group_id        INT UNSIGNED NOT NULL,
    version         SMALLINT UNSIGNED NOT NULL DEFAULT 1,
    locale          VARCHAR(16) NOT NULL DEFAULT 'en',
    subject         VARCHAR(512) NOT NULL,
    html_body       MEDIUMTEXT NOT NULL,
    text_body       TEXT,
    variables       JSON COMMENT 'schema: {name: {type, required, default}}',
    layout_id       INT UNSIGNED COMMENT 'FK to layouts',
    is_active       TINYINT(1) NOT NULL DEFAULT 0,
    created_by      INT UNSIGNED,
    created_at      DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    UNIQUE KEY uq_group_version_locale (group_id, version, locale),
    INDEX idx_group_active (group_id, locale, is_active)
) ENGINE=InnoDB;

CREATE TABLE layouts (
    id          INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    name        VARCHAR(128) NOT NULL,
    html_body   MEDIUMTEXT NOT NULL COMMENT 'contains {{content}} placeholder',
    text_body   TEXT,
    is_active   TINYINT(1) NOT NULL DEFAULT 1,
    UNIQUE KEY uq_name (name)
) ENGINE=InnoDB;

CREATE TABLE ab_experiments (
    id              INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    group_id        INT UNSIGNED NOT NULL,
    name            VARCHAR(128) NOT NULL,
    status          ENUM('draft','running','paused','concluded') NOT NULL DEFAULT 'draft',
    traffic_split   JSON NOT NULL COMMENT '{template_id: weight, ...}',
    winner_template_id INT UNSIGNED,
    started_at      DATETIME,
    concluded_at    DATETIME,
    created_at      DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_group (group_id, status)
) ENGINE=InnoDB;

CREATE TABLE render_cache (
    cache_key   VARCHAR(64) PRIMARY KEY COMMENT 'SHA256 of template_id+variables_json',
    rendered_html MEDIUMTEXT NOT NULL,
    rendered_text TEXT,
    expires_at  DATETIME NOT NULL,
    INDEX idx_expiry (expires_at)
) ENGINE=InnoDB;

CREATE TABLE send_log (
    id              BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    template_id     INT UNSIGNED NOT NULL,
    experiment_id   INT UNSIGNED,
    recipient_email VARCHAR(256) NOT NULL,
    variables_hash  CHAR(16) COMMENT 'first 16 chars of SHA256 for analytics grouping',
    status          ENUM('sent','bounced','opened','clicked','unsubscribed') NOT NULL DEFAULT 'sent',
    sent_at         DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    opened_at       DATETIME,
    INDEX idx_template (template_id, sent_at),
    INDEX idx_experiment (experiment_id, status)
) ENGINE=InnoDB;

template_groups is the logical unit (e.g., "order-confirmation"). A group has multiple templates — one per (version, locale) combination. layouts are reusable HTML shells (header, footer, brand styles) that wrap template content so brand updates apply globally without editing every template. ab_experiments maps a group to a weighted set of template variants for split testing. render_cache avoids re-rendering identical variable sets. send_log records per-send events and links back to experiment rows for statistical analysis.

Rendering Pipeline

Step 1: Template Resolution

Input: group name (or ID), locale, recipient context. Resolution order:

If an active A/B experiment exists for the group, select the template variant by deterministic hashing on recipient_id rather than at random, so the same recipient always gets the same variant within an experiment.
Otherwise, look up the active template for (group_id, locale). If not found, fall back to (group_id, 'en') as the default locale. Log the fallback for internationalization gap tracking.
If still not found, raise a TemplateNotFoundError — never silently send a blank email.
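
The locale fallback in the resolution order can be sketched as follows. This is a minimal illustration, not the engine's actual code: the Template dataclass, the in-memory list of rows, and the logger name are assumptions made for the example, and the A/B branch is left out.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("template_resolution")  # hypothetical logger name

DEFAULT_LOCALE = "en"


class TemplateNotFoundError(LookupError):
    """Raised when no active template exists for the group in a usable locale."""


@dataclass(frozen=True)
class Template:
    # Illustrative subset of the templates table columns.
    id: int
    group_id: int
    version: int
    locale: str
    is_active: bool


def resolve_template(templates: list[Template], group_id: int, locale: str) -> Template:
    """Return the active template for (group_id, locale), falling back to 'en'."""
    active = {
        t.locale: t for t in templates if t.group_id == group_id and t.is_active
    }
    if locale in active:
        return active[locale]
    if DEFAULT_LOCALE in active:
        # Log the fallback so missing translations can be tracked.
        logger.warning("locale fallback: group=%s requested=%s", group_id, locale)
        return active[DEFAULT_LOCALE]
    # Never send a blank email: fail instead.
    raise TemplateNotFoundError(f"no active template for group {group_id} ({locale})")
```

In a real deployment the lookup would be a query against idx_group_active rather than a scan over a list.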

Step 2: Variable Validation

The templates.variables JSON column defines the schema: each variable has a type (string, number, boolean, array), a required flag, and an optional default. Before rendering, validate the caller-supplied variable map against the schema:

Missing required variables without defaults: raise ValidationError with the list of missing keys.
Type mismatches: coerce where safe (number to string) or raise for unsafe mismatches (string where array expected).
Unknown variables: pass through — do not reject unknown keys; templates evolve and callers may send extra context.
Apply defaults for missing optional variables.
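
A minimal sketch of these validation rules, assuming the schema shape from the templates.variables comment ({name: {type, required, default}}). The function name, the ValidationError constructor, and the choice to report type mismatches through the same exception are assumptions for illustration.

```python
class ValidationError(ValueError):
    """Raised for missing required variables or unsafe type mismatches."""

    def __init__(self, message: str, missing=()):
        super().__init__(message)
        self.missing = sorted(missing)


_TYPES = {"string": str, "boolean": bool, "array": list}


def _matches(value, expected: str) -> bool:
    if expected == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    return isinstance(value, _TYPES[expected])


def validate_variables(schema: dict, supplied: dict) -> dict:
    """Return the resolved variable map, or raise ValidationError."""
    resolved = dict(supplied)  # unknown keys pass through untouched
    missing = []
    for name, spec in schema.items():
        if name not in resolved:
            if "default" in spec:
                resolved[name] = spec["default"]
            elif spec.get("required", False):
                missing.append(name)
            continue
        value = resolved[name]
        expected = spec.get("type", "string")
        if expected == "string" and _matches(value, "number"):
            resolved[name] = str(value)  # safe coercion: number to string
        elif not _matches(value, expected):
            raise ValidationError(
                f"variable {name!r}: expected {expected}, got {type(value).__name__}"
            )
    if missing:
        raise ValidationError(f"missing required variables: {sorted(missing)}", missing)
    return resolved
```

Collecting every missing key before raising lets the caller fix all of them in one pass instead of discovering them one at a time.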

Step 3: Rendering

Compute a cache key: SHA256(template_id + canonical JSON of resolved variables). Check render_cache. On a hit, return cached HTML and text. On a miss:

HTML rendering: Use a logic-less template engine (Mustache) or a logic-light one (Jinja2, Handlebars) to substitute variables into html_body. Logic-less is preferable for email — business logic in templates creates maintenance nightmares and security risks (server-side template injection).
Layout injection: If the template has a layout_id, load the layout's html_body and replace the {{content}} placeholder with the rendered html_body. This produces the final framed HTML.
CSS inlining: Email clients handle <style> blocks inconsistently, and some strip them entirely in certain contexts (Gmail did so for years). Use a CSS inliner (e.g., Premailer) to convert stylesheet rules to inline style attributes. This step is expensive, which is a key reason to cache the output.
Plain text generation: If text_body is defined, render it with variable substitution. If not, auto-generate from HTML using an HTML-to-text converter (strip tags, convert links to "Link Text [URL]" format). Always include a plain text alternative as a multipart/alternative part (defined in RFC 2046). The mail standards do not strictly require it, but many spam filters and text-only clients expect it.
Image URL rewriting: Rewrite relative image URLs to absolute CDN URLs. Optionally add tracking query parameters to image URLs for open tracking (the image request signals an open event).
Link tracking: Rewrite href attributes to pass through a tracking redirect URL that logs the click event and redirects to the original destination.
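Store the result in render_cache with expires_at = NOW() + 24 hours. If the tracking parameters added in the two previous steps are recipient-specific, cache the output before those rewrites and apply them per send; otherwise one recipient's tracking URLs would be served to another.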
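
The cache key from the start of this step might be computed as below. This is a sketch under assumptions: the function name and the "|" separator are illustrative, and the optional pipeline_version argument anticipates the version stamp recommended under Failure Handling.

```python
import hashlib
import json


def render_cache_key(template_id: int, variables: dict, pipeline_version: str = "") -> str:
    """SHA-256 hex digest over the template id and canonical variables JSON."""
    # Canonical JSON: sorted keys and fixed separators, so that two dicts
    # with the same content always serialize to the same bytes.
    canonical = json.dumps(
        variables, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    payload = f"{pipeline_version}|{template_id}|{canonical}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()  # 64 hex chars, fits VARCHAR(64)
```

Without canonicalization, {"a": 1, "b": 2} and {"b": 2, "a": 1} would hash differently and the cache would miss on logically identical inputs.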

Step 4: Sending

Pass the rendered HTML, plain text, and subject (with variable substitution applied) to the email sending service (SES, SendGrid, Mailgun). Record a row in send_log with template_id, experiment_id (if applicable), and recipient_email.

Template Versioning

Versioning follows an immutable model: once a template version is published (is_active = 1), its html_body is never modified. Edits create a new version row (version N+1) in draft state. A review/approval workflow (manual or automated) transitions the new version to active and deactivates the previous version. This means:

You can always render historical versions for debugging ("what email did user X receive in January?").
A/B experiments reference specific version IDs — concluding an experiment does not break if the template was meanwhile updated.
Rolling back is just setting a previous version's is_active = 1 and the current version's is_active = 0.

The unique key on (group_id, version, locale) enforces that version numbers are per-group, not global — different locales of the same logical email share a version number but are independent rows.
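
Activation and rollback are the same operation under this model: flip is_active without touching any body. The sketch below uses Python's built-in sqlite3 module as a stand-in so it can run anywhere; the function name is hypothetical, and a MySQL version would add the row lock discussed under Failure Handling.

```python
import sqlite3


def activate_version(conn: sqlite3.Connection, group_id: int, locale: str, version: int) -> None:
    """Make `version` the only active row for (group_id, locale)."""
    with conn:  # the two UPDATEs commit together or roll back together
        target = conn.execute(
            "SELECT id FROM templates WHERE group_id = ? AND locale = ? AND version = ?",
            (group_id, locale, version),
        ).fetchone()
        if target is None:
            raise LookupError(f"no version {version} for group {group_id} ({locale})")
        # Deactivate whatever is currently live for this group and locale.
        conn.execute(
            "UPDATE templates SET is_active = 0 "
            "WHERE group_id = ? AND locale = ? AND is_active = 1",
            (group_id, locale),
        )
        # Activate the requested version; html_body is never modified.
        conn.execute("UPDATE templates SET is_active = 1 WHERE id = ?", (target[0],))
```

Calling activate_version with an older version number performs a rollback, and the newer version's row stays intact for later inspection.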

A/B Testing

An experiment is a weighted assignment of recipients to template variants within the same group. Key design points:

Deterministic bucketing: Use hash(recipient_id + experiment_id) mod 100 to assign a bucket, with a hash that is stable across processes and deployments (a SHA-256 digest, for example, not a per-process randomized hash). Map buckets to variants by the traffic_split weights. This guarantees the same recipient always sees the same variant, which is critical for multi-email journeys and for keeping open rate metrics free of users who received both variants.
Control variant: Include the existing active template as a control in the traffic_split. Statistical comparison requires a control.
Metric collection: send_log.status tracks sent, opened, clicked, bounced. Open rate and click rate per template_id are the primary metrics. Query: SELECT template_id, COUNT(*) total, SUM(opened_at IS NOT NULL) opens FROM send_log WHERE experiment_id = X GROUP BY template_id.
Statistical significance: Do not conclude experiments based on raw rates alone. Apply a chi-squared test or a Bayesian model. The system should expose raw counts via an analytics API; significance computation belongs in a data science tool, not the template engine.
Winner promotion: When an experiment is concluded, set winner_template_id and update the group's active template to the winner. All future sends use the winner without the routing overhead.
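
The bucket assignment described above could look like the following sketch. It assumes traffic_split weights are integers that sum to 100 and that template IDs are iterated in sorted order so bucket ranges stay stable; the function name and the ":" separator are illustrative choices.

```python
import hashlib


def assign_variant(recipient_id: str, experiment_id: int, traffic_split: dict[int, int]) -> int:
    """Map a recipient to a template_id using a stable hash bucket in [0, 100)."""
    if sum(traffic_split.values()) != 100:
        raise ValueError("traffic_split weights must sum to 100")
    # SHA-256 is stable across processes, unlike Python's built-in hash()
    # for strings, which is randomized per process by default.
    digest = hashlib.sha256(f"{recipient_id}:{experiment_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    upper = 0
    for template_id in sorted(traffic_split):
        upper += traffic_split[template_id]
        if bucket < upper:
            return template_id
    raise AssertionError("unreachable: weights sum to 100")
```

Including experiment_id in the hashed value means a recipient's bucket in one experiment does not predict their bucket in the next, which avoids the same users always landing in the first variant.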

Localization

Each (group, version, locale) is an independent template row. The rendering pipeline falls back to 'en' if the requested locale is missing.
Subject lines, body copy, and even image URLs (localized banners) can differ per locale.
Date, number, and currency formatting within templates should be handled by the variable substitution step — pass pre-formatted strings (e.g., "April 17, 2026" not a Unix timestamp) from the caller, or use locale-aware formatting helpers in the template engine.
RTL locales (Arabic, Hebrew) require an HTML attribute change (dir="rtl") and potentially different layout templates. Point templates for RTL locales at a dedicated RTL layout through layout_id.
Translation workflow: export html_body for all group+version combinations to a translation management system (Phrase, Lokalise). Import translated strings as new locale rows. Version them alongside the source locale.

Key Design Decisions and Trade-offs

Logic-less vs. logic-full templates: Mustache prevents template authors from embedding business logic, which keeps templates safe and auditable. Handlebars adds block helpers ({{#if}}, {{#each}}) that cover most real-world conditional needs without enabling arbitrary code execution. Avoid Jinja2 or Twig in email templates unless you fully sandbox execution: server-side template injection, where user-controlled input ends up evaluated as template source, is a critical vulnerability.
CSS inlining at render time vs. build time: Inlining at render time with caching is the practical choice. Build-time inlining requires a separate pipeline for every template change and does not work for variable-dependent styles.
Render cache granularity: Caching at the (template_id, variables) level pays off only when many sends share the same template and the same resolved variables, such as notifications with few, low-cardinality variables. For emails where each recipient has unique personalization (most marketing sends, and transactional emails keyed on an order or reset token), cache hit rates are near zero. Skip the cache for those use cases and accept the rendering cost.
Separate text_body vs. auto-generation: Auto-generated plain text is fast but often poor quality (table layouts become unreadable). For high-volume transactional emails, auto-generation is acceptable. For marketing emails, hand-crafted text_body improves deliverability and accessibility.
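
To make the auto-generation trade-off concrete, here is a rough HTML-to-text converter built only on Python's standard library html.parser. It is a sketch, not a production converter: the sets of block and skipped tags are assumptions, and table layouts will still flatten poorly, which is exactly the quality problem described above.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    BLOCK_TAGS = {"p", "div", "br", "tr", "li", "h1", "h2", "h3"}
    SKIP_TAGS = {"style", "script", "head"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0
        self._href = None

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "a":
            self._href = dict(attrs).get("href")
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag == "a" and self._href:
            # Emit links in "Link Text [URL]" form.
            self.parts.append(f" [{self._href}]")
            self._href = None
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Strip tags, keep link targets, and collapse whitespace line by line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)
```

For a simple two-paragraph email this yields readable output, but a multi-column table layout comes out as one cell per line, which is why a hand-written text_body is worth the effort for important marketing sends.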

Failure Handling and Edge Cases

Missing variables at render time: Never render a template with empty substitutions for required variables. A password reset email with a blank reset link is worse than a failed send. Fail loudly: return the error to the sending application and alert its owning team.
Render cache poisoning: If a bug in the rendering pipeline produces malformed HTML and it is cached, all sends using that cache key will be broken. Include a version stamp in the cache key that is incremented on pipeline code deployments, effectively invalidating all cached entries on deploy.
Layout not found: If a layout_id is specified but the layout row is deleted or deactivated, the render must fail — do not silently render the template without the layout (which would produce un-styled HTML missing header/footer).
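Concurrent template activation: Two admins activating different versions simultaneously could leave two active rows. Run the switch in a single database transaction that first takes a SELECT ... FOR UPDATE lock on the currently active row (or on the template_groups row, which exists even when no version is active yet), then deactivates the old version and activates the new one.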
Large variable payloads: A marketing email may include a user's full order history (array of items) as a template variable. Cap the serialized variables JSON at 64KB at the intake layer. For larger payloads, require the template to reference a pre-rendered partial via a data fetch helper rather than inlining all data.
A/B experiment data skew: If one variant has a higher hard bounce rate (invalid emails), its open rate will appear artificially depressed, because bounced messages count in the denominator but can never be opened. Exclude bounced sends from open rate denominators in experiment analysis queries.
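
A small worked example of how bounces in the denominator distort the comparison between variants. The counts are made up for illustration and the function name is hypothetical.

```python
def open_rate(sent: int, bounced: int, opened: int) -> float:
    """Open rate over delivered messages (sent minus bounced)."""
    delivered = sent - bounced
    if delivered <= 0:
        return 0.0
    return opened / delivered


# Hypothetical counts: variant B was sent to a list with many invalid addresses.
variant_a = {"sent": 1000, "bounced": 20, "opened": 196}
variant_b = {"sent": 1000, "bounced": 200, "opened": 160}

raw_a = variant_a["opened"] / variant_a["sent"]  # 0.196
raw_b = variant_b["opened"] / variant_b["sent"]  # 0.160
adj_a = open_rate(**variant_a)  # 196 / 980
adj_b = open_rate(**variant_b)  # 160 / 800
```

On raw rates variant A looks clearly better, yet over delivered messages both variants perform the same (about 0.20), so the apparent gap comes from list quality rather than from the template.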

Scalability Considerations

Marketing send volume: Rendering 10M unique emails at 1ms per render = 10,000 seconds single-threaded. Parallelize rendering across a worker pool. With 100 workers each doing 1ms renders and 1ms network overhead, 10M renders complete in ~200 seconds — acceptable for a bulk campaign.
Render cache for bulk sends: For a marketing email where the only personalization is first name and a product recommendation (low cardinality variables), pre-render all variants before the send starts and populate the cache. The send workers then hit cache on every call.
Template storage: MEDIUMTEXT supports up to 16MB per html_body. Complex email templates with embedded base64 images can approach this limit — enforce a 500KB limit on html_body at upload time and require images to be CDN-hosted URLs.
send_log growth: At 10M sends/day, send_log grows by 3.65B rows/year. Partition by sent_at (monthly). Stream send_log inserts and status updates to a data warehouse (BigQuery, Redshift) for analytics, keeping only 30 days in MySQL for operational queries.
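CDN for rendered output: For campaigns where all recipients receive identical content (no personalization), pre-render once and reuse the same body for every send. Email clients do not fetch a message body from a URL, so the CDN's role is to host the images and the hosted "view in browser" copy of the email. This shifts bandwidth on email opens from the rendering service to the CDN.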
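
The capacity figures in this section follow from simple arithmetic, reproduced here so the inputs can be varied. The function name is an assumption, and the model assumes perfectly even work distribution with no queueing or retries.

```python
def bulk_render_seconds(emails: int, render_ms: float, overhead_ms: float, workers: int) -> float:
    """Wall-clock seconds to render `emails` messages across `workers` in parallel."""
    return emails * (render_ms + overhead_ms) / workers / 1000.0


single_threaded = bulk_render_seconds(10_000_000, 1, 0, 1)    # 10000.0 seconds
worker_pool = bulk_render_seconds(10_000_000, 1, 1, 100)      # 200.0 seconds
send_log_rows_per_year = 10_000_000 * 365                     # 3_650_000_000 rows
```

Doubling the worker count halves the estimate only while the cache, database, and sending service keep up, so the number is best read as a lower bound.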

Summary

An email template engine is more than a string interpolation library — it is a versioned content management system with a multi-stage rendering pipeline, a locale fallback chain, a render cache, and an A/B testing framework all working together. The immutable versioning model is the foundation: it makes rollbacks safe, keeps experiment results reproducible, and provides an audit trail for compliance. The rendering pipeline's most expensive steps (CSS inlining, HTML-to-text conversion) make caching essential for high-volume sends. The A/B testing design must prioritize consistent recipient assignment and clean metric collection over algorithmic sophistication — the statistical analysis belongs downstream, not inside the engine.
