Low Level Design: Thumbnail Generator Service

What Is a Thumbnail Generator Service?

A thumbnail generator service automatically produces small preview images from source assets — videos, documents, and images — and makes them available via a fast CDN-backed URL. Thumbnails are read-heavy but write-once: generated once on demand or on upload, then served millions of times. The design must optimize for low generation latency, high cache hit rates, and graceful degradation when source assets are unavailable.

Data Model


CREATE TABLE thumbnails (
  id            BIGINT PRIMARY KEY AUTO_INCREMENT,
  asset_type    ENUM('image', 'video', 'document') NOT NULL,
  asset_id      BIGINT NOT NULL,
  profile       VARCHAR(64) NOT NULL,    -- e.g. 200x200_crop, 400x300_fit
  source_key    VARCHAR(512) NOT NULL,   -- original asset location
  output_key    VARCHAR(512),            -- generated thumbnail location
  status        ENUM('pending', 'done', 'failed') DEFAULT 'pending',
  attempt_count TINYINT DEFAULT 0,
  created_at    TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  UNIQUE KEY uq_asset_profile (asset_type, asset_id, profile)
);

CREATE TABLE thumbnail_profiles (
  name          VARCHAR(64) PRIMARY KEY,
  width         INT NOT NULL,
  height        INT NOT NULL,
  fit_mode      ENUM('crop', 'contain', 'cover') NOT NULL,
  format        ENUM('jpeg', 'webp', 'png') NOT NULL,
  quality       TINYINT DEFAULT 85,
  time_offset   FLOAT DEFAULT 0.0       -- for video: seconds into the clip
);
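To make the fit_mode semantics concrete, here is a minimal sketch (illustrative, not part of the service itself) of how a worker might compute output dimensions from a profile box. The function name and the treatment of crop as "cover, then trim" are assumptions for illustration:

```python
def fit_dimensions(src_w, src_h, box_w, box_h, fit_mode):
    """Compute scaled (width, height) for a source image against a profile box.

    'contain' scales the image to fit entirely inside the box (one side may
    fall short); 'cover' and 'crop' scale it to fill the box ('crop' then
    trims the overflow back to box_w x box_h in a later step).
    """
    scale_contain = min(box_w / src_w, box_h / src_h)
    scale_cover = max(box_w / src_w, box_h / src_h)
    if fit_mode == "contain":
        return round(src_w * scale_contain), round(src_h * scale_contain)
    if fit_mode in ("cover", "crop"):
        # dimensions after scaling, before any trim back to the box
        return round(src_w * scale_cover), round(src_h * scale_cover)
    raise ValueError(f"unknown fit_mode: {fit_mode}")
```

For a 1000x500 source and a 200x200 box, contain yields 200x100 while cover yields 400x200; crop would then trim the 400x200 result down to 200x200.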

Core Workflow

Thumbnail generation can be triggered two ways: eagerly at upload time, or lazily on first request. A hybrid approach works best in practice:

  1. Eager path: when an asset is uploaded, the upload service publishes an event. The thumbnail service subscribes, looks up all configured profiles for that asset type, inserts rows into the thumbnails table, and enqueues generation jobs. This pre-warms the cache before users request the asset.
  2. Lazy path: a thumbnail URL request arrives at the API gateway for an asset/profile combination that has not been generated yet (e.g., a new profile added after upload). The API checks the thumbnails table; if the row is missing or pending, it enqueues a high-priority job and returns a placeholder image with a Retry-After header. The client polls until the real URL becomes valid.
  3. Generator Worker: pulls a job from the queue, downloads the source asset from object storage, extracts the frame (for video, seeks to time_offset using FFmpeg), applies the profile transformations using libvips, uploads the result, and updates the thumbnails row to done.
  4. CDN layer: thumbnail URLs map to a CDN in front of object storage. Cache-Control headers are set to long TTLs (e.g., max-age=31536000, immutable) because output keys include a hash of the profile + source, making them content-addressed and safe to cache indefinitely.

Failure Handling and Retry Logic

  • Source unavailable: if the source asset cannot be downloaded (404 or storage outage), the job is retried with backoff up to 5 times. After all retries the row is marked failed and a fallback placeholder URL is stored in output_key so the API always has something to return.
  • Corrupt source: decoding errors are treated as permanent failures. The job goes to the dead-letter queue (DLQ) immediately; retrying a corrupt file is wasteful.
  • Idempotency: the UNIQUE KEY on (asset_type, asset_id, profile) prevents duplicate rows if the event is delivered more than once. Workers check for an existing done row before processing and skip if found.
  • Partial profile failure: each profile is an independent job. Failure of the 400×300 variant does not block delivery of the 200×200 variant.

Scalability Considerations

  • Caching at the API layer: cache thumbnails-table lookups in Redis with a short TTL (30 seconds) for pending items and a long TTL (24 hours) for done items. This avoids database reads on every URL resolution request under high load.
  • Read-through CDN: serve thumbnails directly from CDN; origin (object storage) is only hit on cache miss. At scale, origin hit rates drop below 0.1%.
  • Profile versioning: when a profile definition changes, bump a version suffix in the output key (e.g., thumb_200x200_v2). Old cached URLs remain valid; new requests generate against the updated profile without requiring mass CDN invalidation.
  • Worker sizing: video frame extraction is more expensive than image resizing. Maintain separate queues — one for video-source jobs (heavier workers) and one for image/document jobs (lighter workers) — to prevent video jobs from starving simpler work.
  • Batch pre-generation: when a new profile is added, a backfill job iterates all existing assets and enqueues generation. Rate-limit the backfill queue to avoid starving real-time jobs.
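Rate-limiting the backfill queue can be done with a token bucket in front of the enqueue call. A minimal sketch (the class and its interface are illustrative; a production system would more likely use a shared limiter in Redis or the queue's own throttling):

```python
import time

class TokenBucket:
    """Token bucket to rate-limit backfill enqueues so a new-profile backfill
    over billions of assets cannot starve real-time generation jobs."""

    def __init__(self, rate: float, burst: float, now=None):
        self.rate = rate          # tokens added per second
        self.capacity = burst     # maximum stored tokens
        self.tokens = burst
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None) -> bool:
        """Return True and consume a token if the enqueue may proceed."""
        now = time.monotonic() if now is None else now
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

The backfill loop calls allow() before each enqueue and sleeps briefly when it returns False, capping backfill throughput at roughly rate jobs per second.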

Summary

A thumbnail generator service is a good interview topic because it blends write-once semantics, queue-based async processing, and read-heavy CDN delivery. The key design decisions are: eager vs. lazy generation, content-addressed output keys for safe long-term caching, and profile-level job isolation for independent retry. Be prepared to discuss how you would handle a sudden profile change across billions of existing assets — versioned keys and background backfill are the standard answer.


