What Is a Thumbnail Generator Service?
A thumbnail generator service automatically produces small preview images from source assets — videos, documents, and images — and makes them available via a fast CDN-backed URL. Thumbnails are read-heavy but write-once: generated once on demand or on upload, then served millions of times. The design must optimize for low generation latency, high cache hit rates, and graceful degradation when source assets are unavailable.
Data Model
CREATE TABLE thumbnails (
    id            BIGINT PRIMARY KEY AUTO_INCREMENT,
    asset_type    ENUM('image', 'video', 'document') NOT NULL,
    asset_id      BIGINT NOT NULL,
    profile       VARCHAR(64) NOT NULL,   -- e.g. 200x200_crop, 400x300_fit
    source_key    VARCHAR(512) NOT NULL,  -- original asset location
    output_key    VARCHAR(512),           -- generated thumbnail location
    status        ENUM('pending', 'done', 'failed') DEFAULT 'pending',
    attempt_count TINYINT DEFAULT 0,
    created_at    TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    UNIQUE KEY uq_asset_profile (asset_type, asset_id, profile)
);

CREATE TABLE thumbnail_profiles (
    name        VARCHAR(64) PRIMARY KEY,
    width       INT NOT NULL,
    height      INT NOT NULL,
    fit_mode    ENUM('crop', 'contain', 'cover') NOT NULL,
    format      ENUM('jpeg', 'webp', 'png') NOT NULL,
    quality     TINYINT DEFAULT 85,
    time_offset FLOAT DEFAULT 0.0         -- for video: seconds into the clip
);
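The unique key on (asset_type, asset_id, profile) is what lets the service tolerate duplicate upload events. A minimal sketch of that behavior follows, using Python's built-in sqlite3 module as a stand-in for the MySQL schema. The `register_thumbnail` helper, the simplified column types, and the sample values are assumptions for illustration; in MySQL the comparable statement would be `INSERT IGNORE` or `INSERT ... ON DUPLICATE KEY UPDATE`.

```python
import sqlite3

# Simplified SQLite stand-in for the MySQL `thumbnails` table
# (assumption: ENUM columns become TEXT, created_at is omitted).
SCHEMA = """
CREATE TABLE thumbnails (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    asset_type TEXT NOT NULL,
    asset_id INTEGER NOT NULL,
    profile TEXT NOT NULL,
    source_key TEXT NOT NULL,
    output_key TEXT,
    status TEXT DEFAULT 'pending',
    attempt_count INTEGER DEFAULT 0,
    UNIQUE (asset_type, asset_id, profile)
);
"""


def register_thumbnail(conn, asset_type, asset_id, profile, source_key):
    """Insert a pending row; return True only if a new row was created."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO thumbnails "
        "(asset_type, asset_id, profile, source_key) VALUES (?, ?, ?, ?)",
        (asset_type, asset_id, profile, source_key),
    )
    # rowcount is 0 when the unique constraint caused the insert to be ignored.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Simulate the same upload event being delivered twice.
first = register_thumbnail(conn, "image", 42, "200x200_crop", "uploads/42.jpg")
second = register_thumbnail(conn, "image", 42, "200x200_crop", "uploads/42.jpg")
row_count = conn.execute("SELECT COUNT(*) FROM thumbnails").fetchone()[0]
```

Under these assumptions, a caller would enqueue a generation job only when `register_thumbnail` returns True, so a redelivered event creates neither a second row nor a second job.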
Core Workflow
Thumbnail generation can be triggered two ways: eagerly at upload time, or lazily on first request. A hybrid approach works best in practice:
- Eager path: when an asset is uploaded, the upload service publishes an event. The thumbnail service subscribes, looks up all configured profiles for that asset type, inserts `thumbnails` rows, and enqueues generation jobs. This pre-warms the cache before users request the asset.
- Lazy path: a thumbnail URL request arrives at the API gateway for a combination that has not been generated yet (e.g., a new profile added after upload). The API checks the `thumbnails` table; if the row is missing or pending, it enqueues a high-priority job and returns a placeholder image with a `Retry-After` header. The client polls until the URL becomes valid.
- Generator worker: pulls a job from the queue, downloads the source asset from object storage, extracts the frame (for video, seeks to `time_offset` using FFmpeg), applies the profile transformations using libvips, uploads the result, and updates the `thumbnails` row to `done`.
- CDN layer: thumbnail URLs map to a CDN in front of object storage. Cache-Control headers are set to long TTLs (e.g., `max-age=31536000, immutable`) because output keys include a hash of the profile + source, making them content-addressed and safe to cache indefinitely.
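One way to picture the content-addressed output key from the CDN step is the sketch below. It is illustrative only: the `output_key` function, the `thumbs/` prefix, the version suffix, and the choice of SHA-256 over the raw source bytes are all assumptions, not details given in the design. For large videos, hashing a storage ETag or source version identifier instead of the full content may be more practical.

```python
import hashlib

# Header value taken from the workflow description; it is safe only because
# any change to the source or profile produces a different key (and URL).
CACHE_CONTROL = "max-age=31536000, immutable"


def output_key(source_bytes: bytes, profile_name: str, profile_version: int = 1) -> str:
    """Derive a deterministic object-storage key from source content and profile.

    The key layout is hypothetical; the point is that identical inputs always
    map to the same key, and any changed input maps to a new one.
    """
    digest = hashlib.sha256()
    digest.update(profile_name.encode("utf-8"))
    digest.update(b"\x00")  # separator so adjacent fields cannot run together
    digest.update(str(profile_version).encode("utf-8"))
    digest.update(b"\x00")
    digest.update(source_bytes)
    return f"thumbs/{profile_name}_v{profile_version}/{digest.hexdigest()[:32]}"


key_a = output_key(b"fake image bytes", "200x200_crop")
key_b = output_key(b"fake image bytes", "200x200_crop")
key_c = output_key(b"fake image bytes", "400x300_fit")
```

Here `key_a` and `key_b` are identical, so a repeated job overwrites the same object, while `key_c` differs because the profile differs.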
Failure Handling and Retry Logic
- Source unavailable: if the source asset cannot be downloaded (404 or storage outage), the job is retried with backoff up to 5 times. After all retries the row is marked `failed` and a fallback placeholder URL is stored in `output_key` so the API always has something to return.
- Corrupt source: decoding errors are treated as permanent failures. The job goes to the dead-letter queue (DLQ) immediately; retrying a corrupt file is wasteful.
- Idempotency: the `UNIQUE KEY` on `(asset_type, asset_id, profile)` prevents duplicate rows if the event is delivered more than once. Workers check for an existing `done` row before processing and skip if found.
- Partial profile failure: each profile is an independent job. Failure of the 400×300 variant does not block delivery of the 200×200 variant.
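The failure rules above can be sketched as a single decision function. This is a rough illustration rather than a definitive worker: the exception classes, the `handle_job` and `backoff_delay` names, the placeholder key, the backoff base and cap, and the reading of "up to 5 times" as five total attempts are all assumptions.

```python
import random

MAX_ATTEMPTS = 5  # assumption: "up to 5 times" is read as 5 total attempts
PLACEHOLDER_KEY = "placeholders/unavailable.png"  # hypothetical fallback object


class SourceUnavailable(Exception):
    """Transient failure: source returned 404 or storage was unreachable."""


class CorruptSource(Exception):
    """Permanent failure: the source could not be decoded."""


def backoff_delay(attempt: int, base: float = 2.0, cap: float = 300.0) -> float:
    """Exponential backoff with full jitter; base and cap are assumed values."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def handle_job(job: dict, generate) -> str:
    """Run one attempt and return the next action: done, retry, failed, or dlq."""
    if job.get("status") == "done":
        return "done"  # idempotency: skip work that already succeeded
    try:
        job["output_key"] = generate(job)
        job["status"] = "done"
        return "done"
    except CorruptSource:
        job["status"] = "failed"
        return "dlq"  # permanent error: no retry
    except SourceUnavailable:
        job["attempt_count"] += 1
        if job["attempt_count"] >= MAX_ATTEMPTS:
            job["status"] = "failed"
            job["output_key"] = PLACEHOLDER_KEY
            return "failed"
        job["retry_in"] = backoff_delay(job["attempt_count"])
        return "retry"


def always_missing(job):
    """Stand-in generator that simulates a source that never becomes available."""
    raise SourceUnavailable(job["source_key"])


job = {"source_key": "uploads/42.mp4", "status": "pending", "attempt_count": 0}
actions = []
while True:
    action = handle_job(job, always_missing)
    actions.append(action)
    if action != "retry":
        break
```

With a source that never becomes available, the loop yields four `retry` actions followed by `failed`, and the row ends with the placeholder in `output_key`. Because each profile is its own job, this outcome affects only the one variant.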
Scalability Considerations
- Caching at the API layer: cache `thumbnails` lookups in Redis with a short TTL (30 seconds) for pending items and a long TTL (24 hours) for done items. This avoids database reads on every URL resolution request under high load.
- Read-through CDN: serve thumbnails directly from the CDN; the origin (object storage) is only hit on a cache miss. With long TTLs and content-addressed keys, origin hit rates can fall to a small fraction of a percent (for example, below 0.1%).
- Profile versioning: when a profile definition changes, bump a version suffix in the output key (e.g., `thumb_200x200_v2`). Old cached URLs remain valid; new requests generate against the updated profile without requiring a large-scale CDN invalidation.
- Worker sizing: video frame extraction is more expensive than image resizing. Maintain separate queues, one for video-source jobs (heavier workers) and one for image/document jobs (lighter workers), to prevent video jobs from starving simpler work.
- Batch pre-generation: when a new profile is added, a backfill job iterates all existing assets and enqueues generation. Rate-limit the backfill queue to avoid starving real-time jobs.
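Two of these rules are simple enough to express as small pure functions. The sketch below is illustrative: the function names and queue names are hypothetical, and the choice not to cache `failed` lookups is an assumption, since the text only gives TTLs for pending and done rows.

```python
from typing import Optional

PENDING_TTL_SECONDS = 30          # short TTL: status is expected to change soon
DONE_TTL_SECONDS = 24 * 60 * 60   # long TTL: a done row is effectively immutable


def cache_ttl(status: str) -> Optional[int]:
    """Return the Redis TTL for a cached lookup, or None to skip caching."""
    if status == "done":
        return DONE_TTL_SECONDS
    if status == "pending":
        return PENDING_TTL_SECONDS
    return None  # assumption: failed rows are not cached


def queue_for(asset_type: str, backfill: bool = False) -> str:
    """Pick a queue so video jobs and backfill jobs cannot starve other work."""
    base = "thumbs-video" if asset_type == "video" else "thumbs-light"
    # Backfill traffic gets its own (rate-limited) queue per worker class.
    return f"{base}-backfill" if backfill else base


ttl_done = cache_ttl("done")
ttl_pending = cache_ttl("pending")
video_queue = queue_for("video")
doc_backfill_queue = queue_for("document", backfill=True)
```

Keeping backfill on separate queue names means the rate limit can be applied at the queue level without touching the real-time path.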
Summary
A thumbnail generator service is a good interview topic because it blends write-once semantics, queue-based async processing, and read-heavy CDN delivery. The key design decisions are: eager vs. lazy generation, content-addressed output keys for safe long-term caching, and profile-level job isolation for independent retry. Be prepared to discuss how you would handle a sudden profile change across billions of existing assets — versioned keys and background backfill are the standard answer.
Frequently Asked Questions
What is the overall design of a thumbnail generation service?
A thumbnail generator service receives requests (either synchronously via API or asynchronously via a queue), fetches or receives the source media (image or video frame), applies resizing and cropping transformations to produce standardized thumbnail dimensions, stores the output in object storage, and returns or caches the thumbnail URL. A CDN sits in front for high-volume read traffic.
How do you avoid redundant thumbnail generation for the same source?
Use a deterministic key derived from the source media identifier and transformation parameters (dimensions, crop mode, quality) to check a cache or object storage before processing. If the key already exists, return the cached URL immediately. This content-addressable approach eliminates duplicate work and is especially important at the scale of platforms like Snap or Meta.
How do you extract a representative frame from a video for thumbnail generation?
Common approaches include extracting the frame at a fixed timestamp offset (e.g., 10% into the video), selecting the frame with the highest perceptual quality score using heuristics (sharpness, brightness), or using ML-based models to identify visually appealing or semantically meaningful frames. The chosen frame is then passed through the standard image resize/crop pipeline.
How do you scale thumbnail generation to handle millions of uploads per day?
Scale by processing thumbnail jobs asynchronously via a distributed message queue so the upload path is non-blocking. Use auto-scaling worker pools sized to queue depth. For read-heavy workloads, serve thumbnails exclusively through a CDN with long cache TTLs. Lazy generation (generate on first request, then cache) can also be effective for long-tail content that may never be viewed.