Image Resizing Service — Low-Level Design
An image resizing service generates and serves optimized image variants (thumbnails, responsive sizes, WebP) on demand or at upload time. It must handle high read volume, use storage efficiently, and deal gracefully with original source images (large files, unsupported formats, embedded metadata). This design question comes up at Instagram, Cloudflare, and other media-heavy platforms.
Two Approaches: Eager vs. Lazy Resizing
Eager (pre-generate on upload):
  When a user uploads an image → immediately generate all required variants
  (thumb 150×150, card 400×300, hero 1200×630, plus WebP versions)
  Pros: no generation latency at serve time; simple CDN caching
  Cons: wasted work and storage if many variants are never requested;
        slow upload response if the request waits for all variants
Lazy (generate on first request):
  Store only the original; generate variants on first request, then cache
  Pros: only generates what is needed; fast upload
  Cons: first request for any variant is slow (generation time)
  Implementation: Nginx + image processing proxy (imgproxy, thumbor, Cloudflare Images)
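The lazy flow can be sketched as a get-or-generate lookup. This is a minimal illustration under stated assumptions, not how imgproxy or thumbor work internally: `VariantCache` and `fake_resize` are hypothetical names, and a plain dict stands in for S3 or a CDN cache.

```python
from typing import Callable, Dict, Tuple

VariantKey = Tuple[str, str]  # (image_id, variant_name)


class VariantCache:
    """Hypothetical get-or-generate store; a dict stands in for S3 or a CDN."""

    def __init__(self, generate: Callable[[str, str], bytes]) -> None:
        self._generate = generate
        self._store: Dict[VariantKey, bytes] = {}
        self.generated = 0  # counts how often the slow path ran

    def get(self, image_id: str, variant: str) -> bytes:
        key = (image_id, variant)
        if key not in self._store:
            # Slow path: only the first request for a variant pays for resizing.
            self._store[key] = self._generate(image_id, variant)
            self.generated += 1
        return self._store[key]


def fake_resize(image_id: str, variant: str) -> bytes:
    # Placeholder for real resizing work (assumption for illustration).
    return f"{image_id}:{variant}".encode()


cache = VariantCache(fake_resize)
first = cache.get("abc", "thumb")   # generated on demand
second = cache.get("abc", "thumb")  # served from the cache
```

A real deployment would also need to stop many concurrent first requests for the same variant from each triggering a resize; that is left out here.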
Upload Pipeline (Eager Approach)
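import io
from PIL import Image

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # 10 MB limit
EXT_BY_FORMAT = {'JPEG': 'jpg', 'PNG': 'png', 'GIF': 'gif', 'WEBP': 'webp'}

def handle_image_upload(user_id, file_bytes, filename):
    # Validate size first, before spending CPU on decoding
    if len(file_bytes) > MAX_UPLOAD_BYTES:
        raise FileTooLarge()
    img = Image.open(io.BytesIO(file_bytes))
    fmt = img.format  # read now; images derived from img may not carry .format
    if fmt not in EXT_BY_FORMAT:
        raise InvalidFormat()

    # Strip EXIF metadata (privacy: GPS coordinates, device info)
    img = strip_exif(img)
    # Re-encode so the stored bytes are the stripped image, not the raw upload
    # (animated GIFs would additionally need save_all=True)
    buf = io.BytesIO()
    img.save(buf, format=fmt)

    # Store the sanitized original under an extension matching its format
    image_id = generate_uuid()
    original_key = f'images/{image_id}/original.{EXT_BY_FORMAT[fmt]}'
    s3.put_object(Bucket='media-bucket', Key=original_key, Body=buf.getvalue())

    # Insert the row before queueing so a fast worker always finds it.
    # ImageRecord is the DB model, named to avoid clashing with PIL's Image.
    db.insert(ImageRecord, {
        'id': image_id,
        'user_id': user_id,
        'original_key': original_key,
        'status': 'processing',
    })

    # Queue variant generation
    for variant in REQUIRED_VARIANTS:
        enqueue_resize_job(image_id, original_key, variant)
    return image_id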
REQUIRED_VARIANTS = [
    {'name': 'thumb', 'width': 150, 'height': 150, 'crop': 'fill'},
    {'name': 'card', 'width': 400, 'height': 300, 'crop': 'fit'},
    {'name': 'hero', 'width': 1200, 'height': 630, 'crop': 'fit'},
    {'name': 'avatar', 'width': 64, 'height': 64, 'crop': 'fill'},
]
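The upload handler leaves `enqueue_resize_job` undefined. One way to sketch it is with a bounded in-process queue, so that a burst of uploads is rejected rather than accepted without limit. Everything here is an assumption for illustration: the extra `jobs` parameter, the `QueueFull` error, and `make_job_queue` are hypothetical, and a production system would more likely use a durable broker than Python's `queue` module.

```python
import queue


class QueueFull(Exception):
    """Raised when the job queue is at capacity (hypothetical error type)."""


def make_job_queue(max_depth):
    # Bounded queue: put_nowait() fails once max_depth jobs are waiting.
    return queue.Queue(maxsize=max_depth)


def enqueue_resize_job(jobs, image_id, original_key, variant):
    """Add one resize job, or raise QueueFull instead of blocking the upload."""
    job = {'image_id': image_id, 'original_key': original_key, 'variant': variant}
    try:
        jobs.put_nowait(job)
    except queue.Full:
        raise QueueFull(f'resize queue is full, dropped {variant["name"]}') from None
    return job


jobs = make_job_queue(max_depth=2)
enqueue_resize_job(jobs, 'abc', 'images/abc/original.jpg', {'name': 'thumb'})
enqueue_resize_job(jobs, 'abc', 'images/abc/original.jpg', {'name': 'card'})

try:
    enqueue_resize_job(jobs, 'abc', 'images/abc/original.jpg', {'name': 'hero'})
    overflowed = False
except QueueFull:
    overflowed = True  # the third job exceeds max_depth=2
```

A worker would call `jobs.get()` in a loop and pass each job's fields to the resize function.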
Resize Worker
from PIL import Image, ImageOps
import io

CACHE_CONTROL = 'public, max-age=31536000, immutable'

def resize_image(image_id, original_key, variant):
    # Download original from S3
    obj = s3.get_object(Bucket='media-bucket', Key=original_key)
    img = Image.open(io.BytesIO(obj['Body'].read()))

    # Flatten to RGB on a white background (JPEG has no alpha channel)
    if img.mode in ('RGBA', 'LA', 'P'):
        img = img.convert('RGBA')  # normalizes palette and grayscale transparency
        background = Image.new('RGB', img.size, (255, 255, 255))
        background.paste(img, mask=img.getchannel('A'))
        img = background
    elif img.mode != 'RGB':
        img = img.convert('RGB')  # e.g. CMYK or grayscale sources

    target_w, target_h = variant['width'], variant['height']
    if variant['crop'] == 'fill':
        # Resize to cover the target box, then center-crop
        img = ImageOps.fit(img, (target_w, target_h), Image.LANCZOS)
    else:  # 'fit'
        # Shrink in place, keeping aspect ratio, no crop (never upscales)
        img.thumbnail((target_w, target_h), Image.LANCZOS)

    # Save as JPEG + WebP
    for fmt, ext, quality in [('JPEG', 'jpg', 85), ('WEBP', 'webp', 80)]:
        buf = io.BytesIO()
        img.save(buf, format=fmt, quality=quality, optimize=True)
        key = f'images/{image_id}/{variant["name"]}.{ext}'
        s3.put_object(Bucket='media-bucket', Key=key,
                      Body=buf.getvalue(),
                      ContentType=f'image/{fmt.lower()}',
                      CacheControl=CACHE_CONTROL)

    # Record the actual output size: a 'fit' result can be smaller than the target box
    db.execute("""
        INSERT INTO ImageVariant (image_id, variant_name, s3_key, width, height)
        VALUES (%(iid)s, %(name)s, %(key)s, %(w)s, %(h)s)
    """, {'iid': image_id, 'name': variant['name'],
          'key': f'images/{image_id}/{variant["name"]}.jpg',
          'w': img.width, 'h': img.height})
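To make the difference between the 'fit' and 'fill' modes concrete, the arithmetic behind them can be sketched without any imaging library. The helpers `fit_size` and `fill_crop_box` are hypothetical and only approximate what Pillow's `thumbnail` and `ImageOps.fit` compute; Pillow's own rounding may differ by a pixel.

```python
from typing import Tuple


def fit_size(src_w: int, src_h: int, max_w: int, max_h: int) -> Tuple[int, int]:
    """Size after a 'fit' resize: scale down to fit inside the box, never up."""
    scale = min(max_w / src_w, max_h / src_h, 1.0)
    return max(1, round(src_w * scale)), max(1, round(src_h * scale))


def fill_crop_box(src_w: int, src_h: int,
                  target_w: int, target_h: int) -> Tuple[int, int, int, int]:
    """Centered (left, top, right, bottom) box with the target's aspect ratio.

    Cropping the source to this box and then scaling it to the target size
    gives a 'fill' result with no distortion.
    """
    target_ratio = target_w / target_h
    if src_w / src_h > target_ratio:
        # Source is wider than the target: keep full height, trim the sides.
        crop_w, crop_h = round(src_h * target_ratio), src_h
    else:
        # Source is taller (or equal): keep full width, trim top and bottom.
        crop_w, crop_h = src_w, round(src_w / target_ratio)
    left = (src_w - crop_w) // 2
    top = (src_h - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


card = fit_size(3000, 3000, 400, 300)          # square source into a 400x300 box
small = fit_size(200, 100, 400, 300)           # already smaller than the box
thumb_box = fill_crop_box(4000, 3000, 150, 150)
```

The square source ends up 300×300 under 'fit', smaller than the 400×300 box, which is why the stored dimensions of a variant should come from the output image and not from the requested size.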
Serving via CDN
# URL structure:
#   https://cdn.example.com/images/{image_id}/{variant}.{ext}
# CDN origin: S3 bucket
# CDN caches variants at edge nodes (CloudFront, Fastly, Cloudflare)
def get_image_url(image_id, variant='card', fmt='webp'):
    return f'https://cdn.example.com/images/{image_id}/{variant}.{fmt}'

# HTML: serve WebP with a JPEG fallback (<picture> with a WebP <source> and a JPEG <img>)
# Cache-Control header on S3 objects:
#   Cache-Control: public, max-age=31536000, immutable
#   (one year; image variants never change once generated)
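The WebP-with-JPEG-fallback markup mentioned above can be generated from the same URL scheme. This is a sketch under assumptions: `picture_html` is a hypothetical helper, and the `.jpg` and `.webp` extensions follow the keys written by the resize worker.

```python
from html import escape

CDN_BASE = 'https://cdn.example.com'  # same example host as the URL structure above


def get_image_url(image_id, variant='card', fmt='webp'):
    return f'{CDN_BASE}/images/{image_id}/{variant}.{fmt}'


def picture_html(image_id, variant='card', alt=''):
    """Build a <picture> element: WebP for browsers that support it, JPEG otherwise."""
    webp = escape(get_image_url(image_id, variant, 'webp'), quote=True)
    jpg = escape(get_image_url(image_id, variant, 'jpg'), quote=True)
    return (
        '<picture>'
        f'<source srcset="{webp}" type="image/webp">'
        f'<img src="{jpg}" alt="{escape(alt, quote=True)}">'
        '</picture>'
    )


markup = picture_html('abc', 'thumb', alt='A "cat"')
```

Browsers pick the first `<source>` whose type they support and fall back to the `<img>`, so the choice of format happens on the client and the URLs stay cacheable without a `Vary` header.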
Key Interview Points
- Strip EXIF metadata on upload: Photos from mobile devices embed GPS coordinates, device model, and timestamp. Serving the original image leaks this data. Always strip EXIF before storing or serving user-uploaded images.
- Store variants as immutable files: Once generated, a variant never changes. Give each variant a unique key that is never reused (image_id/variant.ext) and set Cache-Control: immutable with a long max-age so CDNs and browsers can cache it for the full lifetime without revalidating.
- WebP reduces bandwidth 25-35% vs JPEG: Always generate WebP alongside JPEG. Use the HTML picture element with source type="image/webp" for browsers that support it, falling back to JPEG for those that don’t.
- Async generation, sync serving: Variant generation takes 100-500ms per image. Never block the upload HTTP response waiting for all variants. Generate asynchronously; serve a placeholder until variants are ready, then switch to the CDN URL.
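The placeholder-then-switch behaviour from the last point can be sketched as two small functions. The names `image_status` and `display_url`, the `'ready'` status value, and the placeholder URL are assumptions for illustration; only the `'processing'` status and the variant names come from the upload pipeline above.

```python
REQUIRED_VARIANT_NAMES = {'thumb', 'card', 'hero', 'avatar'}
PLACEHOLDER_URL = 'https://cdn.example.com/static/placeholder.svg'  # hypothetical asset


def image_status(completed_variants):
    """'ready' once every required variant exists, otherwise 'processing'."""
    if REQUIRED_VARIANT_NAMES <= set(completed_variants):
        return 'ready'
    return 'processing'


def display_url(image_id, status, variant='card', fmt='webp'):
    """Return the CDN URL for a finished image, or a placeholder while processing."""
    if status != 'ready':
        return PLACEHOLDER_URL
    return f'https://cdn.example.com/images/{image_id}/{variant}.{fmt}'


pending = image_status(['thumb', 'card'])
done = image_status(['thumb', 'card', 'hero', 'avatar'])
```

Serving the placeholder from a different URL than the final variant matters here: if a "not ready" response were ever cached under the variant's immutable URL, clients would keep seeing it for a year.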
Frequently Asked Questions

Why should image resizing happen on-demand rather than eagerly at upload time?
Eager resizing generates every variant at upload time: thumbnail (150×150), small (400px), medium (800px), and large (1200px), each at 1x and 2x retina, for 8 variants. For 10 million images that means 80 million resize operations and 80 million additional S3 objects, all paid for up front. Many of those variants are never requested (a mobile app may only need small and medium). On-demand resizing generates a variant only when it is first requested, then caches the result at the CDN edge so later requests skip the resize. Storage is proportional to actual demand, not to every possible variant. The tradeoff: the first request for a new variant is slower because the origin must resize. Mitigate with CDN edge caching and async pre-warming for predictable variants.

How does the CDN pull model eliminate origin load for repeated image requests?
Configure the CDN with an origin server (your image resizing service). On the first request for /images/product-123_400w.jpg, the CDN misses and forwards to the origin; the origin fetches the original from S3, resizes it to 400px wide, and returns the image with Cache-Control: public, max-age=31536000. The CDN caches the resized image at the edge node that served the request. Subsequent requests for that URL through that edge are served from CDN cache and never reach the origin. CDN hit rates for image-heavy sites are often 95% or higher, which leaves the origin handling roughly 5% of image requests. The long max-age is critical: without it, the CDN may fall back to a short default TTL or revalidate with the origin far more often.

How do you parse and validate resize parameters securely?
Never pass raw parameters from the URL to ImageMagick or FFmpeg commands; that invites command injection. Parse parameters strictly: extract width, height, format, and quality from the URL path or query string using a whitelist. Validate: width and height must be positive integers within allowed bounds (e.g., max 4000px); format must be one of [jpeg, webp, png, avif]; quality must be 1-100. Reject any parameter that fails validation with 400 Bad Request. Use a URL signature (HMAC-SHA256 of the path, keyed with a server-side secret) to prevent clients from requesting arbitrary dimensions: only URLs signed by your server are accepted. This prevents your service from being used as an arbitrary compute resource.

What image format should you serve by default and how do you detect browser support?
Serve WebP by default (roughly 25-35% smaller than JPEG at equivalent quality) when the client supports it. Detect support via the Accept header: if Accept contains image/webp, serve WebP; otherwise serve JPEG. When one URL can return different formats, the CDN must vary the cached response by Accept header: add Vary: Accept to the response so the CDN stores separate WebP and JPEG copies per URL. For next-generation formats, serve AVIF (usually smaller than WebP) when Accept contains image/avif. Fallback chain: AVIF → WebP → JPEG. In practice, all current major browsers support WebP, and Safari supports AVIF from version 16. The Vary header ensures each client gets a format it can decode and that WebP-capable browsers are not served JPEG.

How do you prevent an attacker from exhausting your resize workers?
Three defenses:
(1) URL signing: only accept resize requests for URLs signed with a server-side secret. An attacker cannot generate valid signed URLs for arbitrary dimensions.
(2) Rate limiting: limit resize requests per IP and per signed URL to prevent replaying valid URLs at high volume.
(3) Worker concurrency limits: image resizing is CPU-intensive, so cap concurrent resize operations per worker (e.g., one or two per CPU core). Use a queue with a max depth; reject with 503 when the queue is full rather than accepting unbounded work.
Additionally, set a timeout on resize operations (5 seconds max) and validate that the source image size is within bounds before downloading it from S3.
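The parameter validation and URL signing described in the FAQ above can be sketched with the standard library. This is an illustration under assumptions, not a definitive implementation: the query parameter names (`w`, `h`, `q`, `fmt`), the `BadRequest` error, the default quality of 80, and `SECRET_KEY` are all hypothetical.

```python
import hashlib
import hmac
import re

SECRET_KEY = b'replace-with-a-real-secret'  # hypothetical server-side secret
ALLOWED_FORMATS = {'jpeg', 'webp', 'png', 'avif'}
MAX_DIMENSION = 4000
_DIGITS = re.compile(r'[0-9]{1,4}')  # plain ASCII digits only, no signs or spaces


class BadRequest(ValueError):
    """Would map to HTTP 400 in a real request handler."""


def _bounded_int(raw, low, high, name):
    if not isinstance(raw, str) or not _DIGITS.fullmatch(raw):
        raise BadRequest(f'{name} must be a plain integer')
    value = int(raw)
    if not low <= value <= high:
        raise BadRequest(f'{name} must be between {low} and {high}')
    return value


def parse_resize_params(params):
    """Whitelist-parse query parameters; anything unexpected is rejected."""
    fmt = params.get('fmt', 'jpeg')
    if fmt not in ALLOWED_FORMATS:
        raise BadRequest('unsupported format')
    return {
        'width': _bounded_int(params.get('w'), 1, MAX_DIMENSION, 'w'),
        'height': _bounded_int(params.get('h'), 1, MAX_DIMENSION, 'h'),
        'quality': _bounded_int(params.get('q', '80'), 1, 100, 'q'),
        'format': fmt,
    }


def sign_path(path):
    """HMAC-SHA256 of the URL path, keyed with the server-side secret."""
    return hmac.new(SECRET_KEY, path.encode('utf-8'), hashlib.sha256).hexdigest()


def verify_signature(path, signature):
    # Constant-time comparison; comparing bytes avoids errors on non-ASCII input.
    expected = sign_path(path).encode('ascii')
    return hmac.compare_digest(expected, signature.encode('utf-8'))


params = parse_resize_params({'w': '400', 'h': '300', 'fmt': 'webp'})
signature = sign_path('/images/abc/400x300.webp')
```

Because the parsed values are integers and a format drawn from a fixed set, nothing from the URL reaches the image library as free-form text, and a signature computed for one path does not validate a different set of dimensions.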