What is an Image Processing Service?
An image processing service handles on-demand or batch transformations of images: resizing, cropping, compression, format conversion, filters, thumbnail generation, and metadata extraction. Photo-sharing and e-commerce platforms depend on one; Pinterest, Instagram, Shopify, and Cloudinary all run image processing at scale. The key design considerations are whether to process on upload or on demand, the caching strategy, and the tradeoff between storage cost and compute cost.
Two Architectures: Pre-Processing vs. On-Demand
Pre-processing: when an image is uploaded, immediately generate all required variants (thumbnail, medium, large, WebP). Store all variants in object storage. Simple and fast to serve, but wastes storage for variants that are never requested.
On-demand processing: store only the original. Transform on first request, cache the result. Serves exactly the transformations requested and is storage-efficient, but the first request for each variant pays the processing latency.
Hybrid (recommended): pre-generate common variants at upload time (thumbnail, standard sizes). On-demand for rare or custom transformations. Cache all processed results in CDN.
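A minimal sketch of the hybrid split, assuming a small preset table. The names `PREGENERATED_VARIANTS` and `serving_path` and the specific sizes are hypothetical, chosen only for illustration:

```python
# Hypothetical preset table: variant name -> transformation parameters.
# An upload worker would render each of these as soon as an image arrives.
PREGENERATED_VARIANTS = {
    "thumbnail": {"w": 150, "h": 150, "c": "fill"},
    "medium": {"w": 800},
    "large": {"w": 1600},
}


def variants_to_pregenerate():
    """Return the parameter sets to render eagerly at upload time."""
    return list(PREGENERATED_VARIANTS.values())


def serving_path(params):
    """Classify a request: already stored, or transformed on first request."""
    if params in PREGENERATED_VARIANTS.values():
        return "stored"
    return "on_demand"
```

Either way the response ends up in the CDN; the preset table only decides which variants skip the first-request processing cost.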
URL-Based Transformation API
Encode transformation parameters in the URL (Cloudinary-style):
# Original:
https://images.example.com/original/product-123.jpg

# Resize to 400×300, convert to WebP:
https://images.example.com/w_400,h_300,f_webp/product-123.jpg

# Crop to square, resize to 200×200, quality 80:
https://images.example.com/c_fill,w_200,h_200,q_80/product-123.jpg

# Parameters:
# w = width, h = height
# c = crop mode: fill (crop to fit), fit (letterbox), pad
# f = format: jpg, webp, avif, png
# q = quality: 1-100
# g = gravity for crop: face, center, north, auto
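One way to parse such URLs is sketched below. It assumes the `/<params>/<key>` layout from the examples, treats `original` as "no transformation", and rejects unknown keys. The function name `parse_transform_url` and the validation rules are illustrative assumptions, not part of any particular service's API:

```python
from urllib.parse import urlparse

ALLOWED_KEYS = {"w", "h", "c", "f", "q", "g"}
INT_KEYS = {"w", "h", "q"}  # numeric parameters are cast to int


def parse_transform_url(url):
    """Split a URL into (params, original_key).

    Assumes the path looks like /<comma-separated params>/<key>.
    """
    path = urlparse(url).path.lstrip("/")
    segment, _, key = path.partition("/")
    if not key:
        raise ValueError("expected /<params>/<key>")
    if segment == "original":
        return {}, key

    params = {}
    for token in segment.split(","):
        name, sep, value = token.partition("_")
        if not sep or not value or name not in ALLOWED_KEYS:
            raise ValueError(f"bad transformation token: {token!r}")
        # int() raises ValueError on non-numeric input such as w_abc
        params[name] = int(value) if name in INT_KEYS else value
    return params, key
```

Rejecting unknown or malformed tokens early matters because every distinct URL is a distinct cache entry; a production service would likely also cap `w` and `h` to bound compute per request.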
Request Processing Flow
1. Request arrives at CDN (CloudFront)
2. CDN cache HIT → return cached image (most requests)
3. CDN cache MISS → forward to Image Processing Service
4. Service parses URL: extract original key + transformation params
5. Check processed image cache (S3 or Redis): cache_key = SHA256(original_key + canonical_params)
6. If cached: return from S3/Redis
7. If not cached:
   a. Fetch original from S3
   b. Apply transformations with libvips or Pillow/sharp
   c. Store result in S3 (processed bucket) under cache_key
   d. Return response with Cache-Control: public, max-age=31536000
8. CDN caches response for subsequent requests
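Steps 5 to 7 can be sketched as follows. Plain dicts stand in for the original and processed buckets, `transform` is any callable that does the actual image work, and the `|` separator and sorted-parameter canonical form are assumptions made for this example:

```python
import hashlib


def canonical_params(params):
    """Serialize params in sorted key order so w_400,h_300 and
    h_300,w_400 map to the same cache entry."""
    return ",".join(f"{k}_{params[k]}" for k in sorted(params))


def cache_key(original_key, params):
    """SHA-256 over the original key plus the canonical parameter string."""
    raw = f"{original_key}|{canonical_params(params)}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def handle_cdn_miss(original_key, params, originals, processed, transform):
    """Check the processed store; on a miss, transform and store the result.

    `originals` and `processed` are dict-like stand-ins for the two buckets.
    """
    key = cache_key(original_key, params)
    body = processed.get(key)
    if body is None:
        body = transform(originals[original_key], params)
        processed[key] = body
    headers = {"Cache-Control": "public, max-age=31536000"}
    return body, headers
```

Canonicalizing before hashing is what keeps equivalent URLs from producing duplicate processed objects.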
Processing with libvips
Use libvips (via pyvips in Python or sharp in Node.js). It is commonly benchmarked at roughly 4-8× faster than ImageMagick and about 2× faster than Pillow for typical resize workloads, because it streams images through a demand-driven pipeline instead of loading the full decoded image into RAM:
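import pyvips

def process_image(source_bytes, params):
    # libvips detects the input format from the buffer contents
    img = pyvips.Image.new_from_buffer(source_bytes, '')

    # Resize (and crop for c=fill). URL params arrive as strings, so cast.
    if 'w' in params or 'h' in params:
        width = int(params.get('w', img.width))
        height = int(params.get('h', img.height))
        options = {'height': height, 'size': pyvips.Size.DOWN}
        if params.get('c') == 'fill':
            # Scale to cover the box, then crop the overflow around
            # the most salient region (libvips "attention" strategy)
            options['crop'] = 'attention'
        img = img.thumbnail_image(width, **options)

    # Format and quality
    fmt = params.get('f', 'jpg')
    quality = int(params.get('q', 85))
    output = img.write_to_buffer(
        f'.{fmt}',
        Q=quality,
        strip=True  # remove EXIF metadata
    )
    return output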
Format Optimization
Serve modern formats when the browser supports them:
# Check Accept header
accept = request.headers.get('Accept', '')
if 'image/avif' in accept:
    fmt = 'avif'  # ~50% smaller than JPEG
elif 'image/webp' in accept:
    fmt = 'webp'  # ~30% smaller than JPEG
else:
    fmt = 'jpg'   # fallback
# Or use URL param: f_auto to let service decide
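AVIF typically reduces file size by about 50% versus JPEG at comparable visual quality; WebP by about 30%. Browser support: AVIF is supported by Chrome, Firefox, and Safari 16+. Use f_auto (or content negotiation) to serve the best format per browser. When the output format depends on the Accept header, send Vary: Accept (or include the negotiated format in the CDN cache key) so that a cached AVIF is never served to a client that cannot decode it.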
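A slightly more careful version of the negotiation can be sketched as a standalone function. It compares whole media types rather than substrings and lets an explicit format override the header. The name `negotiate_format` and the convention that `"auto"` represents f_auto are assumptions for this example, and it ignores q-values in the Accept header:

```python
# Preference order: try the smallest format first.
PREFERRED = (("image/avif", "avif"), ("image/webp", "webp"))


def negotiate_format(accept_header, requested="auto"):
    """Pick an output format.

    An explicit format (e.g. from f_webp) wins; "auto" falls back to
    the Accept header, and finally to JPEG.
    """
    if requested != "auto":
        return requested
    # Strip parameters such as ;q=0.8 and compare whole media types
    accepted = {
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    }
    for mime, fmt in PREFERRED:
        if mime in accepted:
            return fmt
    return "jpg"
```

The resolved format, not the literal `auto`, is what should feed into the processed-image cache key, so that each format is stored as its own object.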
Key Design Decisions
- CDN as the primary cache: with long-lived cache headers, the vast majority of image requests (often 99%+) are served from the CDN edge with no origin compute
- URL-based transformation API: stateless, cacheable, no separate API call to get transformed images
- libvips for processing: streaming pipeline, low memory footprint, much faster than alternatives
- EXIF stripping: removes GPS coordinates and other sensitive metadata from processed outputs
- Immutable cache keys: SHA256(original key + canonical params) stays valid only because originals never change; a replacement image must be uploaded under a new key