Question 1

Why use pre-signed S3 URLs for video upload instead of uploading through your servers?

Accepted Answer

Pre-signed S3 URLs let clients upload directly to S3, so video bytes bypass your application servers entirely. Benefits: (1) Bandwidth: your servers never touch video bytes, so a 10 GB upload consumes none of your server bandwidth or connection time. (2) Scalability: S3 supports multipart upload natively, which lets the client upload parts in parallel and retry or resume individual failed parts. A single PUT is capped at 5 GB, so a 10 GB file must use multipart upload, with one pre-signed URL per part. (3) Cost: you do not pay for a server fleet sized to proxy large uploads. (4) Simplicity: your API handles only metadata (title, description), which is a lightweight request. Workflow: the client requests a pre-signed URL from your API (valid for 1 hour), uploads directly to S3, and S3 emits an ObjectCreated event that triggers your transcoding pipeline.
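For large files, the upload is split into parts and each part gets its own pre-signed URL. The sketch below only plans the part boundaries; it makes no AWS calls. The function name plan_parts and the 64 MiB default are illustrative assumptions, while the 5 MiB minimum part size and 10,000-part maximum are S3's documented multipart limits.

```python
import math

MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB   # S3 minimum for every part except the last
MAX_PARTS = 10_000        # S3 maximum number of parts per multipart upload


def plan_parts(file_size: int, preferred_part_size: int = 64 * MIB) -> list[tuple[int, int, int]]:
    """Return (part_number, start_offset, length) for each part of an upload.

    Hypothetical helper: the API would issue one pre-signed URL per tuple.
    """
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    part_size = max(preferred_part_size, MIN_PART_SIZE)
    # Grow the part size if the file would otherwise need more than MAX_PARTS parts.
    part_size = max(part_size, math.ceil(file_size / MAX_PARTS))

    parts = []
    offset = 0
    number = 1  # S3 part numbers start at 1
    while offset < file_size:
        length = min(part_size, file_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts
```

With the default part size, a 10 GiB file is planned as 160 parts of 64 MiB, which the client can upload in parallel and retry individually.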

Question 2

How does HLS adaptive bitrate streaming work?

Accepted Answer

HLS (HTTP Live Streaming) works as follows: each video is transcoded into multiple quality levels (for example, 240p to 1080p). Each level is split into short segments, typically 6-10 second .ts chunks. A master playlist (.m3u8) lists all available streams with their bandwidth requirements, and each stream has its own media playlist listing its chunk URLs. The player downloads the master playlist, measures available bandwidth, selects the highest-quality stream that fits, then prefetches 2-3 chunks ahead. When bandwidth changes, the player switches streams at the next chunk boundary, seamlessly and within the same playback session; this works because every stream is cut at the same timestamps. For on-demand video, chunks are immutable and can be cached indefinitely at CDN edge nodes.
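The playlist structure and the player's selection step can be sketched as follows. This is a minimal illustration, not a real player: the bitrate ladder, the Variant type, the paths, and the 0.8 safety factor are all assumptions chosen for the example. Only the #EXTM3U and #EXT-X-STREAM-INF tags come from the HLS format itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str        # hypothetical rendition label, e.g. "720p"
    bandwidth: int   # bits per second advertised in the master playlist
    resolution: str  # "WIDTHxHEIGHT"
    uri: str         # media playlist location (illustrative path)


# Assumed bitrate ladder; real ladders depend on the content and encoder.
LADDER = [
    Variant("240p", 400_000, "426x240", "240p/index.m3u8"),
    Variant("480p", 1_400_000, "854x480", "480p/index.m3u8"),
    Variant("720p", 2_800_000, "1280x720", "720p/index.m3u8"),
    Variant("1080p", 5_000_000, "1920x1080", "1080p/index.m3u8"),
]


def build_master_playlist(variants):
    """Render a master playlist listing every stream and its bandwidth."""
    lines = ["#EXTM3U"]
    for v in variants:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={v.bandwidth},RESOLUTION={v.resolution}"
        )
        lines.append(v.uri)
    return "\n".join(lines) + "\n"


def select_variant(variants, measured_bps, safety=0.8):
    """Pick the highest-bandwidth stream that fits within a safety margin."""
    budget = measured_bps * safety
    fitting = [v for v in variants if v.bandwidth <= budget]
    if not fitting:
        # Nothing fits: fall back to the lowest-bandwidth stream.
        return min(variants, key=lambda v: v.bandwidth)
    return max(fitting, key=lambda v: v.bandwidth)
```

A player would call select_variant again after each chunk download with a fresh bandwidth estimate and switch media playlists at the next chunk boundary.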

Question 3

How do you scale video transcoding to handle 500 hours of uploads per minute?

Accepted Answer

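Use a job queue (SQS) plus an auto-scaling worker pool. Each upload creates a transcoding job in SQS. A fleet of CPU-optimized EC2 instances (c5.4xlarge, or GPU-enabled instances for hardware acceleration) polls the queue. Each worker transcodes one video at a time using FFmpeg, producing all resolution variants in parallel threads. A CloudWatch alarm on SQS queue depth triggers Auto Scaling group (ASG) scale-out. Cost optimization: use Spot instances. Workers are stateless, so when an instance is interrupted its job message becomes visible again after the SQS visibility timeout and another worker restarts it. Capacity: 500 hours of uploads per minute is 30,000 hours of video per hour, that is, 30,000x real-time. If a c5.4xlarge transcodes at ~10x real-time, 100 workers deliver only 1,000x real-time throughput, so sustaining the full load takes about 3,000 workers, before any headroom for bursts.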
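The fleet sizing is simple arithmetic and worth checking explicitly. The sketch below takes the ~10x real-time figure from the answer as a given; the function names and the headroom parameter are illustrative assumptions.

```python
import math


def fleet_speedup(workers: int, speedup_per_worker: float) -> float:
    """Aggregate throughput of the fleet, as a multiple of real-time."""
    return workers * speedup_per_worker


def workers_needed(upload_hours_per_minute: float,
                   speedup_per_worker: float,
                   headroom: float = 1.0) -> int:
    """Workers required to keep up with a sustained upload rate.

    upload_hours_per_minute * 60 is the hours of video arriving per hour,
    which equals the required multiple of real-time throughput.
    """
    required_speedup = upload_hours_per_minute * 60 * headroom
    return math.ceil(required_speedup / speedup_per_worker)
```

For 500 hours per minute at 10x per worker, workers_needed(500, 10) returns 3000, while fleet_speedup(100, 10) is only 1000x real-time.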

Question 4

How do you serve video chunks efficiently at scale?

Accepted Answer

Store all HLS chunks in S3 under content-addressed URLs (the chunk URL includes a content hash, so the object behind a URL never changes and can be cached indefinitely). Serve them through the CloudFront CDN. For popular videos, the CDN hit rate is 95%+, so most requests never reach S3. Add Origin Shield (a regional CloudFront caching layer between edge locations and S3) to further reduce S3 requests on cache misses. For the first few minutes after upload, when the video is not yet cached, the first request at each edge location is a miss that goes back to the origin; after that, a single CloudFront distribution serves the chunk from edge locations close to viewers, keeping latency low globally. Video segment TTL: max-age=31536000 (1 year), since chunks are immutable.
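A content-addressed key can be derived by hashing the chunk bytes. The sketch below is one possible layout: the videos/{video_id}/{rendition}/ prefix, the function names, and the choice of SHA-256 are assumptions, not something the answer prescribes. The max-age value is the one stated above.

```python
import hashlib

# One-year TTL from the answer; safe only because a key's content never changes.
CACHE_CONTROL = "max-age=31536000"


def chunk_key(video_id: str, rendition: str, data: bytes, ext: str = "ts") -> str:
    """Build an S3 key whose name is derived from the chunk's content."""
    digest = hashlib.sha256(data).hexdigest()
    return f"videos/{video_id}/{rendition}/{digest}.{ext}"


def upload_headers() -> dict:
    """Headers to store with each .ts chunk so the CDN caches it long-term."""
    return {
        "Cache-Control": CACHE_CONTROL,
        "Content-Type": "video/mp2t",  # MIME type for MPEG-2 transport stream segments
    }
```

Because the key changes whenever the bytes change, re-encoding a video produces new URLs rather than overwriting cached ones, so no CDN invalidation is needed; the media playlist is what points players at the new chunk names.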

Question 5

How do you generate thumbnails automatically during video processing?

Accepted Answer

During transcoding, extract a frame at approximately 10% of the total video duration using FFmpeg: ffmpeg -ss {duration*0.1} -i {input} -vframes 1 -q:v 2 {output.jpg}. Placing -ss before -i makes FFmpeg seek in the input instead of decoding from the start, which keeps extraction fast on long videos. Generate multiple thumbnail candidates (at 10%, 25%, and 50% of duration) and store them all in S3. For better thumbnails, run an ML-based quality scorer to select the most visually appealing frame. Store the selected thumbnail URL on the Video record. For user-uploaded custom thumbnails, validate that the image's aspect ratio matches the video's, resize to the standard sizes (e.g. 1280x720), and store them alongside the auto-generated ones.
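The candidate extraction can be scripted by building one FFmpeg command per timestamp. This sketch only constructs the argument lists, using the same flags as the command above; the function name, the output naming scheme, and the assumption that the duration is already known (for example from ffprobe) are illustrative.

```python
def thumbnail_commands(input_path, duration_s, output_prefix,
                       fractions=(0.10, 0.25, 0.50)):
    """Return one ffmpeg argument list per thumbnail candidate."""
    commands = []
    for fraction in fractions:
        timestamp = duration_s * fraction
        # Hypothetical naming scheme: <prefix>_<percent>.jpg
        output = f"{output_prefix}_{round(fraction * 100):02d}.jpg"
        commands.append([
            "ffmpeg",
            "-ss", f"{timestamp:.3f}",  # seek before -i for fast input seeking
            "-i", input_path,
            "-vframes", "1",            # write a single frame
            "-q:v", "2",                # high JPEG quality (lower is better)
            output,
        ])
    return commands
```

Each list can be passed to subprocess.run(cmd, check=True) on a worker that has FFmpeg installed; passing arguments as a list avoids shell quoting problems with unusual file names.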

System Design: Video Processing Platform — Upload, Transcoding, Storage, and Streaming (2025)

Requirements and Scale

Upload Flow

Transcoding Pipeline

Adaptive Bitrate Streaming (HLS/DASH)

Scalability: Transcoding Farm