Low Level Design: Media Processing Pipeline

A media processing pipeline ingests raw media (video, image, audio), transforms it into multiple output formats optimized for different devices and bandwidths, and stores results for delivery. The pipeline must handle large file uploads, long-running transcoding jobs, format variety, and efficient distribution of processed assets via CDN.

Upload Ingestion

Large file uploads bypass the application server via pre-signed URLs: the client requests a pre-signed S3 URL from the API and uploads the file directly to S3, so the file never touches the application tier. On upload completion, S3 emits an event notification (S3 Event → SQS or SNS) that enqueues a processing job. This keeps the application server stateless and avoids memory pressure from large files.
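A minimal sketch of the consumer side of this flow: parsing the S3 event notification delivered via SQS into processing-job records. The field names follow the standard S3 event JSON shape; the `parse_s3_event` function and the job dict layout are illustrative, not part of the original design.

```python
import json

def parse_s3_event(message_body: str) -> list[dict]:
    """Extract (bucket, key, size) from an S3 event notification payload."""
    event = json.loads(message_body)
    jobs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        jobs.append({
            "bucket": s3["bucket"]["name"],
            "key": s3["object"]["key"],
            "size": s3["object"].get("size", 0),
        })
    return jobs

# Example S3 event payload (abbreviated to the fields used above).
sample = json.dumps({"Records": [{"s3": {
    "bucket": {"name": "uploads"},
    "object": {"key": "raw/video123.mp4", "size": 1048576},
}}]})
```

One S3 event can carry multiple `Records`, so the parser returns a list; each record becomes one enqueued job.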

Job Queue and Workers

The processing job queue (SQS, Kafka, or a purpose-built job scheduler) holds pending transcoding tasks. Worker instances pull jobs from the queue. Workers are compute-intensive and run on high-CPU or GPU instances. Auto-scaling adds workers during peak upload periods and removes them during low traffic. Each job specifies: input S3 path, output formats (renditions), output S3 prefix, notification callback URL.
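The job fields listed above can be captured in a message schema. This is a sketch under the assumption that jobs are serialized as JSON onto the queue; field names and default renditions are illustrative.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class TranscodeJob:
    """Schema for one queued transcoding task (field names are illustrative)."""
    input_path: str                # S3 path of the uploaded source
    output_prefix: str             # S3 prefix where renditions land
    renditions: list = field(
        default_factory=lambda: ["1080p", "720p", "480p", "360p"])
    callback_url: str = ""         # webhook invoked on completion

    def to_message(self) -> str:
        """Serialize for the queue (SQS/Kafka payloads are opaque bytes)."""
        return json.dumps(asdict(self))

job = TranscodeJob(
    input_path="s3://uploads/raw/video123.mp4",
    output_prefix="s3://processed/video123/",
    callback_url="https://api.example.com/jobs/done",  # hypothetical endpoint
)
```

A worker deserializes the message, fetches `input_path`, produces each rendition under `output_prefix`, and finally POSTs to `callback_url`.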

Video Transcoding

A source video is transcoded into multiple renditions: 1080p, 720p, 480p, 360p. Each rendition is encoded in H.264 or H.265 at an appropriate bitrate. Adaptive bitrate streaming (ABR) formats (HLS, DASH) split each rendition into small segments (2-10 seconds) and generate a manifest file. The video player selects the appropriate rendition based on available bandwidth, switching dynamically to prevent buffering. FFmpeg is the dominant open-source transcoding tool.
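A worker might drive FFmpeg per rendition roughly as follows. The flags shown (`-vf scale`, `-c:v libx264`, `-hls_time`) are standard FFmpeg options for H.264 HLS output, but the bitrate ladder values are illustrative assumptions; real ladders are tuned per content type.

```python
# Bitrate ladder: rendition height -> target video bitrate (illustrative values).
LADDER = {1080: "5000k", 720: "2800k", 480: "1400k", 360: "800k"}

def hls_command(src: str, out_dir: str, height: int) -> list[str]:
    """Build an FFmpeg argv encoding one H.264 rendition as 6-second HLS segments."""
    return [
        "ffmpeg", "-i", src,
        "-vf", f"scale=-2:{height}",     # scale to target height, keep aspect ratio
        "-c:v", "libx264", "-b:v", LADDER[height],
        "-c:a", "aac",
        "-hls_time", "6",                # segment duration in seconds
        "-hls_playlist_type", "vod",
        f"{out_dir}/{height}p.m3u8",     # manifest; segments written alongside
    ]
```

Running this once per ladder entry yields the four rendition playlists; a master manifest referencing all of them is what the player actually loads to switch bitrates.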

Parallel Segment Processing

Transcoding a 2-hour video on a single worker takes hours. Speed up with parallel segment processing: split the source video into segments (e.g., 1-minute chunks), transcode each segment in parallel across multiple workers, then concatenate the encoded segments. This reduces wall-clock transcoding time from hours to minutes for long-form content. Track segment completion in a coordination store (e.g., Redis); trigger concatenation when all segments are done.
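The completion-tracking logic can be sketched as a counter that fires when the last segment finishes. This in-memory class is a stand-in for the coordination store; in production the decrement would be a Redis atomic DECR on a per-job key so concurrent workers cannot race.

```python
class SegmentTracker:
    """In-memory stand-in for the Redis coordination store: counts finished
    segments and reports when the final one completes."""

    def __init__(self, job_id: str, total_segments: int):
        self.job_id = job_id
        self.remaining = total_segments

    def mark_done(self, segment_index: int) -> bool:
        """Record one finished segment. Returns True only when this was the
        last outstanding segment, i.e. it is time to concatenate."""
        self.remaining -= 1
        return self.remaining == 0

tracker = SegmentTracker("video123", total_segments=3)
results = [tracker.mark_done(i) for i in range(3)]
```

Exactly one worker sees `True` and triggers the concatenation step, so no separate coordinator process is needed.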

Image Processing

Image pipelines resize to multiple dimensions (thumbnail, medium, large), convert formats (JPEG, WebP, AVIF), compress, and strip EXIF metadata (privacy). On-demand image processing (Imgix, Cloudflare Images, AWS Lambda@Edge) generates variants at request time from the original, avoiding pre-generating all possible sizes. Cache generated variants in CDN. Store only the original; derive variants dynamically.
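The resize step reduces to simple aspect-ratio arithmetic. A sketch, assuming named variants map to maximum widths (the width values here are illustrative):

```python
# Named variant widths (illustrative); height follows the source aspect ratio.
VARIANTS = {"thumbnail": 160, "medium": 800, "large": 1600}

def variant_size(src_w: int, src_h: int, variant: str) -> tuple[int, int]:
    """Target dimensions for a variant, preserving aspect ratio and
    never upscaling beyond the original width."""
    target_w = min(VARIANTS[variant], src_w)
    target_h = round(src_h * target_w / src_w)
    return target_w, target_h
```

An on-demand service applies this to the stored original at request time, then the CDN caches the generated variant so the computation happens once per size.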

Content Moderation

Run automated content moderation on uploaded media before making it publicly available. ML classifiers detect nudity, violence, hate symbols, and spam. Results from moderation are stored alongside the media record. Low-confidence cases are routed to a human review queue. Media remains in a private/pending state until moderation passes. Design the pipeline so moderation runs asynchronously and does not block the upload response.
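The routing decision reduces to thresholding classifier confidence. A sketch with illustrative thresholds; in practice thresholds are tuned per category and per classifier.

```python
def route_moderation(scores: dict[str, float],
                     block_threshold: float = 0.9,
                     review_threshold: float = 0.5) -> str:
    """Map per-category classifier scores to a moderation outcome.
    Threshold values are illustrative assumptions."""
    worst = max(scores.values(), default=0.0)
    if worst >= block_threshold:
        return "REJECTED"          # high-confidence violation: stays private
    if worst >= review_threshold:
        return "HUMAN_REVIEW"      # low confidence: route to review queue
    return "APPROVED"              # safe to publish
```

The media record stays in its pending state until this returns `APPROVED` (or a human reviewer overrides `HUMAN_REVIEW`), which is why moderation can run asynchronously without blocking the upload response.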

Storage and CDN Delivery

Processed media is stored in object storage (S3, GCS) with private ACLs. The CDN origin is configured to forward requests to object storage. Media URLs use content-addressable paths (hash of content or UUID) for cache-friendly immutable URLs: once published, the URL never changes and the CDN can cache indefinitely. Signed URLs with short expiry control access to private media (paid content, private user uploads).

Progress Tracking and Notifications

Track job progress per media asset: UPLOADED → QUEUED → PROCESSING → COMPLETE (or FAILED). Store status in a database with timestamp per transition. Expose a status polling endpoint for clients or push notifications via WebSocket or SSE. On completion, trigger a callback to the originating service (webhook) or publish a completion event to Kafka for downstream consumers (search indexing, notification service, CDN cache warming).
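The status lifecycle above is a small state machine, and validating transitions in code prevents a stale or duplicate worker update from corrupting the record. A sketch; the retry edge from FAILED back to QUEUED is an assumed design choice, not stated in the original.

```python
# Allowed transitions for a media asset's processing status.
TRANSITIONS = {
    "UPLOADED":   {"QUEUED"},
    "QUEUED":     {"PROCESSING"},
    "PROCESSING": {"COMPLETE", "FAILED"},
    "COMPLETE":   set(),           # terminal
    "FAILED":     {"QUEUED"},      # assumed: allow re-enqueue after failure
}

def advance(current: str, new: str) -> str:
    """Validate and apply a status transition; raise on an illegal jump."""
    if new not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current} -> {new}")
    return new
```

Each successful `advance` is persisted with a timestamp, which gives both the polling endpoint its answer and an audit trail of per-stage latency.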
