An e-learning platform must handle course catalogues, student enrollment, video delivery, quizzes, and certificates at scale. This low level design walks through every major component.
Course and Lesson Schema
The courses table is the root entity:
courses
-------
id BIGINT PK
title VARCHAR(255)
instructor_id BIGINT FK → users
category VARCHAR(100)
difficulty ENUM('beginner','intermediate','advanced')
price DECIMAL(10,2)
status ENUM('draft','published','archived')
created_at TIMESTAMP
Each course contains ordered lessons:
lessons
-------
id BIGINT PK
course_id BIGINT FK → courses
order INT
type ENUM('video','text','quiz')
content_url VARCHAR(512)
duration INT -- seconds
Enrollment and Progress Tracking
Enrollment records tie a user to a course:
enrollments
-----------
user_id BIGINT FK → users
course_id BIGINT FK → courses
enrolled_at TIMESTAMP
status ENUM('active','completed','refunded')
completion_pct DECIMAL(5,2)
PRIMARY KEY (user_id, course_id)
Progress is event-driven. When a student finishes a lesson, a lesson_completion event fires. A background worker reads these events and recalculates completion_pct as (completed_lessons / total_lessons) * 100, then writes back to the enrollments row. Completed lessons are stored in a lesson_progress table keyed by (user_id, lesson_id) with a completed_at timestamp.
Video Delivery with HLS
Raw video uploads go to object storage (S3-compatible). A transcoding pipeline converts them to HLS (HTTP Live Streaming) with multiple quality levels (360p, 720p, 1080p). The player requests the .m3u8 manifest and streams segments. Resume position is stored server-side in a video_positions table:
video_positions --------------- user_id BIGINT lesson_id BIGINT position INT -- seconds updated_at TIMESTAMP PRIMARY KEY (user_id, lesson_id)
The player POSTs the current position every 10 seconds via a lightweight API endpoint, so students resume exactly where they left off across devices.
Quiz Engine
Quiz lessons reference a quizzes table. Each quiz has many questions; each question has multiple options with one or more correct answers:
quizzes: id, lesson_id, pass_score (%)
questions: id, quiz_id, text, type ENUM('single','multi')
options: id, question_id, text, is_correct BOOLEAN
On submission, the API scores the attempt by comparing selected option IDs against is_correct flags. The attempt record stores score, passed boolean, and timestamps. Students may retake quizzes; all attempts are retained for instructor analytics.
Certificate Generation
When completion_pct reaches 100, a certificate generation job is queued. The job produces a PDF containing the student name, course title, completion date, a unique cert_id (UUID), and a QR code linking to a public verification endpoint (/verify/{cert_id}). The PDF is stored in object storage and the URL is written to a certificates table alongside cert_id, user_id, and course_id. Verification is a simple public GET that looks up cert_id and returns course and student name.
Recommendation Engine
On course completion, a recommendation job reads the completed course category and difficulty level, then queries for published courses in the same category at the next difficulty tier that the student has not yet enrolled in. Results are ranked by enrollment count (popularity). A simple collaborative filter can also be layered on top: find users with similar completion histories and recommend courses they finished that this student has not started.
Instructor Dashboard
Instructors see aggregated metrics per course: total enrollments, active vs. completed counts, revenue (enrollments × price minus platform fee), average quiz scores per question, and drop-off rate by lesson (percentage of enrolled students who completed each lesson). These are computed by read replicas or a reporting database refreshed periodically to avoid load on the primary.
Frequently Asked Questions
Q: How do you design adaptive video streaming for an e-learning platform?
A: Use HLS (HTTP Live Streaming) with multiple renditions encoded at different bitrates (e.g., 360p, 720p, 1080p). A media server (e.g., AWS Elemental MediaConvert) segments video into short chunks and generates a master .m3u8 playlist. The client player monitors available bandwidth and switches renditions mid-stream. Store segments on a CDN so chunks are served from edge nodes close to the learner, minimising buffering and reducing origin load.
Q: How do you build a quiz engine that randomises questions without repetition?
A: Store questions in a relational table with columns for course_id, difficulty, and topic tags. At quiz-start, select N questions using a seeded Fisher-Yates shuffle keyed on (user_id + attempt_number) so the same user never sees the same order twice. Persist the shuffled question IDs in a quiz_session table to allow resume. For answer evaluation, store correct option hashes server-side and never expose them in the API response.
Q: How do you track and persist course progress for millions of learners?
A: Model progress as (user_id, course_id, lesson_id, status, watched_seconds, last_updated). Write progress updates to a Redis sorted set (score = watched_seconds) for fast in-session reads, then flush to a relational store asynchronously via a background worker every 30 seconds. Calculate overall course completion percentage with a single aggregation query on the lessons table. Use database-level upsert (INSERT … ON CONFLICT UPDATE) to avoid duplicates from concurrent tab sessions.
Q: How do you generate certificates with QR code verification?
A: On course completion, generate a UUID certificate_id and store a record with (certificate_id, user_id, course_id, issued_at, hash). Compute hash = HMAC-SHA256(certificate_id + user_id + issued_at, server_secret). Embed this URL in a QR code: https://example.com/verify?id=<certificate_id>. The verification endpoint fetches the record, recomputes the hash, and returns a JSON response confirming authenticity. Render the certificate PDF server-side with a library like TCPDF or WeasyPrint, caching the PDF on object storage.
Q: How do you design CDN delivery for video content at scale?
A: Store master video files in object storage (S3 or GCS). A transcoding pipeline produces HLS/DASH segment files and uploads them to the CDN origin. Configure the CDN with long cache TTLs for immutable segment files (e.g., Cache-Control: max-age=31536000) and short TTLs for the mutable playlist (.m3u8). Use signed URLs with expiry times to prevent hotlinking. Invalidate cached playlists on re-encode. For live or near-live lectures, use an RTMP ingest endpoint that feeds directly into the CDN’s live streaming service.
See also: Databricks Interview Guide 2026: Spark Internals, Delta Lake, and Lakehouse Architecture