Low Level Design: Distributed Job Scheduler

A distributed job scheduler executes tasks at specified times or intervals across a cluster of workers. It must handle at-least-once execution, failure recovery, backpressure, and fair scheduling across tenants. The core components are a job store, a scheduler process that triggers jobs, and a worker pool that executes them.

Job Store

Jobs are persisted in a database: job_id, type, payload (JSON), schedule (cron expression or next_run_at timestamp), status (pending, running, completed, failed), max_retries, retry_count, created_at, updated_at, last_run_at. Index on (status, next_run_at) for efficient polling. Use a relational database for small-to-medium scale; for high throughput, partition by next_run_at or use a purpose-built job store (Sidekiq backed by Redis, Temporal, Apache Airflow).
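The schema above can be sketched as DDL. This is an illustrative SQLite rendering (column types and the index name are assumptions, not from the original):

```python
import sqlite3

# Hypothetical DDL for the job table described above; column names follow
# the text, types are illustrative (SQLite used for a self-contained demo).
DDL = """
CREATE TABLE jobs (
    job_id      TEXT PRIMARY KEY,
    type        TEXT NOT NULL,
    payload     TEXT NOT NULL,            -- JSON blob
    schedule    TEXT,                     -- cron expression, if recurring
    next_run_at TEXT NOT NULL,            -- UTC ISO-8601 timestamp
    status      TEXT NOT NULL DEFAULT 'pending'
                CHECK (status IN ('pending','running','completed','failed')),
    max_retries INTEGER NOT NULL DEFAULT 3,
    retry_count INTEGER NOT NULL DEFAULT 0,
    created_at  TEXT NOT NULL,
    updated_at  TEXT NOT NULL,
    last_run_at TEXT
);
-- Composite index backing the scheduler's polling query
-- (WHERE status = 'pending' AND next_run_at <= now).
CREATE INDEX idx_jobs_status_next_run ON jobs (status, next_run_at);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
```

The (status, next_run_at) column order matters: the poller filters on an exact status value first, then range-scans next_run_at within it.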

Leader-Based Scheduling

A scheduler process polls the job store for jobs due to run (next_run_at <= now AND status = pending). To avoid duplicate scheduling in a multi-instance deployment, use leader election: only the leader instance polls and schedules. Leader election can be implemented with a distributed lock (Redis SET NX with a TTL, etcd, or ZooKeeper). If the leader fails, a follower acquires the lock and takes over within one TTL period (typically 10-30 seconds).
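The failover behavior can be illustrated with a minimal sketch. The `TTLLock` class below is an in-memory stand-in for the Redis `SET key value NX EX ttl` semantics (in production this state lives in Redis/etcd/ZooKeeper, not in-process); instance names and the 15-second TTL are hypothetical:

```python
class TTLLock:
    """In-memory stand-in for a Redis SET NX EX lease; for illustration only."""

    def __init__(self) -> None:
        self._holder: str | None = None
        self._expires_at = 0.0

    def try_acquire(self, instance_id: str, ttl: float, now: float) -> bool:
        # Acquire only if the lock is unheld or its lease expired (NX + TTL).
        if self._holder is None or now >= self._expires_at:
            self._holder, self._expires_at = instance_id, now + ttl
            return True
        # The current holder may refresh its own lease before expiry.
        if self._holder == instance_id:
            self._expires_at = now + ttl
            return True
        return False

lock = TTLLock()
assert lock.try_acquire("scheduler-1", ttl=15, now=0)       # leader elected
assert not lock.try_acquire("scheduler-2", ttl=15, now=5)   # follower blocked
# Leader crashes and stops refreshing; follower takes over after one TTL.
assert lock.try_acquire("scheduler-2", ttl=15, now=16)
```

The follower's takeover delay is bounded by the TTL, which is why the TTL choice trades failover speed against the risk of a slow-but-alive leader losing its lease.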

Optimistic Locking for Job Acquisition

Workers claim jobs atomically using optimistic locking or database-level locking: UPDATE jobs SET status='running', worker_id=X, started_at=now WHERE job_id=Y AND status='pending'. The rowcount check confirms exclusive acquisition (rowcount=1 means this worker won the claim). Alternatively, use SELECT FOR UPDATE SKIP LOCKED (PostgreSQL) to efficiently claim the next unclaimed job without lock contention.
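The optimistic-locking claim can be demonstrated end to end (SQLite shown for self-containment; SKIP LOCKED is PostgreSQL-specific and not covered here, and job/worker names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (job_id TEXT PRIMARY KEY, status TEXT, worker_id TEXT)")
conn.execute("INSERT INTO jobs VALUES ('job-1', 'pending', NULL)")

def claim(conn: sqlite3.Connection, job_id: str, worker_id: str) -> bool:
    # The WHERE status='pending' predicate makes the claim atomic:
    # of two concurrent UPDATEs, only one can match the row and flip it.
    cur = conn.execute(
        "UPDATE jobs SET status='running', worker_id=? "
        "WHERE job_id=? AND status='pending'",
        (worker_id, job_id),
    )
    return cur.rowcount == 1  # rowcount=1 means this worker won the claim

assert claim(conn, "job-1", "worker-A")      # first claim wins
assert not claim(conn, "job-1", "worker-B")  # second claim sees rowcount=0
```

No explicit lock is taken; the conditional UPDATE itself is the synchronization point, which is why this pattern works on any database with atomic single-row updates.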

Heartbeat and Failure Detection

Running workers periodically update a heartbeat timestamp on the job row. A watchdog process (or the scheduler) monitors for jobs with status=running but heartbeat older than a threshold (e.g., 5 minutes). These are stuck jobs: the worker crashed without completing. Reset their status to pending for retry. Heartbeat interval should be much shorter than the stuck threshold to avoid false positives.
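A watchdog pass over the job table can be sketched as a single conditional UPDATE (the 5-minute threshold matches the text; table layout and timestamps are illustrative):

```python
import sqlite3

STUCK_THRESHOLD = 300  # seconds; per the text, much longer than the heartbeat interval

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (job_id TEXT, status TEXT, heartbeat_at REAL)")
conn.executemany("INSERT INTO jobs VALUES (?,?,?)", [
    ("healthy", "running", 990.0),  # heartbeat 10s ago: worker alive
    ("stuck",   "running", 100.0),  # heartbeat 900s ago: worker crashed
])

def reclaim_stuck(conn: sqlite3.Connection, now: float) -> int:
    # Watchdog pass: flip stale running jobs back to pending for retry.
    cur = conn.execute(
        "UPDATE jobs SET status='pending' "
        "WHERE status='running' AND heartbeat_at < ?",
        (now - STUCK_THRESHOLD,),
    )
    return cur.rowcount

assert reclaim_stuck(conn, now=1000.0) == 1  # only the stale job is reset
```

Because the reset job re-enters the pending pool, any worker can pick it up via the normal claim path; this is also where the at-least-once guarantee comes from.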

Cron Expression Parsing

Recurring jobs use cron expressions (e.g., 0 2 * * * for daily at 2am). After a job completes, compute the next scheduled time by evaluating the cron expression against the current time. Libraries like croniter (Python) or cron-parser (Node.js) handle daylight saving transitions, month-end edge cases, and complex expressions (0 */4 * * * for every 4 hours). Store next_run_at as a UTC timestamp.
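For the simple daily case (`0 2 * * *`), the next-run computation reduces to stdlib datetime arithmetic; the helper below is a hypothetical sketch of that one case, and real schedulers delegate to croniter/cron-parser for full expressions:

```python
from datetime import datetime, timedelta, timezone

def next_daily_run(now: datetime, hour: int, minute: int = 0) -> datetime:
    """Next occurrence of a fixed daily time in UTC (the `0 2 * * *` case)."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:          # today's slot already passed: roll to tomorrow
        candidate += timedelta(days=1)
    return candidate

now = datetime(2026, 3, 1, 14, 30, tzinfo=timezone.utc)
# 2am today is already past, so the next run lands on March 2.
assert next_daily_run(now, 2) == datetime(2026, 3, 2, 2, 0, tzinfo=timezone.utc)
```

Working purely in UTC, as the text recommends, sidesteps the daylight-saving ambiguities that the cron libraries otherwise have to resolve.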

Priority Queues

Jobs are categorized into priority tiers: critical (payment processing), high (email sending), normal (report generation), low (cleanup jobs). Workers pull from higher-priority queues first. In Redis-backed queues (Sidekiq), each priority is a separate list. In database-backed schedulers, add a priority column and ORDER BY priority DESC, next_run_at ASC. Starvation prevention: promote low-priority jobs to higher priority if they wait beyond a threshold.
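The aging-based starvation prevention can be sketched with a heap ordered by effective priority. The numeric tier mapping and the promotion interval below are assumptions, not from the original:

```python
import heapq

# Assumed mapping of the text's tiers to numbers: higher = more urgent.
PRIORITY = {"critical": 3, "high": 2, "normal": 1, "low": 0}
AGING_INTERVAL = 600  # illustrative: one-tier promotion per 600s waited

def effective_priority(tier: str, waited: float) -> int:
    # Starvation prevention: waiting jobs climb one tier per AGING_INTERVAL.
    return PRIORITY[tier] + int(waited // AGING_INTERVAL)

def order_jobs(jobs: list[tuple[str, str, float]]) -> list[str]:
    """jobs: (job_id, tier, waited_seconds). Highest effective priority
    first; ties broken by longest wait. heapq is a min-heap, so negate."""
    heap = [(-effective_priority(t, w), -w, jid) for jid, t, w in jobs]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]

jobs = [("cleanup", "low", 1900), ("email", "high", 30), ("report", "normal", 5)]
# The long-waiting low-priority job has aged past the high tier.
assert order_jobs(jobs) == ["cleanup", "email", "report"]
```

The same effective-priority expression works in a database-backed scheduler as a computed ORDER BY term, so either queue implementation can reuse it.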

Idempotency and Exactly-Once Semantics

At-least-once execution is the standard guarantee: a job may run more than once due to worker crash and retry. Job handlers must be idempotent: running the same job twice produces the same result. Use an idempotency key (job_id) to deduplicate: check whether the job's effect has already been applied before executing. For effectively exactly-once semantics within the database, use the transactional outbox pattern: commit the job's result and its status update in the same database transaction, so either both are applied or neither is.
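Both ideas compose in one place: the sketch below (SQLite for self-containment; table and job names are illustrative) commits the job's effect and its status flip in a single transaction, with the primary key on the results table acting as the idempotency key:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jobs    (job_id TEXT PRIMARY KEY, status TEXT);
CREATE TABLE results (job_id TEXT PRIMARY KEY, output TEXT);
INSERT INTO jobs VALUES ('job-1', 'running');
""")

def complete_job(conn: sqlite3.Connection, job_id: str, output: str) -> bool:
    """Apply the job's effect and mark it completed in ONE transaction.
    The PRIMARY KEY on results.job_id doubles as the idempotency key:
    a duplicate run fails the insert and the transaction rolls back."""
    try:
        with conn:  # sqlite3 context manager: commit on success, rollback on error
            conn.execute("INSERT INTO results VALUES (?, ?)", (job_id, output))
            conn.execute("UPDATE jobs SET status='completed' WHERE job_id=?",
                         (job_id,))
        return True
    except sqlite3.IntegrityError:
        return False  # effect already applied by an earlier run

assert complete_job(conn, "job-1", "ok")      # first run applies the effect
assert not complete_job(conn, "job-1", "ok")  # crash-and-retry is deduplicated
```

This gives exactly-once *effects* inside the database even though *execution* remains at-least-once; side effects outside the transaction (HTTP calls, emails) still need their own idempotency handling.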

Workflow Orchestration

Complex jobs with dependencies (DAGs) require workflow orchestration beyond simple scheduling. Tools like Apache Airflow, Temporal, and Prefect model dependencies as directed acyclic graphs: task B runs only after task A completes successfully. The orchestrator tracks DAG state, handles partial failures (retry only failed branches), and provides a DAG visualization and monitoring UI. Together, a job scheduler and a workflow engine cover the full spectrum from simple cron jobs to complex pipelines.
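The core dependency-ordering logic is topological sorting, which the Python standard library exposes directly; the pipeline below is a hypothetical example, not one from any of the tools named above:

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: extract -> {clean, enrich} -> report.
dag = {
    "clean":  {"extract"},
    "enrich": {"extract"},
    "report": {"clean", "enrich"},
}

ts = TopologicalSorter(dag)
ts.prepare()
order = []
while ts.is_active():
    ready = ts.get_ready()       # tasks whose dependencies have all completed
    order.extend(sorted(ready))  # a real engine dispatches these in parallel
    for task in ready:
        ts.done(task)            # mark success; a failure would halt this branch

assert order == ["extract", "clean", "enrich", "report"]
```

The get_ready/done loop is exactly where an orchestrator hooks in its retry and partial-failure policy: a task that fails is never marked done, so its downstream branch stays blocked while independent branches proceed.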
