Low Level Design: Priority Queue Service

What Is a Priority Queue Service?

A Priority Queue Service extends a standard job queue by processing tasks in priority order rather than strict FIFO. High-priority work — a payment confirmation, an SLA-critical alert, a VIP user request — jumps ahead of lower-priority background tasks. This requires careful design to prevent starvation, ensure fairness, and scale without turning into a bottleneck.

Data Model

Priority is typically an integer where lower numbers mean higher urgency (like Unix nice values), or an explicit ENUM tier.

CREATE TABLE priority_jobs (
  id            BIGINT        PRIMARY KEY AUTO_INCREMENT,
  queue_name    VARCHAR(128)  NOT NULL,
  priority      TINYINT       NOT NULL DEFAULT 50,   -- 0=critical, 99=low
  payload       JSON          NOT NULL,
  status        ENUM('pending', 'in_flight', 'done', 'failed') NOT NULL DEFAULT 'pending',
  attempts      INT           NOT NULL DEFAULT 0,
  max_attempts  INT           NOT NULL DEFAULT 3,
  run_at        DATETIME      NOT NULL DEFAULT CURRENT_TIMESTAMP,
  enqueued_at   DATETIME      NOT NULL DEFAULT CURRENT_TIMESTAMP,
  locked_until  DATETIME,
  locked_by     VARCHAR(128),
  INDEX idx_pq_dequeue (queue_name, status, priority, run_at)
);

The composite index on (queue_name, status, priority, run_at) drives efficient ordered dequeue scans.

Core Algorithm

Enqueue: Insert the job with the appropriate priority integer. Producers must not all use priority 0, or the system degenerates into a plain FIFO queue with extra overhead.

Dequeue (priority-ordered lease):

UPDATE priority_jobs
SET    status = 'in_flight',
       locked_by = :worker_id,
       locked_until = NOW() + INTERVAL 60 SECOND,
       attempts = attempts + 1
WHERE  queue_name = :queue
  AND  status = 'pending'
  AND  run_at <= NOW()
ORDER BY priority ASC, run_at ASC
LIMIT 1;

Workers always pull the lowest priority number first; ties are broken by run_at (FIFO within a tier). Note that UPDATE ... ORDER BY ... LIMIT is MySQL-specific and returns only a row count, so the worker follows up with a SELECT filtered on its own worker_id to fetch the claimed payload. On PostgreSQL, the idiomatic equivalent is SELECT ... FOR UPDATE SKIP LOCKED plus the UPDATE in one transaction.
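The lease can be sketched end to end in Python over SQLite (which lacks UPDATE ... ORDER BY ... LIMIT), as a select-then-claim loop. This is a minimal illustration, not a production client: table and column names follow the schema above, and timestamps are unix seconds for brevity.

```python
import sqlite3
import time

def dequeue(conn, queue, worker_id, lease_seconds=60):
    """Claim the highest-priority pending job; return its id or None.

    SQLite cannot do UPDATE ... ORDER BY ... LIMIT, so we pick a
    candidate and claim it with an optimistic UPDATE, retrying if
    another worker won the race for that row.
    """
    now = time.time()
    while True:
        row = conn.execute(
            "SELECT id FROM priority_jobs "
            "WHERE queue_name = ? AND status = 'pending' AND run_at <= ? "
            "ORDER BY priority ASC, run_at ASC LIMIT 1",
            (queue, now)).fetchone()
        if row is None:
            return None                       # queue is drained
        claimed = conn.execute(
            "UPDATE priority_jobs SET status = 'in_flight', locked_by = ?, "
            "locked_until = ?, attempts = attempts + 1 "
            "WHERE id = ? AND status = 'pending'",
            (worker_id, now + lease_seconds, row[0]))
        if claimed.rowcount == 1:             # we won the race for this row
            conn.commit()
            return row[0]
        # lost the race; loop and try the next candidate
```

The guard `AND status = 'pending'` in the UPDATE is what makes two workers racing for the same row safe: exactly one of them sees rowcount 1.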

Starvation Prevention

Pure priority ordering starves low-priority tasks when high-priority work never stops. Two common mitigations:

  • Aging: Periodically bump the effective priority of tasks that have waited beyond a threshold. Implement with a computed column or a background job that decrements priority for old pending rows.
  • Weighted fair queuing: Allocate worker slots proportionally — e.g., 60% of workers reserved for priority < 20, 30% for priority 20-60, 10% for the rest.
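The aging mitigation can be sketched as a periodic sweep over the same table (SQLite stand-in, unix-second timestamps; boost size and wait threshold are illustrative tuning knobs, not prescribed values):

```python
import sqlite3
import time

def age_pending_jobs(conn, queue, max_wait_seconds, boost=10, now=None):
    """Aging sweep (sketch): bump jobs that have waited too long.

    Lower numbers mean higher urgency, so aging DECREMENTS the
    priority value, clamped at 0 so an aged job tops out at the
    critical tier. Returns the number of jobs boosted.
    """
    now = time.time() if now is None else now
    cutoff = now - max_wait_seconds
    cur = conn.execute(
        "UPDATE priority_jobs "
        "SET priority = MAX(priority - ?, 0) "        # clamp at the top tier
        "WHERE queue_name = ? AND status = 'pending' AND enqueued_at <= ?",
        (boost, queue, cutoff))
    conn.commit()
    return cur.rowcount
```

Run it every minute or so from a scheduler; because it touches only old pending rows, each sweep is a small indexed update.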

Failure Handling

At-Least-Once Delivery

Same lease-timeout sweeper as a plain job queue: a background process resets in_flight rows whose locked_until has expired back to pending, preserving the original priority.
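A minimal version of that sweeper, in the same SQLite sketch style (unix-second timestamps):

```python
import sqlite3
import time

def requeue_expired_leases(conn, now=None):
    """Lease-timeout sweeper (sketch): return timed-out work to the queue.

    Rows whose lease expired go back to 'pending' with priority
    untouched, so a retried critical task is still critical.
    Returns the number of jobs requeued.
    """
    now = time.time() if now is None else now
    cur = conn.execute(
        "UPDATE priority_jobs "
        "SET status = 'pending', locked_by = NULL, locked_until = NULL "
        "WHERE status = 'in_flight' AND locked_until < ?",
        (now,))
    conn.commit()
    return cur.rowcount
```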

Idempotency

Include an idempotency_key in the payload. High-priority tasks are especially likely to be retried aggressively by impatient callers, so the handler must check for duplicate execution (e.g., query a results table keyed on idempotency_key before doing work).
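A sketch of that duplicate-execution guard, assuming a hypothetical job_results table with a UNIQUE idempotency_key column:

```python
import sqlite3

def run_once(conn, idempotency_key, handler):
    """Idempotent handler guard (sketch).

    A retried delivery finds the stored result and skips re-execution.
    Note the check-then-insert here is not atomic; production code
    relies on the unique constraint (or an upsert) to close the race.
    """
    row = conn.execute(
        "SELECT result FROM job_results WHERE idempotency_key = ?",
        (idempotency_key,)).fetchone()
    if row is not None:
        return row[0]            # duplicate delivery: reuse prior result
    result = handler()           # do the real work at most once
    conn.execute(
        "INSERT INTO job_results (idempotency_key, result) VALUES (?, ?)",
        (idempotency_key, result))
    conn.commit()
    return result
```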

Poison Messages

Failed tasks that exhaust max_attempts are moved to status = failed. Maintain a separate DLQ table or add a dlq_priority column so operators can re-enqueue at the original priority once the underlying bug is fixed.
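With the separate-table approach, the move can be a small sweep (sketch; priority_jobs_dlq is a hypothetical dead-letter table mirroring the relevant columns):

```python
import sqlite3

def sweep_to_dlq(conn):
    """Poison-message sweep (sketch): move failed jobs to the DLQ table.

    The original priority is copied along, so an operator can
    re-enqueue at the same tier once the underlying bug is fixed.
    Returns the number of jobs moved.
    """
    conn.execute(
        "INSERT INTO priority_jobs_dlq (job_id, queue_name, priority, payload) "
        "SELECT id, queue_name, priority, payload FROM priority_jobs "
        "WHERE status = 'failed'")
    cur = conn.execute("DELETE FROM priority_jobs WHERE status = 'failed'")
    conn.commit()
    return cur.rowcount
```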

Scalability Considerations

  • Separate physical queues per tier — instead of one table ordered by priority, use separate queues (or Kafka topics/SQS queues) per tier and dedicate worker pools to each. Simpler ordering, easier scaling.
  • In-memory heap for hot path — keep the top-N pending tasks in a Redis sorted set (score = priority * 1e12 + unix_timestamp) for sub-millisecond dequeue. Persist to SQL for durability and recovery.
  • Rate limiting per priority — cap how fast critical tasks can consume shared downstream resources (DB connections, external APIs) to avoid cascading failures.
  • Observability — track per-priority queue depth, wait-time p50/p99, and starvation events. Alert when low-priority wait time exceeds SLO.
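The sorted-set score encoding from the second bullet can be checked with a plain heap standing in for Redis (the job names are illustrative):

```python
import heapq

def pq_score(priority, enqueued_ts):
    """Sorted-set score encoding (sketch).

    Priorities occupy disjoint score bands because unix timestamps
    (~1.7e9) never reach 1e12, so popping the minimum score yields
    priority order first and FIFO within a tier.
    """
    return priority * 1_000_000_000_000 + enqueued_ts

# In-memory stand-in for Redis ZADD/ZPOPMIN: a heap of (score, job_id).
heap = []
t = 1_700_000_000
heapq.heappush(heap, (pq_score(50, t), "batch-export"))
heapq.heappush(heap, (pq_score(0, t + 5), "payment"))       # newest, but critical
heapq.heappush(heap, (pq_score(50, t - 9), "older-export"))
order = [heapq.heappop(heap)[1] for _ in range(3)]
```

In Redis the same encoding works with ZADD and ZPOPMIN; scores are doubles with exact integers up to 2^53, and 99 * 1e12 plus a timestamp stays well under that limit.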

Summary

A Priority Queue Service ensures critical work gets processed first without abandoning lower-priority tasks. The key design decisions are: how to represent priority (integer vs. enum vs. separate queues), how to prevent starvation (aging or weighted workers), and whether to use a database-backed or in-memory heap for the hot path. In interviews, articulate the tradeoff between simplicity (one ordered table) and performance (Redis sorted set with SQL durability) and explain how aging prevents starvation in a purely numeric scheme.


