Low Level Design: Shipment Tracking Service

Core Data Model

The tracking service maintains two primary tables: Shipment (one record per carrier shipment) and TrackingEvent (append-only timeline).

-- Shipments table
CREATE TABLE shipments (
    id                BIGSERIAL PRIMARY KEY,
    order_id          BIGINT NOT NULL,
    tracking_number   VARCHAR(64) NOT NULL,
    carrier           VARCHAR(10) NOT NULL,  -- fedex | ups | usps | dhl
    status            VARCHAR(30) NOT NULL DEFAULT 'created',
                      -- created | in_transit | out_for_delivery | delivered | exception
    estimated_delivery DATE,
    created_at        TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE UNIQUE INDEX idx_shipments_tracking ON shipments(carrier, tracking_number);

-- Tracking events (append-only)
CREATE TABLE tracking_events (
    id                BIGSERIAL PRIMARY KEY,
    shipment_id       BIGINT NOT NULL REFERENCES shipments(id),
    carrier_status_code VARCHAR(20),
    normalized_status VARCHAR(30) NOT NULL,
    description       TEXT,
    location          VARCHAR(255),
    occurred_at       TIMESTAMPTZ NOT NULL
);

CREATE INDEX idx_tracking_events_shipment ON tracking_events(shipment_id, occurred_at DESC);

Carrier Integration Strategy

Two ingestion modes, preferring webhooks:

-- Polling (fallback, every 4 hours per active shipment)
Scheduler → CarrierPoller → Carrier API → normalize → upsert TrackingEvent

-- Webhook push (preferred)
Carrier → POST /webhooks/carrier/{carrier_name} → validate HMAC
  → normalize → upsert TrackingEvent → publish StatusChangeEvent
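
The "validate HMAC" step in the webhook flow can be sketched as follows. This is a minimal example that assumes the carrier signs the raw request body with HMAC-SHA256 and sends the hex digest in a header; the actual algorithm, encoding, and header name vary by carrier, and the function name verify_webhook_signature is hypothetical.

```python
import hashlib
import hmac


def verify_webhook_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Return True when signature_hex is the HMAC-SHA256 hex digest of body.

    Assumption: the carrier signs the raw body with a shared secret and sends
    a hex digest. Check each carrier's webhook documentation for the real scheme.
    """
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest takes time independent of where the inputs differ,
    # so an attacker cannot recover the digest one character at a time.
    return hmac.compare_digest(expected.encode("utf-8"), signature_hex.encode("utf-8"))
```

The signature should be checked against the raw bytes received, before any JSON parsing, since re-serializing the payload can change its byte representation.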

Active shipments are those with status NOT IN ('delivered', 'exception'). The polling scheduler keeps them in a priority queue: shipments with an estimated delivery of today or tomorrow are polled every 30 minutes instead of the default 4 hours.
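
The scheduling rule above can be sketched as a small function that returns the delay before the next poll. The function and constant names are hypothetical; the intervals and the definition of "active" come from the text.

```python
from datetime import date, timedelta
from typing import Optional

TERMINAL_STATUSES = {"delivered", "exception"}  # not polled, per the text
DEFAULT_INTERVAL = timedelta(hours=4)
URGENT_INTERVAL = timedelta(minutes=30)


def poll_interval(status: str, estimated_delivery: Optional[date],
                  today: date) -> Optional[timedelta]:
    """Return the delay before the next poll, or None if the shipment is inactive."""
    if status in TERMINAL_STATUSES:
        return None
    if estimated_delivery is not None:
        days_out = (estimated_delivery - today).days
        # "Today or tomorrow" as stated in the text. Overdue shipments
        # (days_out < 0) fall through to the default interval here, which
        # is an assumption; the text does not say how they are treated.
        if 0 <= days_out <= 1:
            return URGENT_INTERVAL
    return DEFAULT_INTERVAL
```

A scheduler could push (now + poll_interval(...), shipment_id) onto a min-heap and pop whichever entry is due next.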

Carrier Status Normalization

Each carrier uses proprietary status codes. A normalization layer maps them to internal statuses:

-- Example normalization rules (stored in config / DB)
| Carrier | Raw Code | Normalized Status    |
|---------|----------|----------------------|
| FedEx   | IT       | in_transit           |
| FedEx   | OD       | out_for_delivery     |
| FedEx   | DL       | delivered            |
| FedEx   | DE       | exception            |
| UPS     | I        | in_transit           |
| UPS     | O        | out_for_delivery     |
| UPS     | D        | delivered            |
| USPS    | 03       | in_transit           |
| USPS    | 01       | delivered            |
| DHL     | TR       | in_transit           |
| DHL     | DF       | out_for_delivery     |

Unknown codes fall back to in_transit and are logged for review. Normalization rules are versioned and hot-reloadable without a deploy.
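
A minimal sketch of the lookup with its fallback, using a few rows from the table above. The in-memory dict stands in for the versioned config or DB store, and the case-insensitive matching is an assumption rather than something the design specifies.

```python
import logging

logger = logging.getLogger("status_normalizer")

# Subset of the example rules above, keyed by (carrier, raw code).
RULES = {
    ("fedex", "IT"): "in_transit",
    ("fedex", "OD"): "out_for_delivery",
    ("fedex", "DL"): "delivered",
    ("fedex", "DE"): "exception",
    ("ups", "D"): "delivered",
    ("usps", "03"): "in_transit",
    ("dhl", "DF"): "out_for_delivery",
}

FALLBACK_STATUS = "in_transit"


def normalize_status(carrier: str, raw_code: str) -> str:
    """Map a carrier-specific code to an internal status."""
    # Assumption: carriers and codes are matched case-insensitively.
    key = (carrier.strip().lower(), raw_code.strip().upper())
    status = RULES.get(key)
    if status is None:
        # Unknown codes are logged for review and treated as in_transit.
        logger.warning("unmapped status code: carrier=%s code=%s", carrier, raw_code)
        return FALLBACK_STATUS
    return status
```

Hot reload then amounts to atomically swapping the RULES mapping for a newly loaded version.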

Estimated Delivery Date Prediction

A lightweight ML model replaces static carrier EDD promises, which are often optimistic:

Features:
  - carrier (categorical)
  - service_level: ground / express / overnight (categorical)
  - origin_zip, dest_zip (embed as region cluster)
  - ship_date (date)
  - day_of_week (int 0-6)
  - current_normalized_status (categorical)

Target: actual delivery date (regression)

Model: gradient-boosted trees (XGBoost), retrained weekly on
       historical shipment data. Served via internal scoring API.

Fallback: carrier-provided EDD if model confidence < threshold.
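
The feature row and the fallback rule might look like the sketch below. Several details are assumptions, not part of the design: the 3-digit zip prefix stands in for a learned region cluster, ship_date is encoded as day of year, the threshold value is invented, and the confidence score is assumed to be supplied by the scoring API.

```python
from datetime import date
from typing import Dict, Optional, Union

CONFIDENCE_THRESHOLD = 0.7  # hypothetical; the design does not state a value


def build_features(carrier: str, service_level: str, origin_zip: str,
                   dest_zip: str, ship_date: date,
                   current_status: str) -> Dict[str, Union[str, int]]:
    """Assemble one feature row for the EDD model."""
    return {
        "carrier": carrier,
        "service_level": service_level,
        # Stand-in for the region cluster: the 3-digit zip prefix.
        "origin_region": origin_zip[:3],
        "dest_region": dest_zip[:3],
        "ship_day_of_year": ship_date.timetuple().tm_yday,
        "day_of_week": ship_date.weekday(),  # Monday = 0 ... Sunday = 6
        "current_normalized_status": current_status,
    }


def choose_edd(model_edd: Optional[date], confidence: float,
               carrier_edd: Optional[date]) -> Optional[date]:
    """Prefer the model's date when it is confident, else the carrier's."""
    if model_edd is not None and confidence >= CONFIDENCE_THRESHOLD:
        return model_edd
    # Assumption: with no carrier EDD available, a low-confidence model
    # estimate is still shown rather than nothing.
    return carrier_edd if carrier_edd is not None else model_edd
```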

Notification Triggers

After each status change event is processed:

  • out_for_delivery: push notification + email (subject: Your package is on its way today).
  • delivered: push notification + email with delivery confirmation.
  • exception: email alert + auto-create support ticket in helpdesk system.

Notifications are deduplicated by (shipment_id, normalized_status) so repeated webhook deliveries do not spam customers.
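
A minimal sketch of the trigger table and the deduplication key. The in-memory set is only for illustration; across multiple workers the same check would need a shared store, for example a unique constraint on (shipment_id, normalized_status). The class and channel names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Channels per status, following the trigger list above.
CHANNELS: Dict[str, List[str]] = {
    "out_for_delivery": ["push", "email"],
    "delivered": ["push", "email"],
    "exception": ["email", "support_ticket"],
}


class NotificationDeduper:
    """Tracks which (shipment_id, normalized_status) pairs were already notified."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[int, str]] = set()

    def channels_for(self, shipment_id: int, normalized_status: str) -> List[str]:
        """Return the channels to notify, or [] for repeats and silent statuses."""
        channels = CHANNELS.get(normalized_status, [])
        key = (shipment_id, normalized_status)
        if not channels or key in self._seen:
            return []
        self._seen.add(key)
        return list(channels)
```

A repeated webhook for the same status then yields an empty channel list, so nothing is sent twice.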

Exception Handling & Proactive Re-routing

When status = exception:

  1. Alert customer via email/push with exception description.
  2. Auto-create support ticket tagged delivery-exception.
  3. Query carrier API for nearest pickup locations using dest_zip.
  4. Surface pickup options on tracking page with one-click redirect-to-pickup action.

Public Tracking Page

No authentication required. The page is accessible via /track/{carrier}/{tracking_number} or /track?order={order_id}.

GET /track/{carrier}/{tracking_number}
Response:
{
  "tracking_number": "...",
  "carrier": "fedex",
  "status": "in_transit",
  "estimated_delivery": "2025-04-20",
  "events": [
    { "occurred_at": "...", "status": "in_transit",
      "description": "Departed FedEx facility", "location": "Memphis TN" },
    ...
  ]
}

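The tracking page response is cached at the CDN edge (TTL: 5 min), keyed on carrier and tracking number to match the unique index on shipments. When a webhook changes a shipment's status, the service purges that cache entry so the page does not serve stale data for the rest of the TTL.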
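
As an illustration, the cache key and response header could be built as below. This assumes the key combines carrier and tracking number and that the CDN honors the standard s-maxage directive; the key format, the case normalization, and the function names are hypothetical.

```python
from typing import Dict

CACHE_TTL_SECONDS = 300  # 5 minutes, per the text


def tracking_cache_key(carrier: str, tracking_number: str) -> str:
    """Build the key that is cached on read and purged on status change."""
    # Assumption: normalize case so equivalent URLs share one cache entry.
    return f"track:{carrier.strip().lower()}:{tracking_number.strip().upper()}"


def cache_headers() -> Dict[str, str]:
    """Headers asking shared caches (the CDN) to keep the response for the TTL."""
    return {"Cache-Control": f"public, s-maxage={CACHE_TTL_SECONDS}"}
```

The webhook handler would compute the same key after a status change and pass it to the CDN's purge API, whose shape depends on the provider.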

Scalability Considerations

  • Webhook ingestion is stateless and horizontally scalable behind a load balancer.
  • TrackingEvent writes are append-only — no update contention.
  • Polling workers are sharded by carrier to avoid thundering herd against a single carrier API.
  • EDD scoring API is cached per shipment for 1 hour to reduce ML inference cost.
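
Append-only writes still need to tolerate webhook retries. One way, sketched here with SQLite so the example runs standalone, is a unique key on the event plus an insert that ignores conflicts; in PostgreSQL the equivalent statement is INSERT ... ON CONFLICT DO NOTHING. The choice of (shipment_id, occurred_at, carrier_status_code) as the key is an assumption, since the schema above does not define one.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory table mirroring a subset of tracking_events."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE tracking_events (
            id                  INTEGER PRIMARY KEY,
            shipment_id         INTEGER NOT NULL,
            carrier_status_code TEXT,
            normalized_status   TEXT NOT NULL,
            occurred_at         TEXT NOT NULL,
            -- Hypothetical idempotency key. Rows with a NULL code are never
            -- treated as duplicates, because NULLs are distinct in UNIQUE.
            UNIQUE (shipment_id, occurred_at, carrier_status_code)
        )
        """
    )
    return conn


def record_event(conn: sqlite3.Connection, shipment_id: int, code: str,
                 status: str, occurred_at: str) -> bool:
    """Insert the event; return False if an identical event already exists."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO tracking_events "
        "(shipment_id, carrier_status_code, normalized_status, occurred_at) "
        "VALUES (?, ?, ?, ?)",
        (shipment_id, code, status, occurred_at),
    )
    conn.commit()
    return cur.rowcount == 1
```

The boolean result lets the caller publish a StatusChangeEvent only when a row was actually inserted.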

Key API Endpoints

POST /shipments                          -- create shipment record
GET  /shipments/{id}                     -- get shipment + events
GET  /track/{carrier}/{tracking_number}  -- public tracking (CDN-cached)
POST /webhooks/carrier/{carrier}         -- inbound carrier webhooks
GET  /shipments?order_id={order_id}      -- list shipments for order
