Low Level Design: Shipment Tracking Service

Core Data Model

The tracking service maintains two primary tables: Shipment (one record per carrier shipment) and TrackingEvent (append-only timeline).

-- Shipments table
CREATE TABLE shipments (
    id                BIGSERIAL PRIMARY KEY,
    order_id          BIGINT NOT NULL,
    tracking_number   VARCHAR(64) NOT NULL,
    carrier           VARCHAR(10) NOT NULL,  -- fedex | ups | usps | dhl
    status            VARCHAR(30) NOT NULL DEFAULT 'created',
                      -- created | in_transit | out_for_delivery | delivered | exception
    estimated_delivery DATE,
    created_at        TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE UNIQUE INDEX idx_shipments_tracking ON shipments(carrier, tracking_number);

-- Tracking events (append-only)
CREATE TABLE tracking_events (
    id                BIGSERIAL PRIMARY KEY,
    shipment_id       BIGINT NOT NULL REFERENCES shipments(id),
    carrier_status_code VARCHAR(20),
    normalized_status VARCHAR(30) NOT NULL,
    description       TEXT,
    location          VARCHAR(255),
    occurred_at       TIMESTAMPTZ NOT NULL
);

CREATE INDEX idx_tracking_events_shipment ON tracking_events(shipment_id, occurred_at DESC);
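
Because carriers can deliver the same event more than once (webhook retries, overlapping polls), the append-only table benefits from an idempotent insert. The sketch below is one possible approach, not the schema above: it uses an in-memory SQLite database so it runs anywhere, and the dedupe key (shipment_id, occurred_at, carrier_status_code) and the helper record_event are assumptions made for illustration. On PostgreSQL the equivalent statement is INSERT ... ON CONFLICT DO NOTHING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tracking_events (
        id                  INTEGER PRIMARY KEY,
        shipment_id         INTEGER NOT NULL,
        carrier_status_code TEXT,
        normalized_status   TEXT NOT NULL,
        occurred_at         TEXT NOT NULL
    )
    """
)
# Hypothetical dedupe key: the same scan reported twice maps to one row.
# Note that rows with a NULL carrier_status_code are never treated as equal.
conn.execute(
    """
    CREATE UNIQUE INDEX uq_tracking_events_dedupe
    ON tracking_events (shipment_id, occurred_at, carrier_status_code)
    """
)


def record_event(shipment_id: int, code: str, status: str, occurred_at: str) -> bool:
    """Insert the event once; return True only if a new row was written."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO tracking_events "
        "(shipment_id, carrier_status_code, normalized_status, occurred_at) "
        "VALUES (?, ?, ?, ?)",
        (shipment_id, code, status, occurred_at),
    )
    return cur.rowcount == 1
```

The boolean return value gives callers a simple way to decide whether a status change event should be published at all.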

Carrier Integration Strategy

Two ingestion modes, preferring webhooks:

-- Polling (fallback, every 4 hours per active shipment)
Scheduler → CarrierPoller → Carrier API → normalize → upsert TrackingEvent

-- Webhook push (preferred)
Carrier → POST /webhooks/carrier/{carrier_name} → validate HMAC
  → normalize → upsert TrackingEvent → publish StatusChangeEvent
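
The "validate HMAC" step can be sketched as follows. Each carrier defines its own signing scheme, so the choice of HMAC-SHA256, the hex encoding, and the function name verify_webhook_signature are assumptions for illustration rather than any carrier's documented contract.

```python
import hashlib
import hmac


def verify_webhook_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Return True if signature_hex is the HMAC-SHA256 of body under secret."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, so the comparison does not leak
    # how many leading characters of the signature matched.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


# Made-up values standing in for a carrier's shared secret and request body.
secret = b"demo-shared-secret"
body = b'{"tracking_number": "123", "status": "IT"}'
good_signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
```

The signature must be computed over the raw request bytes, before any JSON parsing, because re-serializing the payload can change whitespace or key order and invalidate the signature.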

Active shipments are those with status NOT IN ('delivered', 'exception'). The polling scheduler uses a priority queue: shipments with an estimated delivery date of today or tomorrow are polled every 30 minutes instead of the default 4 hours.
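
A minimal sketch of the polling schedule, assuming a min-heap keyed on the next poll time. The names next_poll_at, schedule, and poll_queue are hypothetical, and "today" is taken from whatever timezone the caller's clock uses.

```python
import heapq
from datetime import date, datetime, timedelta
from typing import List, Optional, Tuple

TERMINAL_STATUSES = {"delivered", "exception"}
DEFAULT_INTERVAL = timedelta(hours=4)
URGENT_INTERVAL = timedelta(minutes=30)


def next_poll_at(status: str, estimated_delivery: Optional[date],
                 now: datetime) -> Optional[datetime]:
    """Return when to poll next, or None if the shipment is no longer active."""
    if status in TERMINAL_STATUSES:
        return None
    today = now.date()
    # Urgent means the estimated delivery date is today or tomorrow.
    urgent = (estimated_delivery is not None
              and today <= estimated_delivery <= today + timedelta(days=1))
    return now + (URGENT_INTERVAL if urgent else DEFAULT_INTERVAL)


# Min-heap of (due time, shipment id); the earliest due shipment pops first.
poll_queue: List[Tuple[datetime, int]] = []


def schedule(shipment_id: int, status: str,
             estimated_delivery: Optional[date], now: datetime) -> None:
    """Queue the shipment for its next poll if it is still active."""
    due = next_poll_at(status, estimated_delivery, now)
    if due is not None:
        heapq.heappush(poll_queue, (due, shipment_id))
```

A worker would pop entries whose due time has passed, call the carrier API, and then call schedule again with the refreshed status.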

Carrier Status Normalization

Each carrier uses proprietary status codes. A normalization layer maps them to internal statuses:

-- Example normalization rules (stored in config / DB)
| Carrier | Raw Code | Normalized Status    |
|---------|----------|----------------------|
| FedEx   | IT       | in_transit           |
| FedEx   | OD       | out_for_delivery     |
| FedEx   | DL       | delivered            |
| FedEx   | DE       | exception            |
| UPS     | I        | in_transit           |
| UPS     | O        | out_for_delivery     |
| UPS     | D        | delivered            |
| USPS    | 03       | in_transit           |
| USPS    | 01       | delivered            |
| DHL     | TR       | in_transit           |
| DHL     | DF       | out_for_delivery     |

Unknown codes fall back to in_transit and are logged for review. Normalization rules are versioned and hot-reloadable without a deploy.
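
The lookup-with-fallback behavior might look like the sketch below. The rules are copied from the example table; the in-memory dict, the logger name, and the lower-casing of carrier names are assumptions, and versioning and hot reload are left out.

```python
import logging

logger = logging.getLogger("tracking.normalization")

# (carrier, raw code) -> normalized status, mirroring the example table.
RULES = {
    ("fedex", "IT"): "in_transit",
    ("fedex", "OD"): "out_for_delivery",
    ("fedex", "DL"): "delivered",
    ("fedex", "DE"): "exception",
    ("ups", "I"): "in_transit",
    ("ups", "O"): "out_for_delivery",
    ("ups", "D"): "delivered",
    ("usps", "03"): "in_transit",
    ("usps", "01"): "delivered",
    ("dhl", "TR"): "in_transit",
    ("dhl", "DF"): "out_for_delivery",
}
FALLBACK_STATUS = "in_transit"


def normalize_status(carrier: str, raw_code: str) -> str:
    """Map a carrier-specific code to an internal status."""
    status = RULES.get((carrier.lower(), raw_code))
    if status is None:
        # Unknown codes are logged for review and treated as in_transit.
        logger.warning("unmapped status code: carrier=%s code=%s", carrier, raw_code)
        return FALLBACK_STATUS
    return status
```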

Estimated Delivery Date Prediction

A lightweight ML model replaces static carrier EDD promises, which are often optimistic:

Features:
  - carrier (categorical)
  - service_level: ground / express / overnight (categorical)
  - origin_zip, dest_zip (embed as region cluster)
  - ship_date (date)
  - day_of_week (int 0-6)
  - current_normalized_status (categorical)

Target: actual delivery date (regression)

Model: gradient-boosted trees (XGBoost), retrained weekly on
       historical shipment data. Served via internal scoring API.

Fallback: carrier-provided EDD if model confidence < threshold.
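
The fallback rule can be sketched as a small selection function. How the scoring API expresses confidence is not specified above, so the EddPrediction type, the 0.0 to 1.0 confidence scale, and the default threshold of 0.7 are all assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class EddPrediction:
    delivery_date: date
    confidence: float  # hypothetical score in [0.0, 1.0]


def choose_edd(prediction: Optional[EddPrediction],
               carrier_edd: Optional[date],
               threshold: float = 0.7) -> Optional[date]:
    """Prefer the model's date when it is confident enough, else the carrier's."""
    if prediction is not None and prediction.confidence >= threshold:
        return prediction.delivery_date
    # Low confidence or no prediction: fall back to the carrier-provided EDD,
    # which may itself be missing (None).
    return carrier_edd
```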

Notification Triggers

After each status change event is processed:

  • out_for_delivery: push notification + email (subject: Your package is on its way today).
  • delivered: push notification + email with delivery confirmation.
  • exception: email alert + auto-create support ticket in helpdesk system.

Notifications are deduplicated by (shipment_id, normalized_status) so repeated webhook deliveries do not spam customers.
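
A minimal sketch of the deduplication rule, using an in-memory set as a stand-in for whatever shared store a real deployment would use (for example a unique constraint in the database). The class name NotificationDeduper and the channel labels are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Channels per status, taken from the trigger list above.
CHANNELS: Dict[str, List[str]] = {
    "out_for_delivery": ["push", "email"],
    "delivered": ["push", "email"],
    "exception": ["email", "support_ticket"],
}


class NotificationDeduper:
    def __init__(self) -> None:
        self._seen: Set[Tuple[int, str]] = set()

    def channels_for(self, shipment_id: int, normalized_status: str) -> List[str]:
        """Return the channels to notify, or [] if nothing should be sent."""
        channels = CHANNELS.get(normalized_status, [])
        if not channels:
            return []  # statuses such as in_transit trigger no notification
        key = (shipment_id, normalized_status)
        if key in self._seen:
            return []  # repeated delivery of the same status change
        self._seen.add(key)
        return list(channels)
```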

Exception Handling & Proactive Re-routing

When status = exception:

  1. Alert customer via email/push with exception description.
  2. Auto-create support ticket tagged delivery-exception.
  3. Query carrier API for nearest pickup locations using dest_zip.
  4. Surface pickup options on tracking page with one-click redirect-to-pickup action.
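
The four steps above can be sketched as a single handler that takes its collaborators as plain callables, which keeps the ordering visible and makes the flow easy to test. The ExceptionContext type and the three callable parameters are hypothetical stand-ins for the notification service, the helpdesk client, and the carrier API client.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ExceptionContext:
    shipment_id: int
    dest_zip: str
    description: str


def handle_exception(
    ctx: ExceptionContext,
    notify_customer: Callable[[int, str], None],
    create_ticket: Callable[[int, str], None],
    find_pickup_locations: Callable[[str], List[str]],
) -> List[str]:
    """Run the exception workflow and return pickup options for the tracking page."""
    notify_customer(ctx.shipment_id, ctx.description)      # step 1
    create_ticket(ctx.shipment_id, "delivery-exception")   # step 2
    return find_pickup_locations(ctx.dest_zip)             # steps 3 and 4
```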

Public Tracking Page

No authentication is required. The page is accessible via /track/{carrier}/{tracking_number} or /track?order={order_id}.

GET /track/{carrier}/{tracking_number}
Response:
{
  "tracking_number": "...",
  "carrier": "fedex",
  "status": "in_transit",
  "estimated_delivery": "2025-04-20",
  "events": [
    { "occurred_at": "...", "status": "in_transit",
      "description": "Departed FedEx facility", "location": "Memphis TN" },
    ...
  ]
}

The tracking page response is cached at the CDN edge (TTL: 5 min), keyed on carrier and tracking number. When a webhook changes a shipment's status, the service purges that cache entry.
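
The caching behavior (5-minute TTL plus invalidation on status change) can be illustrated with a small in-memory stand-in for the CDN. The TrackingCache class is hypothetical, a real CDN would be purged through its own API, and the key here includes the carrier because tracking numbers are only unique per carrier in the schema.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TrackingCache:
    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, dict]] = {}

    @staticmethod
    def key(carrier: str, tracking_number: str) -> str:
        return f"track:{carrier.lower()}:{tracking_number}"

    def get(self, carrier: str, tracking_number: str) -> Optional[dict]:
        k = self.key(carrier, tracking_number)
        entry = self._entries.get(k)
        if entry is None:
            return None
        expires_at, payload = entry
        if self._clock() >= expires_at:
            del self._entries[k]  # expired: drop it and report a miss
            return None
        return payload

    def put(self, carrier: str, tracking_number: str, payload: dict) -> None:
        k = self.key(carrier, tracking_number)
        self._entries[k] = (self._clock() + self._ttl, payload)

    def invalidate(self, carrier: str, tracking_number: str) -> None:
        """Called from the webhook path when a shipment's status changes."""
        self._entries.pop(self.key(carrier, tracking_number), None)
```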

Scalability Considerations

  • Webhook ingestion is stateless and horizontally scalable behind a load balancer.
  • TrackingEvent writes are append-only — no update contention.
  • Polling workers are sharded by carrier to avoid thundering herd against a single carrier API.
  • EDD scores are cached per shipment for 1 hour to reduce ML inference cost.

Key API Endpoints

POST /shipments                          -- create shipment record
GET  /shipments/{id}                     -- get shipment + events
GET  /track/{carrier}/{tracking_number}  -- public tracking (CDN-cached)
POST /webhooks/carrier/{carrier}         -- inbound carrier webhooks
GET  /shipments?order_id={order_id}      -- list shipments for order
