Shipment Tracking Service: Low Level Design
Core Data Model
The tracking service maintains two primary tables: Shipment (one record per carrier shipment) and TrackingEvent (append-only timeline).
-- Shipments table
CREATE TABLE shipments (
    id                 BIGSERIAL PRIMARY KEY,
    order_id           BIGINT NOT NULL,
    tracking_number    VARCHAR(64) NOT NULL,
    carrier            VARCHAR(10) NOT NULL,  -- fedex | ups | usps | dhl
    -- status: created | in_transit | out_for_delivery | delivered | exception
    status             VARCHAR(30) NOT NULL DEFAULT 'created',
    estimated_delivery DATE,
    created_at         TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE UNIQUE INDEX idx_shipments_tracking ON shipments(carrier, tracking_number);
-- Tracking events (append-only)
CREATE TABLE tracking_events (
    id                  BIGSERIAL PRIMARY KEY,
    shipment_id         BIGINT NOT NULL REFERENCES shipments(id),
    carrier_status_code VARCHAR(20),
    normalized_status   VARCHAR(30) NOT NULL,
    description         TEXT,
    location            VARCHAR(255),
    occurred_at         TIMESTAMPTZ NOT NULL
);
CREATE INDEX idx_tracking_events_shipment ON tracking_events(shipment_id, occurred_at DESC);
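Because carriers can redeliver the same event, writes to an append-only timeline generally need to be idempotent. The following is a minimal sketch of one way to do that, not the service's actual implementation. It uses an in-memory SQLite database as a stand-in for the production store, and the uniqueness key (shipment_id, carrier_status_code, occurred_at) is an assumption that does not appear in the schema above. The function name record_event is hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the production database in this sketch.
conn = sqlite3.connect(":memory:")

# The UNIQUE constraint is an assumed dedup key, not part of the original schema.
conn.execute(
    """
    CREATE TABLE tracking_events (
        id INTEGER PRIMARY KEY,
        shipment_id INTEGER NOT NULL,
        carrier_status_code TEXT,
        normalized_status TEXT NOT NULL,
        occurred_at TEXT NOT NULL,
        UNIQUE (shipment_id, carrier_status_code, occurred_at)
    )
    """
)


def record_event(shipment_id, code, status, occurred_at):
    """Append an event once; return True only if a new row was written."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO tracking_events "
        "(shipment_id, carrier_status_code, normalized_status, occurred_at) "
        "VALUES (?, ?, ?, ?)",
        (shipment_id, code, status, occurred_at),
    )
    conn.commit()
    # rowcount is 0 when the row already existed and the insert was ignored.
    return cur.rowcount == 1
```

A caller could use the boolean result to decide whether a status change event needs to be published, so a redelivered webhook does not trigger downstream work twice.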
Carrier Integration Strategy
Two ingestion modes, preferring webhooks:
-- Polling (fallback, every 4 hours per active shipment)
Scheduler → CarrierPoller → Carrier API → normalize → insert TrackingEvent (idempotent)
-- Webhook push (preferred)
Carrier → POST /webhooks/carrier/{carrier_name} → validate HMAC
→ normalize → insert TrackingEvent (idempotent) → publish StatusChangeEvent
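The "validate HMAC" step in the webhook flow can be sketched as follows. This is an illustration under stated assumptions: each carrier defines its own signing scheme, so the choice of HMAC-SHA256 over the raw request body with a hex-encoded digest is assumed here, and the function names sign and is_valid_signature are hypothetical.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_valid_signature(secret: bytes, body: bytes, received: str) -> bool:
    """Check the carrier-supplied signature against our own computation.

    compare_digest runs in constant time, which avoids leaking how many
    leading characters of the digest matched.
    """
    expected = sign(secret, body).encode("utf-8")
    return hmac.compare_digest(expected, received.encode("utf-8"))
```

The signature should be computed over the exact bytes received, before any JSON parsing, since re-serializing the payload can change whitespace or key order and invalidate the digest.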
Active shipments are those with status NOT IN ('delivered', 'exception'). The polling scheduler uses a priority queue: shipments whose estimated delivery is today or tomorrow are polled every 30 minutes instead of every 4 hours.
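The polling rules above can be expressed as a small function that a scheduler might use to compute the next poll time. This is a sketch only; the function and constant names are hypothetical, and the priority queue itself is left out.

```python
from datetime import date, timedelta
from typing import Optional

# Statuses after which a shipment is no longer polled.
TERMINAL_STATUSES = {"delivered", "exception"}
DEFAULT_INTERVAL = timedelta(hours=4)
URGENT_INTERVAL = timedelta(minutes=30)


def poll_interval(
    status: str, estimated_delivery: Optional[date], today: date
) -> Optional[timedelta]:
    """Return how long to wait before the next poll, or None if inactive."""
    if status in TERMINAL_STATUSES:
        return None
    tomorrow = today + timedelta(days=1)
    # Shipments due today or tomorrow get the shorter interval.
    if estimated_delivery is not None and today <= estimated_delivery <= tomorrow:
        return URGENT_INTERVAL
    return DEFAULT_INTERVAL
```

Note that, as written, a shipment whose estimated delivery date has already passed falls back to the 4-hour interval, because the text only names "today or tomorrow" as the urgent window.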
Carrier Status Normalization
Each carrier uses proprietary status codes. A normalization layer maps them to internal statuses:
-- Example normalization rules (stored in config / DB)
| Carrier | Raw Code | Normalized Status |
|---------|----------|----------------------|
| FedEx | IT | in_transit |
| FedEx | OD | out_for_delivery |
| FedEx | DL | delivered |
| FedEx | DE | exception |
| UPS | I | in_transit |
| UPS | O | out_for_delivery |
| UPS | D | delivered |
| USPS | 03 | in_transit |
| USPS | 01 | delivered |
| DHL | TR | in_transit |
| DHL | DF | out_for_delivery |
Unknown codes fall back to in_transit and are logged for review. Normalization rules are versioned and hot-reloadable without a deploy.
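A minimal sketch of the lookup with its fallback is shown below. The rules are copied from the table above; in the described design they would be loaded from config or a database rather than hard-coded. The logger name and the function name normalize_status are hypothetical.

```python
import logging

logger = logging.getLogger("tracking.normalization")

# (carrier, raw carrier code) -> internal status, taken from the table above.
RULES = {
    ("fedex", "IT"): "in_transit",
    ("fedex", "OD"): "out_for_delivery",
    ("fedex", "DL"): "delivered",
    ("fedex", "DE"): "exception",
    ("ups", "I"): "in_transit",
    ("ups", "O"): "out_for_delivery",
    ("ups", "D"): "delivered",
    ("usps", "03"): "in_transit",
    ("usps", "01"): "delivered",
    ("dhl", "TR"): "in_transit",
    ("dhl", "DF"): "out_for_delivery",
}

FALLBACK_STATUS = "in_transit"


def normalize_status(carrier: str, raw_code: str) -> str:
    """Map a carrier-specific code to an internal status."""
    status = RULES.get((carrier.lower(), raw_code))
    if status is None:
        # Unknown codes are logged so the mapping can be reviewed and extended.
        logger.warning("unmapped status code: carrier=%s code=%s", carrier, raw_code)
        return FALLBACK_STATUS
    return status
```

Keeping the mapping as plain data is what makes hot reloading straightforward: swapping the dictionary for a newer version changes behavior without touching the lookup code.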
Estimated Delivery Date Prediction
A lightweight ML model replaces static carrier EDD promises, which are often optimistic:
Features:
- carrier (categorical)
- service_level: ground / express / overnight (categorical)
- origin_zip, dest_zip (embed as region cluster)
- ship_date (date)
- day_of_week (int 0-6)
- current_normalized_status (categorical)
Target: actual delivery date (regression)
Model: gradient-boosted trees (XGBoost), retrained weekly on
historical shipment data. Served via internal scoring API.
Fallback: carrier-provided EDD if model confidence < threshold.
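The fallback rule can be sketched as a small selection function. The threshold value 0.7 and the Prediction type are assumptions for illustration; the source does not say how model confidence is computed or what the threshold is.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical value; the real threshold is not specified in this design.
CONFIDENCE_THRESHOLD = 0.7


@dataclass
class Prediction:
    delivery_date: date
    confidence: float  # assumed to be in the range 0.0 to 1.0


def choose_edd(
    prediction: Optional[Prediction], carrier_edd: Optional[date]
) -> Optional[date]:
    """Prefer the model's date when it is confident enough."""
    if prediction is not None and prediction.confidence >= CONFIDENCE_THRESHOLD:
        return prediction.delivery_date
    # Model unavailable or below threshold: use the carrier's promise,
    # which may itself be missing (estimated_delivery is nullable).
    return carrier_edd
```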
Notification Triggers
After each status change event is processed:
- out_for_delivery: push notification + email (subject: "Your package is on its way today").
- delivered: push notification + email with delivery confirmation.
- exception: email alert + auto-create support ticket in helpdesk system.
Notifications are deduplicated by (shipment_id, normalized_status) so repeated webhook deliveries do not spam customers.
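The trigger rules and the deduplication key can be sketched together as follows. The in-memory set is a simplification: with several stateless workers, the sent-keys would have to live in a shared store. The class name, channel labels, and method name are hypothetical.

```python
from typing import Set, Tuple

# Channels per status, following the trigger list above.
CHANNELS = {
    "out_for_delivery": ("push", "email"),
    "delivered": ("push", "email"),
    "exception": ("email", "support_ticket"),
}


class NotificationDispatcher:
    def __init__(self) -> None:
        # Keys already notified: (shipment_id, normalized_status).
        self._sent: Set[Tuple[int, str]] = set()

    def channels_for(self, shipment_id: int, normalized_status: str) -> Tuple[str, ...]:
        """Return the channels to notify, or an empty tuple if none apply."""
        channels = CHANNELS.get(normalized_status, ())
        key = (shipment_id, normalized_status)
        if not channels or key in self._sent:
            return ()
        self._sent.add(key)
        return channels
```

Calling channels_for twice with the same shipment and status returns the channels once and an empty tuple the second time, which is the behavior that keeps a redelivered webhook from producing a second email.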
Exception Handling & Proactive Re-routing
When status = exception:
- Alert customer via email/push with exception description.
- Auto-create support ticket tagged delivery-exception.
- Query carrier API for nearest pickup locations using dest_zip.
- Surface pickup options on tracking page with one-click redirect-to-pickup action.
Public Tracking Page
No authentication required. The page is accessible via /track/{carrier}/{tracking_number} or /track?order={order_id}.
GET /track/{carrier}/{tracking_number}
Response:
{
  "tracking_number": "...",
  "carrier": "fedex",
  "status": "in_transit",
  "estimated_delivery": "2025-04-20",
  "events": [
    { "occurred_at": "...", "status": "in_transit",
      "description": "Departed FedEx facility", "location": "Memphis TN" },
    ...
  ]
}
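The tracking page response is cached at the CDN edge (TTL: 5 min), keyed on carrier and tracking number. When a webhook changes a shipment's status, the handler purges the cached entry so the page does not serve a stale status for the rest of the TTL.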
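A rough sketch of how the response body and caching headers might be assembled is shown below. Several details are assumptions: the key includes the carrier because tracking numbers are only unique per carrier in the schema, the key format and function names are hypothetical, events are assumed to carry ISO 8601 timestamps in a single timezone so that string order matches time order, and the s-maxage directive is one common way to set a TTL for shared caches such as a CDN.

```python
from typing import Any, Dict, List, Tuple

CDN_TTL_SECONDS = 300  # 5 minutes, as stated above


def cache_key(carrier: str, tracking_number: str) -> str:
    """Key used both for caching and for purging on status change."""
    return f"track:{carrier.lower()}:{tracking_number}"


def build_tracking_response(
    shipment: Dict[str, Any], events: List[Dict[str, Any]]
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Return (JSON-serializable body, HTTP headers) for the public page."""
    body = {
        "tracking_number": shipment["tracking_number"],
        "carrier": shipment["carrier"],
        "status": shipment["status"],
        "estimated_delivery": shipment["estimated_delivery"],
        # Newest event first, matching the (shipment_id, occurred_at DESC) index.
        "events": sorted(events, key=lambda e: e["occurred_at"], reverse=True),
    }
    headers = {"Cache-Control": f"public, s-maxage={CDN_TTL_SECONDS}"}
    return body, headers
```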
Scalability Considerations
- Webhook ingestion is stateless and horizontally scalable behind a load balancer.
- TrackingEvent writes are append-only — no update contention.
- Polling workers are sharded by carrier to avoid thundering herd against a single carrier API.
- EDD scoring API is cached per shipment for 1 hour to reduce ML inference cost.
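The per-shipment EDD cache from the last bullet could look roughly like this. It is an in-process sketch with hypothetical names; a deployment with several workers would more likely use a shared cache, and the clock is injectable only to make the expiry behavior easy to test.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny time-based cache: entries expire ttl_seconds after being stored."""

    def __init__(
        self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expiry timestamp, cached value)
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        # Missing or expired: call the scoring function and store the result.
        value = compute()
        self._entries[key] = (now + self._ttl, value)
        return value
```

With ttl_seconds set to 3600 and the shipment id as the key, repeated page loads within an hour reuse one model inference per shipment.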
Key API Endpoints
POST /shipments -- create shipment record
GET /shipments/{id} -- get shipment + events
GET /track/{carrier}/{tracking_number} -- public tracking (CDN-cached)
POST /webhooks/carrier/{carrier} -- inbound carrier webhooks
GET /shipments?order_id= -- list shipments for order