A real-time location tracking system records device positions, supports geofencing, proximity queries, and stores movement history at scale. Core challenges: high-frequency write throughput (GPS updates every 5 seconds per device), efficient spatial queries without full-table scans, and expiring raw position data while retaining summarized traces.
Core Data Model
-- Raw location updates (high write volume, partition by day)
CREATE TABLE LocationUpdate (
update_id BIGSERIAL,
device_id UUID NOT NULL,
user_id UUID NOT NULL,
lat NUMERIC(9,6) NOT NULL,
lng NUMERIC(9,6) NOT NULL,
accuracy_m SMALLINT,
recorded_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
PRIMARY KEY (update_id, recorded_at)
) PARTITION BY RANGE (recorded_at);
CREATE INDEX idx_location_device_time
ON LocationUpdate (device_id, recorded_at DESC);
-- Geofences
CREATE TABLE Geofence (
fence_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
owner_id UUID NOT NULL,
name TEXT NOT NULL,
center_lat NUMERIC(9,6) NOT NULL,
center_lng NUMERIC(9,6) NOT NULL,
radius_m INT NOT NULL,
is_active BOOLEAN NOT NULL DEFAULT TRUE
);
-- Geofence events (entry/exit)
CREATE TABLE GeofenceEvent (
event_id BIGSERIAL PRIMARY KEY,
fence_id UUID NOT NULL REFERENCES Geofence(fence_id),
device_id UUID NOT NULL,
event_type TEXT NOT NULL CHECK (event_type IN ('enter','exit')),
occurred_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_geofence_event ON GeofenceEvent (fence_id, occurred_at DESC);
Redis GEO Index for Proximity Queries
Store the latest position of every active device in a Redis GEO set. This enables O(log N) radius queries without touching PostgreSQL.
import redis
from decimal import Decimal
r = redis.Redis(host='redis', port=6379, decode_responses=True)
GEO_KEY = "devices:live" # Redis GEO sorted set
def update_device_location(device_id: str, lat: float, lng: float):
"""Write to Postgres (durable) and Redis GEO (fast queries)."""
# Postgres write is async via job queue; Redis is synchronous for query freshness
r.geoadd(GEO_KEY, [lng, lat, device_id])
# TTL: expire device from live set after 5 minutes of inactivity
r.expire(GEO_KEY, 300) # NOTE: sets TTL on whole key; use per-member approach below
def get_nearby_devices(lat: float, lng: float, radius_m: int) -> list[dict]:
"""Find all devices within radius_m meters of (lat, lng)."""
results = r.geosearch(
GEO_KEY,
longitude=lng,
latitude=lat,
radius=radius_m,
unit='m',
withcoord=True,
withdist=True,
sort='ASC',
count=100
)
# results: [(device_id, distance_m, (lng, lat)), ...]
return [
{"device_id": item[0], "distance_m": float(item[1]),
"lng": item[2][0], "lat": item[2][1]}
for item in results
]
Location Ingestion Pipeline
from dataclasses import dataclass
from datetime import datetime, timezone
import psycopg2
@dataclass
class LocationBatch:
device_id: str
user_id: str
updates: list[tuple[float, float, int, datetime]] # lat, lng, accuracy, ts
DISTANCE_THRESHOLD_M = 10 # only store updates if device moved > 10m
def haversine_m(lat1, lng1, lat2, lng2) -> float:
"""Approximate distance in meters between two GPS points."""
from math import radians, sin, cos, sqrt, atan2
R = 6_371_000
phi1, phi2 = radians(lat1), radians(lat2)
dphi = radians(lat2 - lat1)
dlam = radians(lng2 - lng1)
a = sin(dphi/2)**2 + cos(phi1)*cos(phi2)*sin(dlam/2)**2
return R * 2 * atan2(sqrt(a), sqrt(1-a))
def ingest_location_batch(conn, batch: LocationBatch):
"""
Filter out GPS noise (updates = DISTANCE_THRESHOLD_M:
filtered.append((batch.device_id, batch.user_id, lat, lng, accuracy, ts))
last_lat, last_lng = lat, lng
if not filtered:
return
with conn.cursor() as cur:
cur.executemany(
"""INSERT INTO LocationUpdate
(device_id, user_id, lat, lng, accuracy_m, recorded_at)
VALUES (%s, %s, %s, %s, %s, %s)""",
filtered
)
conn.commit()
# Update Redis GEO with latest position
last = filtered[-1]
update_device_location(last[0], last[2], last[3])
Geofence Evaluation
def check_geofences(conn, device_id: str, lat: float, lng: float):
"""
On each location update, check if the device has entered or exited any geofence.
Efficient: only evaluate fences whose bounding box contains the point.
"""
with conn.cursor() as cur:
# Bounding box pre-filter (avoids computing haversine for every fence)
degree_per_meter = 1.0 / 111_320
cur.execute("""
SELECT fence_id, center_lat, center_lng, radius_m
FROM Geofence
WHERE is_active = TRUE
AND center_lat BETWEEN %s AND %s
AND center_lng BETWEEN %s AND %s
""", (
lat - 0.1, lat + 0.1, # ~11km bounding box
lng - 0.1, lng + 0.1
))
candidates = cur.fetchall()
for fence_id, clat, clng, radius_m in candidates:
dist = haversine_m(lat, lng, clat, clng)
inside_now = dist dict | None:
with conn.cursor() as cur:
cur.execute("""
SELECT event_type FROM GeofenceEvent
WHERE device_id = %s AND fence_id = %s
ORDER BY occurred_at DESC LIMIT 1
""", (device_id, fence_id))
row = cur.fetchone()
return {'event_type': row[0]} if row else None
def record_geofence_event(conn, fence_id: str, device_id: str, event_type: str):
with conn.cursor() as cur:
cur.execute(
"INSERT INTO GeofenceEvent (fence_id, device_id, event_type) VALUES (%s,%s,%s)",
(fence_id, device_id, event_type)
)
conn.commit()
Location History and Data Retention
-- Query movement history for a device over a time range
SELECT lat, lng, accuracy_m, recorded_at
FROM LocationUpdate
WHERE device_id = $1
AND recorded_at BETWEEN $2 AND $3
ORDER BY recorded_at ASC;
-- Partition pruning: Postgres automatically scans only the relevant day partition.
-- Create daily partitions:
-- CREATE TABLE location_updates_2026_04_17 PARTITION OF LocationUpdate
-- FOR VALUES FROM ('2026-04-17') TO ('2026-04-18');
-- Retention policy: drop partitions older than 90 days
-- DROP TABLE location_updates_2026_01_17; (run monthly via pg_cron)
Key Interview Points
- Redis GEO vs PostGIS: Redis GEO is a sorted set using geohash encoding — O(log N) radius queries, sub-millisecond latency, but no persistence guarantee. PostGIS ST_DWithin on indexed geometry columns achieves similar performance for read-heavy workloads with full durability. Use Redis for live device positions; PostGIS for historical spatial analysis.
- GPS noise filtering: Raw GPS updates include jitter of 5–50m. Without filtering, a stationary device generates thousands of meaningless inserts. Threshold: discard updates within 10m of the last stored position. Accuracy_m from the device assists: only trust updates where accuracy_m < 50.
- Geofence scaling: With 10,000 active fences and 1M devices, evaluating all fences per update is O(fences) = slow. Mitigations: (1) spatial index on Geofence.center using PostGIS and ST_DWithin for the bounding-box prefilter; (2) only evaluate fences “near” the device using a geohash prefix lookup; (3) maintain a per-device “active fence” set in Redis to skip re-evaluation when position hasn’t changed significantly.
- Write throughput: At 10 updates/sec/device × 100K devices = 1M writes/sec. Use: Kafka for buffering, batch inserts (executemany), and partitioned tables. Avoid per-row transactions; commit in batches of 1,000.
- Privacy: Location is sensitive. Encrypt at rest (Postgres transparent encryption or column-level encryption for lat/lng). Retention: delete raw location history after 90 days. Anonymize aggregated traces. Require explicit user consent before sharing with third-party services.
Location tracking and GPS data pipeline design is discussed in Uber system design interview questions.
Location tracking and real-time geofencing design is covered in Lyft system design interview preparation.
Location tracking and proximity detection design is discussed in Snap system design interview guide.
See also: Airbnb Interview Guide 2026: Search Systems, Trust and Safety, and Full-Stack Engineering