A proximity service answers “find all restaurants / drivers / users within R km of this location” with low latency and high throughput. It is a common system design interview topic because it requires choosing the right spatial index and handling a high write rate from moving objects.
Requirements
- Find all entities (restaurants, drivers, venues) within a given radius R of a query location.
- Update entity locations in real-time (drivers move, delivery workers move).
- Query latency under 100ms for radius searches.
- Support millions of entities and hundreds of thousands of location updates per second.
Naive Approach – Why It Fails
Store latitude and longitude columns in a SQL table. Query with a distance filter:
SELECT * FROM locations
WHERE ST_Distance(point(lat, lng), point(query_lat, query_lng)) < radius_meters;
Without a spatial index this is a full table scan – O(n) per query. Even with a B-tree index on lat or lng individually, the 2D range cannot be satisfied efficiently. Does not scale past a few hundred thousand rows.
Geohash
Geohash divides the world into a grid recursively. Each cell is encoded as a base32 string. Longer strings = smaller cells.
- 4 characters: ~39 km x 20 km cell
- 6 characters: ~1.2 km x 0.6 km cell
- 8 characters: ~38 m x 19 m cell
Nearby locations share long common prefixes. To search within radius R:
- Compute the geohash cell of the query point at an appropriate precision level.
- Query that cell and its 8 neighbors (to avoid edge cases at cell boundaries).
- Filter results by exact distance.
Geohash cells can be stored and indexed as plain strings in any database. Simple to implement, works well for uniform point density.
Redis GEO Commands
Redis has built-in geospatial support using geohash under the hood. Key commands:
# Add a location
GEOADD locations 13.361389 38.115556 "restaurant:42"
# Find all members within 5 km, sorted by distance
GEORADIUS locations 15.0 37.0 5 km ASC COUNT 50
# Get position of a member
GEOPOS locations "restaurant:42"
# Get distance between two members
GEODIST locations "restaurant:42" "restaurant:99" km
GEORADIUS is O(log n + m) where m is the number of results returned. All data is stored in a sorted set using the geohash score. This is the go-to solution for proximity queries in practice – it requires zero additional infrastructure if Redis is already in the stack.
Quadtree
A quadtree is an adaptive spatial index. Start with one bounding box covering the entire space. Recursively subdivide any node that contains more than MAX_POINTS entities into 4 quadrants. Keep subdividing until each leaf node has at most MAX_POINTS.
Advantages over geohash:
- Adaptive to density – dense cities get fine-grained cells, empty areas stay coarse.
- No fixed cell size to tune.
Disadvantages:
- More complex to implement and maintain.
- Updates (insertions, deletions) require rebalancing.
- Not built into standard databases or caches.
Quadtrees are good when entity density is highly non-uniform and you want the index to self-adapt.
Database with Spatial Index (PostGIS)
For persistent business data, PostGIS on PostgreSQL provides production-grade geospatial queries:
-- Schema
CREATE TABLE locations (
id BIGINT PRIMARY KEY,
entity_id BIGINT NOT NULL,
entity_type VARCHAR(50),
geom GEOGRAPHY(Point, 4326),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_locations_geom ON locations USING GIST(geom);
-- Radius query: all entities within 2 km
SELECT entity_id, entity_type,
ST_Distance(geom, ST_MakePoint(query_lng, query_lat)::GEOGRAPHY) AS dist_m
FROM locations
WHERE ST_DWithin(geom, ST_MakePoint(query_lng, query_lat)::GEOGRAPHY, 2000)
ORDER BY dist_m;
ST_DWithin uses the GIST index and is fast. Good when location data needs to co-exist with relational business data and you want complex joins. Not ideal as the primary layer for real-time moving objects.
Read-Write Split Architecture
In practice, separate the hot write path from the query path:
- Location updates: write to Redis using
GEOADD. Fast, in-memory, O(log n) per update. - Radius queries: served from Redis using
GEORADIUS. Sub-millisecond. - Persistent data: asynchronously sync location updates from Redis to PostgreSQL (with PostGIS) for durability, history, and complex queries.
- Business data (restaurant metadata, menus, ratings): served from a separate relational DB or cache. Joined with location results at the application layer.
This design keeps the hot path entirely in Redis while PostgreSQL handles durability and complex queries.
Handling High Write Throughput for Moving Objects
Moving objects (drivers, delivery workers) generate continuous location updates. Strategies to manage write load:
- Client-side batching: mobile clients buffer GPS samples and push every 4-10 seconds rather than every second. A driver moving at 50 km/h moves only 55-140 meters in 4 seconds – acceptable accuracy for most use cases.
- Threshold filtering: only push an update if the device has moved more than X meters since the last update.
- Write-optimized path: location updates go through a dedicated microservice with a Kafka queue in front. Consumers write to Redis in batches. Smooths out spikes during rush hour.
At Uber scale: 1 million active drivers x 1 update every 4 seconds = 250,000 writes per second on the location layer. Redis handles this comfortably with horizontal sharding (Redis Cluster, shard by geohash prefix or driver_id).
Scale Numbers and Capacity Estimates
| Parameter | Value |
|---|---|
| Active drivers (peak) | 1,000,000 |
| Location update frequency | Every 4 seconds |
| Write throughput | 250,000 writes/sec |
| Redis memory per driver | ~50 bytes (geohash + member name) |
| Total Redis memory for all drivers | ~50 MB |
| Radius query latency (Redis) | < 1 ms |
| Radius query latency (PostGIS) | 1-10 ms with GIST index |
Summary – Which Approach to Use
- Redis GEO: default choice for real-time location, already battle-tested for this exact use case.
- PostGIS: add when you need persistence, history, or complex geospatial queries beyond radius search.
- Geohash in SQL: simple to implement if Redis is not available; store as an indexed varchar column.
- Quadtree: mention as an alternative when density is highly non-uniform; rarely worth implementing from scratch in practice.
Uber uses proximity service for driver matching. See system design patterns for Uber interview: proximity service and driver location design.
Lyft uses geospatial proximity for ride matching. See system design patterns for Lyft interview: geospatial and proximity service design.
Snap uses location-based features and proximity services. See design patterns for Snap interview: location services and proximity design.