Proximity Service (Find Nearby) Low-Level Design

A proximity service answers “find all restaurants / drivers / users within R km of this location” with low latency and high throughput. It is a common system design interview topic because it requires choosing the right spatial index and handling a high write rate from moving objects.

Requirements

  • Find all entities (restaurants, drivers, venues) within a given radius R of a query location.
  • Update entity locations in real-time (drivers move, delivery workers move).
  • Query latency under 100ms for radius searches.
  • Support millions of entities and hundreds of thousands of location updates per second.

Naive Approach – Why It Fails

Store latitude and longitude columns in a SQL table. Query with a distance filter:

SELECT * FROM locations
WHERE ST_Distance(point(lat, lng), point(query_lat, query_lng)) < radius_meters;

Without a spatial index this is a full table scan – O(n) per query. A B-tree index on lat or lng alone narrows only one dimension; every row in that latitude (or longitude) band must still be scanned and filtered on the other axis. This does not scale past a few hundred thousand rows.

Geohash

Geohash divides the world into a grid recursively. Each cell is encoded as a base32 string. Longer strings = smaller cells.

  • 4 characters: ~39 km x 20 km cell
  • 6 characters: ~1.2 km x 0.6 km cell
  • 8 characters: ~38 m x 19 m cell

Nearby locations share long common prefixes. To search within radius R:

  1. Compute the geohash cell of the query point at an appropriate precision level.
  2. Query that cell and its 8 neighbors (to avoid edge cases at cell boundaries).
  3. Filter results by exact distance.
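The exact-distance filter in step 3 is usually a haversine (great-circle) computation. A minimal sketch (the candidate/dict shapes here are illustrative, not from any particular library):

```python
import math

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two (lat, lng) points."""
    R = 6_371_000  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * R * math.asin(math.sqrt(a))

def within_radius(candidates, qlat, qlng, radius_m):
    """Drop candidates from the 9-cell search that fall outside the radius."""
    return [c for c in candidates
            if haversine_m(c["lat"], c["lng"], qlat, qlng) <= radius_m]
```

This filter is necessary because the 9 cells form a rectangle around the query point: corner regions of that rectangle lie outside the circular search radius.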

Geohash cells can be stored and indexed as plain strings in any database. Simple to implement, works well for uniform point density.
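To make the prefix property concrete, here is a minimal encoder sketch of the standard geohash algorithm (illustrative only; production code would use an established geohash library):

```python
# Minimal geohash encoder: interleave longitude/latitude bisection bits
# (longitude first) and emit one base32 character per 5 bits.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat: float, lng: float, precision: int = 6) -> str:
    lat_range = [-90.0, 90.0]
    lng_range = [-180.0, 180.0]
    bits = []
    encode_lng = True  # even bit positions refine longitude
    while len(bits) < precision * 5:
        rng, val = (lng_range, lng) if encode_lng else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if val >= mid:
            bits.append(1)
            rng[0] = mid
        else:
            bits.append(0)
            rng[1] = mid
        encode_lng = not encode_lng
    chars = []
    for i in range(0, len(bits), 5):
        n = 0
        for b in bits[i:i + 5]:
            n = (n << 1) | b
        chars.append(BASE32[n])
    return "".join(chars)
```

A longer geohash of the same point is always an extension of the shorter one, which is exactly what makes indexed prefix queries (e.g. `WHERE geohash LIKE 'u4pru%'`) work.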

Redis GEO Commands

Redis has built-in geospatial support using geohash under the hood. Key commands:

# Add a location (argument order is longitude first, then latitude)
GEOADD locations 13.361389 38.115556 "restaurant:42"

# Find up to 50 members within 5 km, sorted nearest-first
# (GEORADIUS is deprecated since Redis 6.2; GEOSEARCH is its replacement)
GEORADIUS locations 15.0 37.0 5 km COUNT 50 ASC

# Get position of a member
GEOPOS locations "restaurant:42"

# Get distance between two members
GEODIST locations "restaurant:42" "restaurant:99" km

GEORADIUS runs in roughly O(log n + m), where m is the number of results returned. All members are stored in a single sorted set whose score is the interleaved geohash of the coordinates. This is the go-to solution for proximity queries in practice – it requires zero additional infrastructure if Redis is already in the stack.

Quadtree

A quadtree is an adaptive spatial index. Start with one bounding box covering the entire space. Recursively subdivide any node that contains more than MAX_POINTS entities into 4 quadrants. Keep subdividing until each leaf node has at most MAX_POINTS.

Advantages over geohash:

  • Adaptive to density – dense cities get fine-grained cells, empty areas stay coarse.
  • No fixed cell size to tune.

Disadvantages:

  • More complex to implement and maintain.
  • Updates are more expensive: a moving object requires delete-and-reinsert, insertions can trigger subdivision, and deletions can force merging of sparse nodes.
  • Not built into standard databases or caches.

Quadtrees are good when entity density is highly non-uniform and you want the index to self-adapt.
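The subdivision scheme described above can be sketched as a small point quadtree with insertion and bounding-box queries (illustrative; the class name, MAX_POINTS value, and coordinate bounds are arbitrary choices, and a real index would also handle deletion and node merging):

```python
class QuadTree:
    MAX_POINTS = 4  # leaf capacity before subdividing

    def __init__(self, x0, y0, x1, y1):
        self.bounds = (x0, y0, x1, y1)
        self.points = []      # (x, y, payload) tuples while this is a leaf
        self.children = None  # four sub-nodes after subdivision

    def insert(self, x, y, payload):
        x0, y0, x1, y1 = self.bounds
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            return False  # point is outside this node
        if self.children is None:
            if len(self.points) < self.MAX_POINTS:
                self.points.append((x, y, payload))
                return True
            self._subdivide()
        return any(c.insert(x, y, payload) for c in self.children)

    def _subdivide(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        self.children = [QuadTree(x0, y0, mx, my), QuadTree(mx, y0, x1, my),
                         QuadTree(x0, my, mx, y1), QuadTree(mx, my, x1, y1)]
        pts, self.points = self.points, []
        for px, py, pl in pts:  # push existing points down one level
            any(c.insert(px, py, pl) for c in self.children)

    def query_range(self, qx0, qy0, qx1, qy1, out=None):
        out = [] if out is None else out
        x0, y0, x1, y1 = self.bounds
        if qx1 < x0 or qx0 > x1 or qy1 < y0 or qy0 > y1:
            return out  # query box does not intersect this node: prune
        if self.children is None:
            out.extend(p for p in self.points
                       if qx0 <= p[0] <= qx1 and qy0 <= p[1] <= qy1)
        else:
            for c in self.children:
                c.query_range(qx0, qy0, qx1, qy1, out)
        return out
```

A radius query is then a box query for the radius's bounding square followed by an exact-distance filter, the same post-filter step used with geohash.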

Database with Spatial Index (PostGIS)

For persistent business data, PostGIS on PostgreSQL provides production-grade geospatial queries:

-- Schema (requires the PostGIS extension: CREATE EXTENSION postgis;)
CREATE TABLE locations (
    id BIGINT PRIMARY KEY,
    entity_id BIGINT NOT NULL,
    entity_type VARCHAR(50),
    geom GEOGRAPHY(Point, 4326),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_locations_geom ON locations USING GIST(geom);

-- Radius query: all entities within 2 km
SELECT entity_id, entity_type,
       ST_Distance(geom, ST_MakePoint(query_lng, query_lat)::GEOGRAPHY) AS dist_m
FROM locations
WHERE ST_DWithin(geom, ST_MakePoint(query_lng, query_lat)::GEOGRAPHY, 2000)
ORDER BY dist_m;

ST_DWithin is index-aware: the planner uses the GIST index to prune candidates by bounding box before computing exact geodesic distances. Good when location data needs to co-exist with relational business data and you want complex joins. Not ideal as the primary layer for real-time moving objects.

Read-Write Split Architecture

In practice, separate the hot write path from the query path:

  • Location updates: write to Redis using GEOADD. Fast, in-memory, O(log n) per update.
  • Radius queries: served from Redis using GEORADIUS. Sub-millisecond.
  • Persistent data: asynchronously sync location updates from Redis to PostgreSQL (with PostGIS) for durability, history, and complex queries.
  • Business data (restaurant metadata, menus, ratings): served from a separate relational DB or cache. Joined with location results at the application layer.

This design keeps the hot path entirely in Redis while PostgreSQL handles durability and complex queries.
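The shape of this split can be sketched with in-memory stand-ins (a dict for Redis, a deque for the Kafka topic, a list for the PostGIS table – all hypothetical names, no real clients involved):

```python
from collections import deque

class LocationWritePath:
    """Sketch of the read/write split: hot writes land in an in-memory geo
    store; a durable queue feeds an async consumer that batch-writes to the
    persistent store."""

    def __init__(self, batch_size=100):
        self.geo = {}            # stand-in for Redis (GEOADD target)
        self.queue = deque()     # stand-in for a Kafka topic
        self.persistent = []     # stand-in for the PostGIS table
        self.batch_size = batch_size

    def update_location(self, entity_id, lat, lng):
        # Hot path: one in-memory write plus one enqueue, nothing else.
        self.geo[entity_id] = (lat, lng)
        self.queue.append((entity_id, lat, lng))

    def drain_to_persistent(self):
        # Async consumer: move queued updates to durable storage in batches
        # (each batch would be one bulk INSERT against PostGIS in real life).
        while self.queue:
            batch = [self.queue.popleft()
                     for _ in range(min(self.batch_size, len(self.queue)))]
            self.persistent.extend(batch)
```

Queries hit `geo` only; the queue decouples the write rate seen by the durable store from the update rate produced by clients.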

Handling High Write Throughput for Moving Objects

Moving objects (drivers, delivery workers) generate continuous location updates. Strategies to manage write load:

  • Client-side batching: mobile clients buffer GPS samples and push every 4-10 seconds rather than every second. A driver moving at 50 km/h covers only about 55 meters in 4 seconds (roughly 140 meters in 10) – acceptable accuracy for most use cases.
  • Threshold filtering: only push an update if the device has moved more than X meters since the last update.
  • Write-optimized path: location updates go through a dedicated microservice with a Kafka queue in front. Consumers write to Redis in batches. Smooths out spikes during rush hour.
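Threshold filtering on the client side is a few lines of state plus a distance check. A sketch (class and callback names are hypothetical; the distance uses an equirectangular approximation, which is fine over tens of meters):

```python
import math

def moved_m(lat1, lng1, lat2, lng2):
    # Equirectangular approximation: accurate enough for short hops.
    R = 6_371_000  # mean Earth radius in meters
    x = math.radians(lng2 - lng1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return R * math.hypot(x, y)

class LocationReporter:
    """Suppresses GPS samples until the device has moved MIN_MOVE_M meters
    since the last update that was actually sent upstream."""
    MIN_MOVE_M = 25

    def __init__(self, send):
        self.send = send  # callback that pushes an update to the server
        self.last = None  # (lat, lng) of the last sent update

    def on_gps_sample(self, lat, lng):
        if self.last is not None and moved_m(*self.last, lat, lng) < self.MIN_MOVE_M:
            return False  # below threshold: drop the sample
        self.last = (lat, lng)
        self.send(lat, lng)
        return True
```

Combined with client-side batching, this cuts write volume sharply for stationary or slow-moving devices (drivers waiting at a pickup, couriers in an elevator).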

At Uber scale: 1 million active drivers x 1 update every 4 seconds = 250,000 writes per second on the location layer. Redis handles this comfortably with horizontal sharding (Redis Cluster, shard by geohash prefix or driver_id).

Scale Numbers and Capacity Estimates

Parameter                               Value
Active drivers (peak)                   1,000,000
Location update frequency               every 4 seconds
Write throughput                        250,000 writes/sec
Redis memory per driver                 ~50 bytes (geohash + member name)
Total Redis memory for all drivers      ~50 MB
Radius query latency (Redis)            < 1 ms
Radius query latency (PostGIS)          1-10 ms with GIST index

Summary – Which Approach to Use

  • Redis GEO: default choice for real-time location, already battle-tested for this exact use case.
  • PostGIS: add when you need persistence, history, or complex geospatial queries beyond radius search.
  • Geohash in SQL: simple to implement if Redis is not available; store as an indexed varchar column.
  • Quadtree: mention as an alternative when density is highly non-uniform; rarely worth implementing from scratch in practice.


