Geo Search System — Low-Level Design
A geo search system finds nearby entities (restaurants, drivers, listings) given a user’s location. It must handle spatial indexing, radius queries, bounding-box queries, and ranking by distance. This design is asked at Uber, Lyft, Airbnb, and Yelp.
Core Data Model
-- PostgreSQL with PostGIS extension
CREATE EXTENSION IF NOT EXISTS postgis;
Location
id BIGSERIAL PK
entity_id BIGINT NOT NULL -- restaurant, driver, listing ID
entity_type TEXT NOT NULL -- 'restaurant', 'listing'
coordinates GEOMETRY(Point, 4326) NOT NULL -- SRID 4326 = WGS84 lat/lng
city TEXT
updated_at TIMESTAMPTZ
-- Spatial index (required for fast geo queries)
CREATE INDEX idx_location_geo ON Location USING GIST(coordinates);
-- Inserting a point
INSERT INTO Location (entity_id, entity_type, coordinates)
VALUES (42, 'restaurant', ST_SetSRID(ST_MakePoint(-73.9857, 40.7484), 4326));
-- ST_MakePoint(longitude, latitude) — note: longitude first
Radius Query (Find Nearby)
def find_nearby(lat, lng, radius_meters, entity_type, limit=50):
return db.execute("""
SELECT
l.entity_id,
ST_Distance(
l.coordinates::geography,
ST_SetSRID(ST_MakePoint(%(lng)s, %(lat)s), 4326)::geography
) AS distance_meters
FROM Location l
WHERE l.entity_type = %(type)s
AND ST_DWithin(
l.coordinates::geography,
ST_SetSRID(ST_MakePoint(%(lng)s, %(lat)s), 4326)::geography,
%(radius)s
)
ORDER BY distance_meters ASC
LIMIT %(limit)s
""", {'lat': lat, 'lng': lng, 'radius': radius_meters,
'type': entity_type, 'limit': limit})
# ST_DWithin with ::geography uses great-circle distance (accurate for real distances)
# The GIST index makes ST_DWithin an index scan, not a table scan
Geohash-Based Approach (Without PostGIS)
# Geohash: encode lat/lng as a string where shared prefix = proximity
# Precision: 5 chars ≈ 2.4km × 5km cell, 6 chars ≈ 600m × 1.2km cell
import geohash2
def encode_location(lat, lng, precision=6):
return geohash2.encode(lat, lng, precision)
def find_nearby_geohash(lat, lng, radius_km):
center_hash = geohash2.encode(lat, lng, precision=6)
# Get all neighboring cells to avoid edge-of-cell misses
neighbors = geohash2.neighbors(center_hash) + [center_hash]
results = db.execute("""
SELECT entity_id, lat, lng FROM Location
WHERE geohash6 = ANY(%(hashes)s)
""", {'hashes': neighbors})
# Filter precisely by actual distance
user_point = (lat, lng)
nearby = [
r for r in results
if haversine(user_point, (r.lat, r.lng)) <= radius_km
]
return sorted(nearby, key=lambda r: haversine(user_point, (r.lat, r.lng)))
Updating Driver/User Locations (High Write Rate)
# Drivers update location every 5 seconds → 100K drivers = 20K writes/second
# PostgreSQL cannot sustain this; use Redis GEOADD instead
def update_driver_location(driver_id, lat, lng):
# GEOADD stores as a sorted set with geohash score
redis.geoadd('drivers:active', lng, lat, driver_id)
# Set TTL per driver: expire if not updated for 30 seconds
redis.expire(f'driver:active:{driver_id}', 30)
def find_nearby_drivers(lat, lng, radius_meters):
# GEORADIUS: O(N+log M) where N=results, M=total entries
results = redis.georadius(
'drivers:active', lng, lat,
radius_meters / 1000, # km
unit='km',
withcoord=True,
withdist=True,
sort='ASC',
count=10
)
return [{'driver_id': r[0], 'distance_km': r[1], 'coords': r[2]}
for r in results]
Bounding Box Query (Map Viewport)
def find_in_viewport(min_lat, min_lng, max_lat, max_lng, entity_type):
"""Return all entities visible in a map viewport."""
return db.execute("""
SELECT l.entity_id,
ST_Y(l.coordinates) AS lat,
ST_X(l.coordinates) AS lng
FROM Location l
WHERE l.entity_type = %(type)s
AND l.coordinates && ST_MakeEnvelope(
%(min_lng)s, %(min_lat)s,
%(max_lng)s, %(max_lat)s,
4326
)
LIMIT 200
""", {'type': entity_type, 'min_lat': min_lat, 'min_lng': min_lng,
'max_lat': max_lat, 'max_lng': max_lng})
# && operator = bounding box overlap — uses GIST index
Key Interview Points
- PostGIS for static entities, Redis for moving entities: Restaurants don’t move — PostGIS with a GIST index is ideal. Drivers update every 5 seconds — Redis GEOADD/GEORADIUS handles the write rate that would crush Postgres.
- Geohash neighbor lookup prevents edge misses: A user at the edge of a geohash cell may be closer to entities in the neighboring cell than to entities in their own cell. Always query the 8 neighboring cells plus the center.
- geography vs geometry in PostGIS: geometry uses flat-earth math (fast, inaccurate for large distances). geography uses spherical math (accurate great-circle distances, ~20% slower). Use geography for ST_DWithin and ST_Distance when accuracy matters.
- Limit result set before sorting: ST_DWithin with a GIST index returns the candidate set efficiently. Compute exact distance only on the candidates, not the whole table.
{“@context”:”https://schema.org”,”@type”:”FAQPage”,”mainEntity”:[{“@type”:”Question”,”name”:”How does a PostGIS GIST index make geo queries fast?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”A GiST (Generalized Search Tree) index on a geometry column organizes spatial data in a hierarchical bounding-box tree. When ST_DWithin is called with a center point and radius, PostgreSQL traverses the tree to find only the bounding boxes that overlap the search area, then computes exact distances only for the candidates. Without the index, every query is a full table scan computing distance to every row. With the index, the query is O(log N + K) where K is the number of results — even for millions of locations.”}},{“@type”:”Question”,”name”:”Why use Redis GEOADD for driver locations instead of PostgreSQL?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Drivers update their location every 5 seconds. 100,000 active drivers = 20,000 writes/second. PostgreSQL with a GiST index performs ~10,000 writes/second on commodity hardware before contention becomes a problem. Redis GEOADD stores locations in a sorted set using geohash encoding — it handles ~500,000 writes/second. Additionally, GEORADIUS queries on Redis return nearby drivers in milliseconds with no index maintenance overhead. Use PostGIS for static entities (restaurants, listings) and Redis for moving entities (drivers, delivery couriers).”}},{“@type”:”Question”,”name”:”What is the difference between PostGIS geometry and geography types?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”geometry uses flat (Euclidean) coordinates — fast but inaccurate for large distances because the Earth is not flat. A "500 meter radius" query in geometry at the equator differs from the same query at high latitudes. geography uses spherical coordinates (great-circle distance on the WGS84 ellipsoid) — accurate worldwide but ~20% slower. Rule of thumb: use geography for any query where accuracy matters (user-facing "find nearby" search) and geometry only for internal computations where slight inaccuracy is acceptable and performance is critical.”}},{“@type”:”Question”,”name”:”How does geohash neighbor lookup prevent edge-of-cell misses?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”A geohash divides the world into a grid of rectangular cells. A user at the eastern edge of cell "abc123" may be only 10 meters from a restaurant in the adjacent cell "abc124", but a query for only cell "abc123" would miss it. The fix: always query the center cell plus all 8 neighboring cells (N, NE, E, SE, S, SW, W, NW). This ensures no nearby entity is missed due to a cell boundary. After fetching all entities from the 9 cells, apply an exact distance filter to remove entities beyond the radius.”}},{“@type”:”Question”,”name”:”How do you build a bounding-box viewport query for a map interface?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”When the user pans or zooms a map, fetch all entities within the visible viewport. Extract the four corners from the map client: min_lat, max_lat, min_lng, max_lng. In PostGIS: WHERE coordinates && ST_MakeEnvelope(min_lng, min_lat, max_lng, max_lat, 4326). The && operator checks bounding-box overlap and uses the GiST index — much faster than ST_Within (which computes exact containment). Limit the result set (LIMIT 500) to prevent sending thousands of map pins on very wide zoom levels.”}}]}
Geo search and driver location tracking design is discussed in Uber system design interview questions.
Geo search and nearby driver system design is covered in Lyft system design interview preparation.
Geo search and listing proximity design is discussed in Airbnb system design interview guide.