System Design Interview: Design a Hotel Booking System (Airbnb/Booking.com)
Hotel booking systems are canonical system design problems testing inventory management, availability checking, concurrent reservation handling, and search at scale. Commonly asked at Airbnb, Booking.com, Expedia, and Uber.
Requirements Clarification
Functional Requirements
- Search hotels/rooms by location, dates, guests, filters (price, stars, amenities)
- View room details, photos, reviews, availability calendar
- Reserve a room for specific dates (prevent double-booking)
- Manage bookings: cancel, modify, view history
- Hotel management: add rooms, set pricing, manage availability blocks
- Reviews and ratings after checkout
Non-Functional Requirements
- Scale: 500M users, 2M hotels, 50M room-nights/year
- No double-booking: two users cannot book same room for overlapping dates
- Search: sub-second results for availability queries
- High read-to-write ratio: 100:1 (browsing vs booking)
- Inventory accuracy: availability shown must be reliable (no phantom availability)
Data Model
CREATE TABLE hotels (
id UUID PRIMARY KEY,
name VARCHAR(255) NOT NULL,
location POINT NOT NULL, -- PostGIS geography type for geo queries
address TEXT,
star_rating SMALLINT CHECK (star_rating BETWEEN 1 AND 5),
amenities TEXT[],
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_hotels_location ON hotels USING GIST(location);
CREATE TABLE rooms (
id UUID PRIMARY KEY,
hotel_id UUID REFERENCES hotels(id),
room_type VARCHAR(100) NOT NULL, -- STANDARD, DELUXE, SUITE
max_guests SMALLINT NOT NULL,
base_price BIGINT NOT NULL, -- cents per night
total_count INT NOT NULL -- physical rooms of this type
);
-- Key table: tracks each individual room's availability
-- Using date-range approach with inventory counts
CREATE TABLE room_inventory (
room_id UUID REFERENCES rooms(id),
date DATE NOT NULL,
total_count INT NOT NULL, -- total rooms of this type
reserved INT NOT NULL DEFAULT 0, -- how many are reserved
PRIMARY KEY (room_id, date)
);
CREATE TABLE bookings (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL,
room_id UUID NOT NULL REFERENCES rooms(id),
hotel_id UUID NOT NULL,
check_in DATE NOT NULL,
check_out DATE NOT NULL, -- exclusive (checkout day not billed)
nights INT GENERATED ALWAYS AS (check_out - check_in) STORED,
guests SMALLINT NOT NULL,
total_amount BIGINT NOT NULL, -- cents
status VARCHAR(20) NOT NULL DEFAULT 'CONFIRMED', -- CONFIRMED/CANCELLED/COMPLETED
idempotency_key VARCHAR(255) UNIQUE,
created_at TIMESTAMPTZ DEFAULT NOW(),
CONSTRAINT valid_dates CHECK (check_out > check_in)
);
CREATE INDEX idx_bookings_user ON bookings(user_id, created_at DESC);
CREATE INDEX idx_bookings_hotel ON bookings(hotel_id, check_in);
Availability Check and Reservation
from datetime import date, timedelta
from typing import Optional
import uuid
def check_availability(room_id: str, check_in: date, check_out: date) -> bool:
"""
Returns True if room is available for all nights in [check_in, check_out).
Uses room_inventory table: available = total_count - reserved.
"""
nights = (check_out - check_in).days
dates = [check_in + timedelta(days=i) for i in range(nights)]
result = db.fetch("""
SELECT COUNT(*) as unavailable_nights
FROM room_inventory
WHERE room_id = $1
AND date = ANY($2::date[])
AND (total_count - reserved) dict:
"""
Reserve room using optimistic locking on room_inventory.
Critical section: check + reserve must be atomic.
Uses SELECT FOR UPDATE to prevent concurrent bookings of same room.
"""
# Check idempotency
existing = Booking.query.filter_by(idempotency_key=idempotency_key).first()
if existing:
return existing.to_dict()
nights = (check_out - check_in).days
dates = [check_in + timedelta(days=i) for i in range(nights)]
with db.transaction():
# Lock inventory rows for this room + date range
# SKIP LOCKED: fail fast if another transaction is booking same room
inventory = db.fetch("""
SELECT date, total_count, reserved
FROM room_inventory
WHERE room_id = $1
AND date = ANY($2::date[])
ORDER BY date
FOR UPDATE
""", room_id, dates)
# Verify availability (under lock)
for row in inventory:
if row['total_count'] - row['reserved'] <= 0:
raise RoomUnavailableError(f"Room unavailable on {row['date']}")
if len(inventory) dict:
"""Cancel booking and release inventory"""
with db.transaction():
booking = db.fetchrow("""
SELECT * FROM bookings WHERE id = $1 AND user_id = $2
FOR UPDATE
""", booking_id, user_id)
if not booking or booking['status'] != 'CONFIRMED':
raise InvalidCancellationError("Booking not found or not cancellable")
if date.today() >= booking['check_in']:
raise InvalidCancellationError("Cannot cancel after check-in")
# Release inventory
nights = (booking['check_out'] - booking['check_in']).days
dates = [booking['check_in'] + timedelta(days=i) for i in range(nights)]
db.execute("""
UPDATE room_inventory
SET reserved = reserved - 1
WHERE room_id = $1 AND date = ANY($2::date[])
""", booking['room_id'], dates)
db.execute("""
UPDATE bookings SET status = 'CANCELLED' WHERE id = $1
""", booking_id)
return {'status': 'CANCELLED', 'booking_id': booking_id}
Search Service: Availability + Geo Search
def search_hotels(lat: float, lon: float, radius_km: float,
check_in: date, check_out: date,
guests: int, filters: dict) -> list:
"""
Two-phase search:
1. Geo filter: find hotels within radius
2. Availability filter: which rooms are available for dates
"""
nights = (check_out - check_in).days
dates = [check_in + timedelta(days=i) for i in range(nights)]
# Phase 1: Geo search using PostGIS (or Elasticsearch geo_distance)
hotels_nearby = db.fetch("""
SELECT h.id, h.name, h.star_rating,
ST_Distance(h.location::geography,
ST_MakePoint($1, $2)::geography) / 1000 as distance_km
FROM hotels h
WHERE ST_DWithin(
h.location::geography,
ST_MakePoint($1, $2)::geography,
$3 * 1000 -- convert km to meters
)
AND ($4::int IS NULL OR h.star_rating >= $4)
ORDER BY distance_km ASC
LIMIT 200
""", lon, lat, radius_km, filters.get('min_stars'))
if not hotels_nearby:
return []
hotel_ids = [h['id'] for h in hotels_nearby]
# Phase 2: Availability check
# Find rooms with sufficient availability for ALL requested dates
available_rooms = db.fetch("""
SELECT r.hotel_id, r.id as room_id, r.room_type, r.base_price,
MIN(ri.total_count - ri.reserved) as min_availability
FROM rooms r
JOIN room_inventory ri ON ri.room_id = r.id
WHERE r.hotel_id = ANY($1::uuid[])
AND r.max_guests >= $2
AND ri.date = ANY($3::date[])
GROUP BY r.hotel_id, r.id, r.room_type, r.base_price
HAVING COUNT(DISTINCT ri.date) = $4 -- all dates must have inventory
AND MIN(ri.total_count - ri.reserved) > 0 -- at least 1 available each night
""", hotel_ids, guests, dates, nights)
# Merge geo results with availability
available_hotel_ids = {r['hotel_id'] for r in available_rooms}
cheapest_per_hotel = {}
for room in available_rooms:
hid = room['hotel_id']
if hid not in cheapest_per_hotel or room['base_price'] < cheapest_per_hotel[hid]['base_price']:
cheapest_per_hotel[hid] = room
results = []
for hotel in hotels_nearby:
if hotel['id'] in available_hotel_ids:
results.append({
**hotel,
'cheapest_room': cheapest_per_hotel[hotel['id']],
'price_per_night': cheapest_per_hotel[hotel['id']]['base_price'],
})
return results
Caching Strategy
import redis
import json
from functools import wraps
redis_client = redis.Redis()
def cache_hotel_details(hotel_id: str, ttl=3600):
"""Hotel static data: name, photos, amenities — cache for 1 hour"""
cache_key = f"hotel:{hotel_id}"
cached = redis_client.get(cache_key)
if cached:
return json.loads(cached)
hotel = fetch_hotel_from_db(hotel_id)
redis_client.setex(cache_key, ttl, json.dumps(hotel))
return hotel
# Availability: cache with SHORT TTL (30s) or don't cache at all
# Stale availability = false bookings = terrible user experience
# Instead: cache search results per (location, dates) for 60 seconds
def cache_search_results(search_params: dict, results: list, ttl=60):
cache_key = f"search:{json.dumps(search_params, sort_keys=True)}"
redis_client.setex(cache_key, ttl, json.dumps(results))
Dynamic Pricing
def calculate_price(base_price: int, room_id: str,
check_in: date, check_out: date) -> int:
"""
Dynamic pricing factors:
- Occupancy: higher price as availability decreases
- Seasonality: holidays, events increase price
- Lead time: last-minute bookings may be discounted
- Day of week: weekends typically higher
"""
nights = (check_out - check_in).days
# Occupancy multiplier
availability = get_min_availability(room_id, check_in, check_out)
total = get_total_rooms(room_id)
occupancy_rate = 1 - (availability / total)
occupancy_multiplier = 1.0 + (occupancy_rate ** 2) * 0.5 # up to 1.5x at full
# Seasonal events (stored in events table)
event_multiplier = get_event_multiplier(check_in, check_out)
# Lead time discount (within 48 hours: 10% off)
days_until = (check_in - date.today()).days
lead_multiplier = 0.9 if days_until <= 2 else 1.0
total_price = int(base_price * nights * occupancy_multiplier *
event_multiplier * lead_multiplier)
return total_price
Key Design Decisions
- Room inventory per-date vs date range: Per-date rows in room_inventory allow fine-grained availability and pricing. Alternative: store bookings and compute availability by counting overlapping reservations — simpler but slower for availability queries at scale.
- SELECT FOR UPDATE prevents double-booking: Locking inventory rows during check+reserve ensures atomicity. Alternative: optimistic locking with version numbers (increment version, fail if version changed between read and write).
- Elasticsearch for search, PostgreSQL for reservations: Elasticsearch handles geospatial search, faceted filters, and text search. PostgreSQL handles the transactional reservation logic with ACID guarantees. Two-phase approach: Elasticsearch for hotel discovery, PostgreSQL for availability verification.
- Separate read model (search) from write model (reservation): Search results are pre-indexed in Elasticsearch updated asynchronously. Reservations always hit PostgreSQL for consistency. Accept that search results may show hotels that become fully booked in the time between search and booking (handle gracefully with availability check at booking time).
{“@context”:”https://schema.org”,”@type”:”FAQPage”,”mainEntity”:[{“@type”:”Question”,”name”:”How do you prevent double-booking in a hotel reservation system?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”The critical section for preventing double-booking is the check-availability-and-reserve operation, which must be atomic. Use SELECT FOR UPDATE in a database transaction to lock the room_inventory rows for the requested dates before checking and updating. This prevents concurrent reservations of the same room/dates. Two users attempting to book simultaneously: one transaction gets the lock, checks availability, decrements reserved count, and commits; the other transaction waits, then retries with updated counts, and gets rejected if the room is now full. Alternative: optimistic locking with version numbers (check version, update where version matches, retry on conflict).”}},{“@type”:”Question”,”name”:”How do you model room availability for a hotel system?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Two main approaches: (1) Inventory-per-date: maintain a room_inventory table with (room_id, date, total_count, reserved) rows — one row per room type per date. Fast availability queries, easy to handle pricing per date, scales well. (2) Booking-overlap: store only confirmed bookings; compute availability by counting bookings overlapping the requested date range. Simpler schema but slower for availability queries at scale (requires joining and counting). For high-traffic systems like Airbnb with millions of queries, the inventory-per-date approach with proper indexing is preferred.”}},{“@type”:”Question”,”name”:”How does the search work in a hotel booking system at scale?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Two-phase search: (1) Geo filter — use PostGIS ST_DWithin or Elasticsearch geo_distance to find hotels within a radius of the requested location. Returns candidate hotel IDs. (2) Availability filter — for those candidate hotels, query room_inventory for all nights in the date range, grouped by room, checking that MIN(available) > 0 across all requested dates. To scale: pre-index search results in Elasticsearch (updated asynchronously as bookings are made), use read replicas for availability queries, and cache search results for 60 seconds per (location, dates) combination (accept slight staleness vs accuracy).”}},{“@type”:”Question”,”name”:”How do you implement dynamic pricing for a hotel system?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Dynamic pricing multipliers applied to base price: (1) Occupancy rate — as rooms fill up, price increases (e.g., occupancy^2 * 0.5x additional). (2) Seasonality — holidays, local events, weekends command premiums; stored in an events table. (3) Lead time — last-minute bookings may be discounted to fill remaining inventory. (4) Competitor pricing — scrape competitor prices (Airbnb, Booking.com) and price competitively. In practice, hotels provide their own pricing via a Property Management System (PMS); the booking platform aggregates these rates. Machine learning models (gradient boosting) optimize prices by predicting demand at different price points.”}},{“@type”:”Question”,”name”:”What caching strategy should you use for a hotel booking system?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Tiered caching: (1) Hotel static data (name, photos, amenities, location): long TTL (1 hour) in Redis — rarely changes. (2) Room base pricing: medium TTL (15 minutes) — changes daily at most. (3) Availability data: very short TTL (30 seconds) or no cache — staleness causes bad UX when users see available rooms that are actually booked. (4) Search results per (location, dates, filters): 60-second TTL — accept slight staleness to handle traffic spikes. (5) For real-time availability on the booking confirmation page: always read from primary database, no cache. Never sacrifice booking accuracy for cache performance.”}}]}