What Is a Flight Search Service?
A flight search service aggregates schedules and prices from airlines or GDS providers, lets users filter by route, date, and cabin class, and returns ranked itineraries. The low-level design must handle large data volumes, complex multi-leg route graphs, and near-real-time price changes.
Data Model
-- Core tables (SQL pseudocode)
CREATE TABLE airports (
airport_code CHAR(3) PRIMARY KEY, -- IATA code
city VARCHAR(128),
country_code CHAR(2),
timezone VARCHAR(64)
);
CREATE TABLE airlines (
airline_code CHAR(2) PRIMARY KEY,
name VARCHAR(128)
);
CREATE TABLE flights (
flight_id BIGINT PRIMARY KEY,
airline_code CHAR(2) REFERENCES airlines(airline_code),
flight_number VARCHAR(8),
origin CHAR(3) REFERENCES airports(airport_code),
destination CHAR(3) REFERENCES airports(airport_code),
depart_time TIME,
arrive_time TIME,
duration_min INT,
days_of_week TINYINT -- bitmask: bit 0=Mon ... bit 6=Sun
);
CREATE TABLE flight_inventory (
inventory_id BIGINT PRIMARY KEY,
flight_id BIGINT REFERENCES flights(flight_id),
travel_date DATE,
cabin_class VARCHAR(16), -- 'ECONOMY', 'BUSINESS', 'FIRST'
seats_total INT,
seats_available INT,
base_fare DECIMAL(10,2),
currency CHAR(3),
UNIQUE (flight_id, travel_date, cabin_class)
);
CREATE TABLE search_cache (
cache_key VARCHAR(128) PRIMARY KEY,
results_json MEDIUMTEXT,
cached_at TIMESTAMP,
expires_at TIMESTAMP
);
Core Algorithm: Route Search and Ranking
Flight search is fundamentally a shortest-path problem on a directed graph where nodes are airports and edges are flights on a given date.
- Direct flights: Query
flight_inventorywhereorigin = :origin AND destination = :dest AND travel_date = :date AND seats_available > 0. - One-stop itineraries: Use a two-hop graph traversal. Find all flights from origin to any intermediate hub, then all flights from that hub to the destination arriving after a minimum connection window (e.g., 45 minutes).
- Ranking: Score each itinerary by a weighted formula combining price, total duration, number of stops, and departure time preference. Return the top-N results.
- Price assembly: Total fare = base fare + taxes + carrier surcharges. Taxes are looked up from a
tax_rulestable keyed by origin/destination country pair.
Aggregation from External Sources
For a meta-search model (aggregating multiple airlines), the service fans out queries to upstream APIs in parallel, normalizes responses into the internal schema, merges and deduplicates itineraries, and returns results. A circuit breaker per upstream prevents one slow provider from delaying the entire response. Partial results are returned with a warning if some providers time out.
Failure Handling
- Stale prices: Cache results for no more than 5 minutes. Display a disclaimer and force a re-check before booking confirmation.
- Inventory gone at booking: After search, seats may sell out. The booking flow must re-validate availability before charging; return a user-friendly error and suggest alternatives if unavailable.
- Upstream timeouts: Each provider call has a 3-second deadline. Results from providers that respond in time are shown; timed-out providers are retried asynchronously and pushed via WebSocket if the user is still on the results page.
- Data inconsistency: Incoming schedule updates from airlines are applied in batches with an upsert strategy to avoid gaps between delete and re-insert.
Scalability Considerations
- Search index: Pre-compute and index popular route pairs in Elasticsearch or a dedicated search store to avoid full table scans on every query.
- Fare cache: Store pre-searched fares for the next 90 days for the top 500 routes in Redis. Refresh nightly via a batch job.
- Read/write separation: Inventory updates from airline feeds go to the write database; search queries hit read replicas or the search index.
- Horizontal scaling: The search service is stateless; add instances behind a load balancer to handle traffic spikes around holiday booking peaks.
- Queue-based ingestion: Airline schedule and price updates arrive via Kafka topics; consumer workers update inventory asynchronously without blocking search.
Summary
Flight search is a graph traversal problem layered on top of an aggregation challenge. The key design decisions are: represent routes as a graph for flexible multi-hop search, cache aggressively at the route level, fan out to providers in parallel with strict timeouts, and re-validate prices at booking time to handle the gap between search and purchase.
{
“@context”: “https://schema.org”,
“@type”: “FAQPage”,
“mainEntity”: [
{
“@type”: “Question”,
“name”: “What are the main components of a flight search system design?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “A flight search system consists of a search aggregation layer, a fare/availability indexing service, an airline data ingestion pipeline, a ranking and filtering engine, and a caching layer. The aggregation layer fans out queries to multiple airline APIs or a shared inventory cache, collects results, deduplicates, and returns ranked itineraries.”
}
},
{
“@type”: “Question”,
“name”: “How do you handle the massive volume of flight fare data in real time?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Fares are ingested via push feeds (e.g., ATPCO or direct airline APIs) into a streaming pipeline like Kafka. A consumer service indexes fares into an in-memory or columnar store (e.g., Elasticsearch or Apache Druid) optimized for range queries on price, date, and origin-destination pairs. Stale fares are evicted with TTL-based expiry.”
}
},
{
“@type”: “Question”,
“name”: “How do you design a flight search to return results in under one second?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Sub-second search is achieved by pre-indexing popular route and date combinations, serving cached results for common queries, and using a scatter-gather pattern with a strict deadline (e.g., 800 ms) to collect airline responses. Results arriving after the deadline are discarded. Progressive loading can render partial results while remaining sources are still responding.”
}
},
{
“@type”: “Question”,
“name”: “How are multi-city and connecting flight itineraries generated efficiently?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Connecting itineraries are generated using a graph traversal (BFS or Dijkstra’s) over an airport connectivity graph where edges carry price and duration. To bound complexity, the graph is pruned to a limited number of hub airports and layover windows. Pre-computed shortest paths for popular city pairs are cached and refreshed nightly.”
}
}
]
}
See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering
See also: Airbnb Interview Guide 2026: Search Systems, Trust and Safety, and Full-Stack Engineering