Shortest path problems appear in nearly every graph interview. The key skill is recognizing which algorithm applies to the given graph type – the wrong choice either produces wrong answers (e.g., Dijkstra on negative weights) or runs too slowly. This guide covers every major shortest path algorithm with Python code and a pattern guide.
BFS for Unweighted Shortest Path
When all edge weights are equal (or there are no weights), BFS finds the shortest path in O(V + E). BFS guarantees shortest path because it explores nodes level by level – the first time a node is reached, it is reached via the shortest path.
from collections import deque

def bfs_shortest(graph, start, end):
    """graph: adjacency list {node: [neighbors]}"""
    if start == end:
        return 0
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == end:
                return dist + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, dist + 1))
    return -1  # unreachable

def bfs_all_distances(graph, start):
    """Returns distances from start to all reachable nodes"""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist
When to use: grid problems, word ladders, level-by-level exploration, any graph where edges are unweighted or all have equal weight.
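A related pattern worth knowing: when every edge weight is 0 or 1, a deque-based 0-1 BFS keeps the O(V + E) bound without a heap. This is a minimal sketch; the `(weight, neighbor)` adjacency format and the function name are my choices for the demo:

```python
from collections import deque

def zero_one_bfs(graph, start):
    """
    graph: {node: [(weight, neighbor), ...]} with weight in {0, 1}.
    Weight-0 edges go to the front of the deque, weight-1 edges to
    the back, so the deque behaves like a two-level priority queue.
    """
    INF = float('inf')
    dist = {start: 0}
    dq = deque([start])
    while dq:
        node = dq.popleft()
        for weight, neighbor in graph.get(node, []):
            new_dist = dist[node] + weight
            if new_dist < dist.get(neighbor, INF):
                dist[neighbor] = new_dist
                if weight == 0:
                    dq.appendleft(neighbor)  # same level as current node
                else:
                    dq.append(neighbor)      # one level further out
    return dist

# A -> B is free, so B sits at distance 0; C costs 1 either way
print(zero_one_bfs({'A': [(0, 'B'), (1, 'C')], 'B': [(1, 'C')]}, 'A'))
```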
Dijkstra with Min-Heap
Dijkstra finds the shortest path from a source to all other nodes in a weighted graph with non-negative weights. Time complexity: O((V + E) log V) with a binary heap.
import heapq

def dijkstra(graph, start):
    """
    graph: {node: [(cost, neighbor), ...]}
    Returns: dist dict with shortest distances from start
    """
    dist = {start: 0}
    heap = [(0, start)]  # (cost, node)
    while heap:
        cost, node = heapq.heappop(heap)
        # Lazy deletion: skip if we have already found a better path
        if cost > dist.get(node, float('inf')):
            continue
        for edge_cost, neighbor in graph[node]:
            new_cost = cost + edge_cost
            if new_cost < dist.get(neighbor, float('inf')):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist

def dijkstra_with_path(graph, start, end):
    """Same as above, but also reconstructs the path to end via a prev map"""
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float('inf')):
            continue
        for edge_cost, neighbor in graph[node]:
            new_cost = cost + edge_cost
            if new_cost < dist.get(neighbor, float('inf')):
                dist[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    if end not in dist:
        return float('inf'), []  # unreachable
    # Reconstruct path by walking prev pointers back from end
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = prev.get(node)
    return dist[end], path[::-1]
The lazy deletion trick: instead of updating a node’s priority in the heap (complex), push a new entry and skip stale ones when popped. This makes the implementation much simpler at the cost of some extra heap entries.
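To see lazy deletion in action, here is a compact run; the function repeats the implementation above so the snippet is standalone:

```python
import heapq

def dijkstra(graph, start):
    # Lazy-deletion Dijkstra, repeated here so the example runs on its own
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float('inf')):
            continue  # stale entry: a shorter path was already found
        for edge_cost, neighbor in graph.get(node, []):
            new_cost = cost + edge_cost
            if new_cost < dist.get(neighbor, float('inf')):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist

# 'B' is pushed twice: at cost 4 (direct from A) and at cost 3 (via C).
# The stale (4, 'B') entry is simply skipped when popped.
graph = {'A': [(4, 'B'), (1, 'C')], 'C': [(2, 'B')], 'B': []}
print(dijkstra(graph, 'A'))  # {'A': 0, 'B': 3, 'C': 1}
```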
Critical constraint: Dijkstra FAILS with negative edge weights. A single negative edge can cause it to finalize a node with a non-optimal distance.
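A small sketch of the failure mode, using the textbook "settled set" formulation (the function name and the tiny graph are illustrative):

```python
import heapq

def dijkstra_settled(graph, start):
    """Textbook Dijkstra: a node's distance is final once it is popped."""
    dist = {start: 0}
    settled = set()
    heap = [(0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node in settled:
            continue
        settled.add(node)  # greedy finalization -- wrong with negative edges
        for edge_cost, neighbor in graph.get(node, []):
            new_cost = cost + edge_cost
            if neighbor not in settled and new_cost < dist.get(neighbor, float('inf')):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist

# True shortest A->B is A->C->B = 5 + (-4) = 1, but B is settled at 2 first
graph = {'A': [(2, 'B'), (5, 'C')], 'C': [(-4, 'B')]}
print(dijkstra_settled(graph, 'A')['B'])  # 2, not the true answer 1
```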
Bellman-Ford
Bellman-Ford handles negative edge weights and can detect negative cycles. Time complexity: O(V * E).
def bellman_ford(edges, num_nodes, start):
    """
    edges: [(cost, u, v), ...]
    Returns: (dist dict, has_negative_cycle bool)
    """
    dist = {i: float('inf') for i in range(num_nodes)}
    dist[start] = 0
    # Relax all edges V-1 times
    for _ in range(num_nodes - 1):
        updated = False
        for cost, u, v in edges:
            if dist[u] != float('inf') and dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                updated = True
        if not updated:
            break  # early termination
    # Check for negative cycles: if any edge can still be relaxed, cycle exists
    has_negative_cycle = False
    for cost, u, v in edges:
        if dist[u] != float('inf') and dist[u] + cost < dist[v]:
            has_negative_cycle = True
            break
    return dist, has_negative_cycle
Why V-1 relaxations? A shortest path in a graph with V nodes uses at most V-1 edges. After the i-th pass, every shortest path that uses at most i edges has been computed correctly, so V-1 passes cover all of them.
Negative cycle detection: After V-1 passes, any edge that can still be relaxed is part of (or reachable from) a negative cycle, since a shortest path cannot get shorter indefinitely.
When to use: when the graph has negative edge weights but you need to detect negative cycles. Also used in currency arbitrage detection.
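The arbitrage use case can be sketched as follows. The exchange rates are made-up numbers, and the function repeats the implementation above so the snippet runs standalone:

```python
import math

def bellman_ford(edges, num_nodes, start):
    INF = float('inf')
    dist = {i: INF for i in range(num_nodes)}
    dist[start] = 0
    for _ in range(num_nodes - 1):
        for cost, u, v in edges:
            if dist[u] != INF and dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
    # One more pass: any relaxable edge means a negative cycle
    for cost, u, v in edges:
        if dist[u] != INF and dist[u] + cost < dist[v]:
            return dist, True
    return dist, False

# Currency arbitrage: weight each edge -log(rate). A cycle whose rates
# multiply to more than 1 becomes a negative cycle in log space.
# Hypothetical rates: USD->EUR 0.9, EUR->GBP 0.9, GBP->USD 1.3
rates = [(0, 1, 0.9), (1, 2, 0.9), (2, 0, 1.3)]  # 0.9 * 0.9 * 1.3 > 1
edges = [(-math.log(r), u, v) for u, v, r in rates]
dist, has_cycle = bellman_ford(edges, 3, 0)
print(has_cycle)  # True: a profitable arbitrage loop exists
```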
Floyd-Warshall
Floyd-Warshall computes shortest paths between ALL pairs of nodes in O(V^3) time and O(V^2) space. It uses dynamic programming.
def floyd_warshall(n, edges):
    """
    n: number of nodes (0-indexed)
    edges: [(u, v, cost), ...]
    Returns: dist[i][j] = shortest path from i to j
    """
    INF = float('inf')
    dist = [[INF] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0
    for u, v, cost in edges:
        dist[u][v] = min(dist[u][v], cost)
        # For undirected graphs: dist[v][u] = min(dist[v][u], cost)
    # DP recurrence: dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
    for k in range(n):          # intermediate node
        for i in range(n):      # source
            for j in range(n):  # destination
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    # Detect negative cycles: if dist[i][i] < 0, node i is in a negative cycle
    has_neg_cycle = any(dist[i][i] < 0 for i in range(n))
    return dist, has_neg_cycle
The recurrence reads as: “what is the shortest path from i to j using only nodes 0..k as intermediaries?” Each iteration of the outer loop admits node k as a new potential intermediate.
When to use: small dense graphs (V <= 400 typically), all-pairs queries, or when the simplicity of the algorithm is worth the O(V^3) cost.
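A quick worked example on a 4-node directed graph (the DP is repeated without the cycle check so the snippet runs standalone):

```python
def floyd_warshall(n, edges):
    # All-pairs DP as above, inlined for a standalone demo
    INF = float('inf')
    dist = [[INF] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0
    for u, v, cost in edges:
        dist[u][v] = min(dist[u][v], cost)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

# The chain 0->1->2->3 (cost 3+1+2 = 6) beats the direct edge 0->3 (cost 10)
edges = [(0, 1, 3), (1, 2, 1), (2, 3, 2), (0, 3, 10)]
dist = floyd_warshall(4, edges)
print(dist[0][3])  # 6
print(dist[0][2])  # 4
```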
Topological Sort for DAGs
For Directed Acyclic Graphs (DAGs), dynamic programming over a topological order finds shortest (and longest) paths in O(V + E) – faster than Dijkstra and without the non-negative constraint.
from collections import deque

def dag_shortest_path(graph, weights, start, n):
    """
    graph: {u: [v, ...]} adjacency list
    weights: {(u, v): cost} edge weights
    Returns dist from start to all nodes
    """
    # Topological sort (Kahn's algorithm)
    in_degree = [0] * n
    for u in graph:
        for v in graph[u]:
            in_degree[v] += 1
    queue = deque(i for i in range(n) if in_degree[i] == 0)
    topo = []
    while queue:
        u = queue.popleft()
        topo.append(u)
        for v in graph.get(u, []):
            in_degree[v] -= 1
            if in_degree[v] == 0:
                queue.append(v)
    # DP in topological order
    dist = [float('inf')] * n
    dist[start] = 0
    for u in topo:
        if dist[u] == float('inf'):
            continue
        for v in graph.get(u, []):
            w = weights.get((u, v), 1)
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist
For longest path (critical path in project scheduling), initialize to negative infinity and use max instead of min. No need to worry about negative cycles in a DAG.
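The longest-path variant can be sketched like this (`dag_longest_path` is an illustrative name; only the initialization and comparison change from the shortest-path version above):

```python
from collections import deque

def dag_longest_path(graph, weights, start, n):
    """Longest path in a DAG: -inf initialization, max-relaxation."""
    # Kahn's algorithm, as in the shortest-path version
    in_degree = [0] * n
    for u in graph:
        for v in graph[u]:
            in_degree[v] += 1
    queue = deque(i for i in range(n) if in_degree[i] == 0)
    topo = []
    while queue:
        u = queue.popleft()
        topo.append(u)
        for v in graph.get(u, []):
            in_degree[v] -= 1
            if in_degree[v] == 0:
                queue.append(v)
    NEG_INF = float('-inf')
    dist = [NEG_INF] * n
    dist[start] = 0
    for u in topo:
        if dist[u] == NEG_INF:
            continue
        for v in graph.get(u, []):
            w = weights.get((u, v), 1)
            if dist[u] + w > dist[v]:  # max instead of min
                dist[v] = dist[u] + w
    return dist

# Project-scheduling flavor: the critical path to task 3 takes 6 time units
graph = {0: [1, 2], 1: [3], 2: [3]}
weights = {(0, 1): 2, (0, 2): 5, (1, 3): 4, (2, 3): 1}
print(dag_longest_path(graph, weights, 0, 4)[3])  # max(2+4, 5+1) = 6
```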
A* Heuristic Search
A* is Dijkstra guided by a heuristic. It prioritizes nodes that are likely closer to the goal, reducing explored nodes in practice.
import heapq
import math

def astar(graph, start, end, heuristic):
    """
    heuristic(node, end) must be admissible: never overestimates true cost
    f = g + h, where g = actual cost from start, h = heuristic to end
    """
    g = {start: 0}
    heap = [(heuristic(start, end), 0, start)]  # (f, g, node)
    while heap:
        f, g_cost, node = heapq.heappop(heap)
        if node == end:
            return g_cost
        if g_cost > g.get(node, float('inf')):
            continue
        for edge_cost, neighbor in graph[node]:
            new_g = g_cost + edge_cost
            if new_g < g.get(neighbor, float('inf')):
                g[neighbor] = new_g
                h = heuristic(neighbor, end)
                heapq.heappush(heap, (new_g + h, new_g, neighbor))
    return float('inf')  # unreachable

# For grid graphs: Manhattan distance heuristic
def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

# For geographic maps: Haversine distance (straight-line geographic distance)
def haversine(coord1, coord2):
    lat1, lon1 = math.radians(coord1[0]), math.radians(coord1[1])
    lat2, lon2 = math.radians(coord2[0]), math.radians(coord2[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = math.sin(dlat/2)**2 + math.cos(lat1)*math.cos(lat2)*math.sin(dlon/2)**2
    return 6371 * 2 * math.asin(math.sqrt(a))  # km
An admissible heuristic never overestimates the true cost. A consistent (monotone) heuristic additionally satisfies h(n) <= cost(n, n') + h(n') for all edges, which guarantees A* never re-expands a node.
When A* beats Dijkstra: in large sparse graphs where the goal is a specific target and a good heuristic is available (maps, game pathfinding). For general all-nodes shortest path, use Dijkstra.
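A standalone sketch of A* with the Manhattan heuristic on a small grid; `astar_grid` and the blocked-cell representation are assumptions for this demo. Manhattan distance is admissible here because every move costs 1 and changes one coordinate by 1:

```python
import heapq

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def astar_grid(blocked, size, start, end):
    """A* on a size x size grid with unit moves; blocked cells are impassable."""
    INF = float('inf')
    g = {start: 0}
    heap = [(manhattan(start, end), 0, start)]  # (f, g, cell)
    while heap:
        f, g_cost, cell = heapq.heappop(heap)
        if cell == end:
            return g_cost
        if g_cost > g.get(cell, INF):
            continue
        x, y = cell
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < size and 0 <= ny < size and (nx, ny) not in blocked:
                new_g = g_cost + 1
                if new_g < g.get((nx, ny), INF):
                    g[(nx, ny)] = new_g
                    f_score = new_g + manhattan((nx, ny), end)
                    heapq.heappush(heap, (f_score, new_g, (nx, ny)))
    return INF  # unreachable

# A wall across the middle forces a detour: 8 moves instead of 4
blocked = {(1, 2), (2, 2), (3, 2)}
print(astar_grid(blocked, 5, (2, 0), (2, 4)))  # 8
```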
Bidirectional Dijkstra
Bidirectional Dijkstra runs two simultaneous searches – one from the source forward and one from the destination backward. They meet in the middle, reducing the search space significantly.
- Forward search: from the source, expanding outward
- Backward search: from the destination, expanding inward (on the reversed graph)
- Termination: once a node has been settled by both searches (plus a check that no shorter crossing remains)
- Best path: min over all nodes n of (dist_fwd[n] + dist_bwd[n])
- Expected speedup: roughly 2x on uniform graphs (each search covers half the radius)
- Real-world speedup: much greater with A* heuristics (used in Google Maps, OSM routing)
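The idea can be sketched as follows; `bidirectional_dijkstra` and its two-adjacency-list signature are my assumptions for the demo, not a standard API:

```python
import heapq

def bidirectional_dijkstra(graph, rgraph, start, end):
    """
    graph: forward adjacency {u: [(cost, v), ...]}
    rgraph: the same edges reversed, for the backward search
    """
    if start == end:
        return 0
    INF = float('inf')
    dist = [{start: 0}, {end: 0}]        # index 0 = forward, 1 = backward
    heaps = [[(0, start)], [(0, end)]]
    adj = [graph, rgraph]
    best = INF                           # best meeting-point distance so far
    while heaps[0] and heaps[1]:
        # Stopping rule: once the two frontier radii together reach best,
        # no shorter crossing path can exist
        if heaps[0][0][0] + heaps[1][0][0] >= best:
            break
        side = 0 if heaps[0][0][0] <= heaps[1][0][0] else 1
        cost, node = heapq.heappop(heaps[side])
        if cost > dist[side].get(node, INF):
            continue  # lazy deletion, as in plain Dijkstra
        for edge_cost, nxt in adj[side].get(node, []):
            new_cost = cost + edge_cost
            if new_cost < dist[side].get(nxt, INF):
                dist[side][nxt] = new_cost
                heapq.heappush(heaps[side], (new_cost, nxt))
            if nxt in dist[1 - side]:    # the searches have met at nxt
                best = min(best, dist[side][nxt] + dist[1 - side][nxt])
    return best

graph = {'A': [(1, 'B'), (5, 'C')], 'B': [(1, 'C')]}
rgraph = {'B': [(1, 'A')], 'C': [(1, 'B'), (5, 'A')]}
print(bidirectional_dijkstra(graph, rgraph, 'A', 'C'))  # 2 (A->B->C)
```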
Common Interview Questions
Q: Can Dijkstra detect a negative cycle?
No. Dijkstra may loop infinitely or produce wrong results with negative edges. Use Bellman-Ford for negative cycle detection.
Q: What if the graph has negative edges but guaranteed no negative cycles?
Use Bellman-Ford. Dijkstra still fails on negative edges even without cycles – it finalizes nodes greedily assuming the current shortest path cannot improve, which is wrong with negative edges.
Q: Why does BFS guarantee shortest path in unweighted graphs?
BFS explores nodes in non-decreasing order of hop count. A node is first reached at its minimum hop distance. Adding weights breaks this property unless all weights are equal.
Q: When is Floyd-Warshall better than running Dijkstra from every node?
Floyd-Warshall: O(V^3), simple implementation, handles negative weights (detects negative cycles). Dijkstra from every node: O(V * (V+E) log V), faster for sparse graphs, requires non-negative weights. For dense graphs (E ~ V^2), Floyd-Warshall is comparable and simpler.
Algorithm Selection Guide
| Graph Type | Goal | Algorithm | Complexity |
|---|---|---|---|
| Unweighted | Single source | BFS | O(V+E) |
| Non-negative weights | Single source | Dijkstra | O((V+E) log V) |
| Non-negative weights, known target | Single target | A* | O((V+E) log V) worst case, typically explores far fewer nodes |
| Negative weights, no negative cycle | Single source | Bellman-Ford | O(VE) |
| Any weights | All-pairs | Floyd-Warshall | O(V^3) |
| DAG (any weights) | Single source | Topo sort + DP | O(V+E) |
| Any – detect negative cycle | Cycle detection | Bellman-Ford | O(VE) |
Key LeetCode Problems
- 743 – Network Delay Time: classic Dijkstra single-source, find max distance to all nodes.
- 787 – Cheapest Flights Within K Stops: Bellman-Ford variant with K relaxation rounds, or modified Dijkstra with (cost, node, stops) state.
- 1334 – Find the City With the Smallest Number of Neighbors at a Threshold Distance: Floyd-Warshall all-pairs, then count reachable cities per node.
- 1514 – Path with Maximum Probability: Dijkstra variant maximizing product of probabilities (negate log for additive weights).
- 1631 – Path With Minimum Effort: Dijkstra on grid where edge cost is the absolute height difference.
- 778 – Swim in Rising Water: Binary search on time + BFS, or Dijkstra minimizing the maximum edge weight on path.
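The Bellman-Ford-with-K-rounds idea behind problem 787 might look like this sketch; the function name and the snapshot-per-round detail (relaxing against the previous round's values so each round adds at most one edge) are my choices:

```python
def cheapest_flights(n, flights, src, dst, k):
    """flights: [(u, v, cost), ...]; at most k stops = at most k+1 edges."""
    INF = float('inf')
    dist = [INF] * n
    dist[src] = 0
    for _ in range(k + 1):
        new_dist = dist[:]  # snapshot: relax against last round's values only
        for u, v, cost in flights:
            if dist[u] != INF and dist[u] + cost < new_dist[v]:
                new_dist[v] = dist[u] + cost
        dist = new_dist
    return dist[dst] if dist[dst] != INF else -1

flights = [(0, 1, 100), (1, 2, 100), (0, 2, 500)]
print(cheapest_flights(3, flights, 0, 2, 1))  # 200 via the stop at node 1
print(cheapest_flights(3, flights, 0, 2, 0))  # 500, direct flight only
```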
The most common interview mistake is applying Dijkstra to a graph with negative weights. Always check the constraints. If edges can be negative, reach for Bellman-Ford or ask whether negative cycles are possible.