What Is Topological Sort?
A topological sort of a directed acyclic graph (DAG) produces a linear ordering of vertices such that for every directed edge u → v, u appears before v in the ordering. It is only possible when the graph has no cycles (hence “acyclic”). Applications: course scheduling (take prerequisites before a course), build system dependency resolution (compile dependencies before dependents), task scheduling, and package manager dependency ordering (npm, pip).
Kahn’s Algorithm (BFS-based)
Kahn’s algorithm uses in-degree (number of incoming edges) to iteratively remove nodes with no dependencies:
- Compute in-degree for each node.
- Add all nodes with in-degree = 0 to a queue (no prerequisites).
- While queue is not empty: dequeue a node, add to result. For each neighbor of the dequeued node: decrement their in-degree. If any neighbor’s in-degree becomes 0, enqueue it.
- If result length equals total nodes: valid topological order. Else: the graph has a cycle.
from collections import deque, defaultdict
def topological_sort(n, edges):
graph = defaultdict(list)
in_degree = [0] * n
for u, v in edges:
graph[u].append(v)
in_degree[v] += 1
queue = deque(i for i in range(n) if in_degree[i] == 0)
result = []
while queue:
node = queue.popleft()
result.append(node)
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
return result if len(result) == n else [] # empty = cycle detected
Time O(V + E), space O(V + E).
DFS-based Topological Sort
Run DFS on the graph. A node is added to the result stack after all its descendants are visited (post-order). The final result is the reversed stack.
def dfs_topo(n, graph):
visited = [0] * n # 0=unvisited, 1=in-progress, 2=done
result = []
has_cycle = [False]
def dfs(node):
if visited[node] == 1:
has_cycle[0] = True; return
if visited[node] == 2: return
visited[node] = 1
for neighbor in graph[node]:
dfs(neighbor)
visited[node] = 2
result.append(node)
for i in range(n):
if visited[i] == 0:
dfs(i)
return [] if has_cycle[0] else result[::-1]
Three states: 0 (not visited), 1 (currently in DFS stack = in progress), 2 (fully explored). A cycle is detected when a node in state 1 is visited again (a back edge).
Classic Interview Problems
Course Schedule (LC 207): given n courses and prerequisites, determine if all courses can be finished (i.e., no cycle in the prerequisite graph). Build directed graph from prerequisites and run Kahn’s — return True if topological sort length = n.
Course Schedule II (LC 210): return the actual order to take courses. Return the topological sort result from Kahn’s (or DFS).
Alien Dictionary (LC 269): given a list of words sorted in an alien language, find the character ordering. For each adjacent pair of words, find the first differing character — that gives a directed edge (char_in_word1 → char_in_word2). Run topological sort on the resulting graph. If a cycle exists, the ordering is invalid. Edge case: if word1 is longer than word2 and word2 is a prefix of word1, the ordering is invalid (e.g., [“abc”, “ab”] is contradictory).
Parallel Courses III (LC 2050): find the minimum number of semesters to complete all courses, where each semester you can take any courses whose prerequisites are complete. This is the longest path in the DAG. Kahn’s with a twist: process layer by layer (like BFS levels). Each layer = one semester. Track the earliest semester each course can be taken.
Find All Possible Recipes (LC 2115): given recipes with ingredients (some of which are other recipes), find which recipes can be made. Model as a directed graph: ingredient → recipe that needs it. Apply topological sort. A recipe is creatable if all its ingredient nodes have been processed (either raw ingredients or creatable recipes).
Cycle Detection in Directed Graphs
Undirected graph cycle detection: union-find or DFS with parent tracking. Directed graph cycle detection (harder): must detect back edges (edges to ancestors in the current DFS path). Three-color DFS (states 0/1/2 as above) correctly identifies directed cycles. Alternative: Kahn’s algorithm — if topological sort doesn’t include all nodes, a cycle exists among the excluded nodes.
Strongly Connected Components (SCCs)
An SCC is a maximal subgraph where every node is reachable from every other. Kosaraju’s algorithm: (1) Run DFS on original graph, push nodes to a stack in post-order (finish order). (2) Transpose the graph (reverse all edges). (3) Pop from stack and run DFS on transposed graph — each DFS traversal from an unvisited node finds one SCC. Time O(V + E). SCCs partition a directed graph — condensation DAG of SCCs is always a DAG, suitable for topological sort.
Interview Patterns
- “Can all X be completed/ordered?” → topological sort + cycle detection
- “What order should X be done?” → topological sort (return the order)
- “How many steps/levels until all done?” → BFS topological sort, count levels
- “Minimum time if parallel processing allowed?” → longest path in DAG using topological sort order
- Detecting circular imports/dependencies → directed graph cycle detection