Topological sort is the graph algorithm you need for scheduling problems: course prerequisites, build systems, task dependencies, and package managers. It only works on Directed Acyclic Graphs (DAGs) and there are two main approaches — each with different strengths.
What Is Topological Sort?
Given a DAG, produce a linear ordering of vertices such that for every directed edge (u → v), vertex u comes before v in the ordering.
Example: if course A is a prerequisite for course B, A must come before B in the schedule.
Approach 1: DFS Post-Order (Depth-First)
def topological_sort_dfs(graph: dict) -> list | None:
"""
DFS-based topological sort.
A node goes into the result AFTER all its dependencies are processed (post-order).
Reverse the result to get the topological order.
Returns None if graph has a cycle (no valid topological order exists).
Time: O(V + E), Space: O(V)
"""
visited = set()
in_stack = set() # Detect cycles (nodes currently in DFS path)
result = []
def dfs(node):
if node in in_stack:
return False # Back edge = cycle
if node in visited:
return True # Already processed
in_stack.add(node)
for neighbor in graph.get(node, []):
if not dfs(neighbor):
return False
in_stack.remove(node)
visited.add(node)
result.append(node) # Post-order: append after processing all neighbors
return True
for node in graph:
if node not in visited:
if not dfs(node):
return None # Cycle detected
return result[::-1] # Reverse post-order = topological order
# Example: Course prerequisites
graph = {
'algorithms': ['data structures'],
'data structures': ['programming basics'],
'programming basics': [],
'operating systems': ['programming basics'],
'compilers': ['algorithms', 'data structures']
}
order = topological_sort_dfs(graph)
print(order) # ['programming basics', 'data structures', 'algorithms', 'operating systems', 'compilers']
# (exact order among siblings may vary)
Approach 2: Kahn’s Algorithm (BFS / Indegree)
from collections import deque
def topological_sort_kahn(n: int, edges: list[tuple]) -> list | None:
"""
Kahn's algorithm: process nodes with indegree=0 first (no dependencies).
Returns None if a cycle exists (not all nodes get processed).
Time: O(V + E), Space: O(V)
"""
graph = [[] for _ in range(n)]
indegree = [0] * n
for u, v in edges:
graph[u].append(v)
indegree[v] += 1
# Start with all nodes that have no prerequisites
queue = deque([i for i in range(n) if indegree[i] == 0])
result = []
while queue:
node = queue.popleft()
result.append(node)
for neighbor in graph[node]:
indegree[neighbor] -= 1
if indegree[neighbor] == 0:
queue.append(neighbor)
# If result doesn't contain all nodes, there's a cycle
if len(result) != n:
return None
return result
# LeetCode 210: Course Schedule II
def find_order(num_courses: int, prerequisites: list[list[int]]) -> list[int]:
"""Return a valid course ordering, or [] if impossible (cycle)."""
edges = [(pre, course) for course, pre in prerequisites]
result = topological_sort_kahn(num_courses, edges)
return result if result is not None else []
Which Approach to Use?
| Criterion | DFS (Post-order) | Kahn’s (BFS) |
|---|---|---|
| Cycle detection | Natural (in_stack set) | Natural (check len(result) == n) |
| Parallelism hint | No | Yes — queue contains all nodes safe to process in parallel |
| Lexicographic order | Hard | Easy — use min-heap instead of deque |
| Multiple valid orderings | Returns one | Both return one; min-heap gives lexicographic smallest |
| Recursion limit concern | Yes (use iterative DFS) | No (always iterative) |
Minimum Time to Complete All Tasks (Parallel Execution)
Kahn’s algorithm reveals which tasks can be done in parallel — the nodes in the queue at any moment are all executable simultaneously:
def min_time_parallel(n: int, edges: list[tuple], time: list[int]) -> int:
"""
Find minimum time to complete all tasks when tasks can run in parallel.
time[i] = duration of task i.
Constraint: task v cannot start until task u completes (for each edge u->v).
"""
graph = [[] for _ in range(n)]
indegree = [0] * n
for u, v in edges:
graph[u].append(v)
indegree[v] += 1
# finish_time[i] = earliest time task i can complete
finish_time = list(time)
queue = deque([i for i in range(n) if indegree[i] == 0])
while queue:
node = queue.popleft()
for neighbor in graph[node]:
# Neighbor can't start until node finishes
finish_time[neighbor] = max(
finish_time[neighbor],
finish_time[node] + time[neighbor]
)
indegree[neighbor] -= 1
if indegree[neighbor] == 0:
queue.append(neighbor)
return max(finish_time)
Practice Problems
- LeetCode 207: Course Schedule (just detect cycle)
- LeetCode 210: Course Schedule II (return ordering)
- LeetCode 310: Minimum Height Trees (find topological roots)
- LeetCode 1136: Parallel Courses (minimum semesters)
- LeetCode 329: Longest Increasing Path in a Matrix (topo sort + DP)
Related Graph Topics
- Detect a Cycle in a Graph — topological sort requires a DAG; use cycle detection (DFS three-color or Kahn’s node count check) to validate before sorting
- BFS vs DFS: When to Use Each — Kahn’s algorithm is BFS; DFS post-order is the other approach; same graph traversal primitives, different application