Coding Interview: Topological Sort Deep Dive — Kahn Algorithm, DFS, Course Schedule, Alien Dictionary, Build Order

Topological sort orders vertices in a directed acyclic graph (DAG) such that for every edge (u, v), u appears before v. It models dependency ordering: course prerequisites, build systems, task scheduling, and package installation. Topological sort appears in approximately 5-8% of graph problems at top tech companies. This guide covers both algorithms (Kahn BFS and DFS), cycle detection, and the key interview problems.

Kahn Algorithm (BFS-Based)

Kahn algorithm uses in-degree counting and BFS. Steps: (1) Compute the in-degree (number of incoming edges) for every node. (2) Add all nodes with in-degree 0 to a queue (they have no dependencies). (3) Process the queue: dequeue a node, add it to the result. For each neighbor of the dequeued node, decrement its in-degree. If in-degree reaches 0, add it to the queue. (4) Repeat until the queue is empty. If the result contains fewer than V nodes, the graph has a cycle (some nodes could never reach in-degree 0). Time: O(V + E). Space: O(V) for the in-degree array and queue. Why it works: a node with in-degree 0 has all its dependencies satisfied. Processing it and decrementing neighbor in-degrees simulates “completing” the task. Neighbors whose in-degree reaches 0 now have all their dependencies satisfied and can be processed. The algorithm processes nodes layer by layer, like peeling an onion from the outside.

DFS-Based Topological Sort

DFS approach: perform DFS on the graph. When a node has no more unvisited neighbors (post-order), push it onto a stack. The stack, read from top to bottom, gives the topological order. Steps: (1) For each unvisited node, run DFS. (2) In DFS: mark the node as “visiting” (in progress). Recurse on all unvisited neighbors. After all neighbors are processed, mark as “visited” and push to the result stack. (3) The result stack (reversed) is the topological order. Cycle detection: if during DFS you encounter a node that is “visiting” (in the current recursion stack), you have found a back edge — the graph has a cycle. No valid topological order exists. Three-state DFS: unvisited (white), visiting (gray, in current DFS path), visited (black, fully processed). A gray-to-gray edge is a cycle. Time: O(V + E). Space: O(V) for the visited set, recursion stack, and result. Kahn vs DFS: both produce valid topological orders (there may be multiple valid orders). Kahn is iterative (no recursion stack overflow risk). DFS is recursive (more natural for some problems). Kahn naturally detects cycles (result size < V). DFS detects cycles via back edges.

Course Schedule Problems

Course Schedule I: given N courses and prerequisites, determine if all courses can be finished (no circular dependency). This is cycle detection in a directed graph. Build the graph from prerequisites. Run Kahn algorithm. If the result has fewer than N courses, there is a cycle (impossible). Or run DFS with three-state cycle detection. Course Schedule II: return a valid course order (any topological sort). Run Kahn algorithm and return the result. If result length < N, return empty (impossible). Course Schedule III: given courses with deadlines and durations, find the maximum number of courses you can take. This is a greedy + heap problem (not topological sort): sort by deadline, use a max-heap of durations. If adding a course exceeds its deadline, replace the longest course in the heap if the current course is shorter. Parallel courses: find the minimum number of semesters to complete all courses (you can take multiple courses per semester if dependencies are satisfied). This is the longest path in the DAG — the number of layers in the topological sort (BFS-level count). Kahn algorithm naturally provides this: each BFS level is one semester.

Alien Dictionary

Given a sorted list of words in an alien language, determine the character ordering. This is topological sort on character precedence. Algorithm: (1) Compare adjacent words to extract ordering constraints. For words “wrt” and “wrf”: compare character by character until a difference: t != f, so t comes before f. Add edge t -> f. (2) Build a directed graph from all extracted edges. (3) Topological sort the graph. The result is the character order. Edge cases: if a longer word appears before its prefix (“abc” before “ab”), the ordering is invalid (return empty). If the graph has a cycle, the ordering is invalid. If multiple valid orders exist (disconnected components), return any valid one. This problem combines string comparison with graph construction and topological sort — a multi-step problem that tests both string manipulation and graph algorithm skills.

Build Order and Task Scheduling

Build order: given a list of projects and dependencies, find a valid build order. Direct application of topological sort. If a cycle exists, no valid build order exists. Task scheduling with parallel execution: given tasks with dependencies and execution times, find the minimum completion time. This is the critical path: the longest path in the DAG weighted by task duration. Algorithm: topological sort. For each node in topological order: earliest_start[node] = max(earliest_finish[predecessor]) for all predecessors. earliest_finish[node] = earliest_start[node] + duration[node]. The minimum completion time is max(earliest_finish) across all nodes. Compilation order: a compiler must compile source files in dependency order. Files with no dependencies are compiled first (in parallel). Files whose dependencies are all compiled are compiled next. This is exactly Kahn algorithm: each BFS level represents files that can be compiled in parallel. The number of levels is the minimum number of compilation rounds. Pattern recognition: whenever you see “order items respecting dependencies,” “find if an ordering is possible,” or “minimum steps to complete dependent tasks,” think topological sort. The graph is built from the dependency relationships.

Scroll to Top