Coding Interview: Topological Sort Deep Dive — Kahn Algorithm, DFS, Course Schedule, Alien Dictionary, Build Order

Topological sort orders vertices in a directed acyclic graph (DAG) such that for every edge (u, v), u appears before v. It models dependency ordering: course prerequisites, build systems, task scheduling, and package installation. Topological sort appears in approximately 5-8% of graph problems at top tech companies. This guide covers both algorithms (Kahn BFS and DFS), cycle detection, and the key interview problems.

Kahn Algorithm (BFS-Based)

Kahn algorithm uses in-degree counting and BFS. Steps: (1) Compute the in-degree (number of incoming edges) for every node. (2) Add all nodes with in-degree 0 to a queue (they have no dependencies). (3) Process the queue: dequeue a node, add it to the result. For each neighbor of the dequeued node, decrement its in-degree. If in-degree reaches 0, add it to the queue. (4) Repeat until the queue is empty. If the result contains fewer than V nodes, the graph has a cycle (some nodes could never reach in-degree 0). Time: O(V + E). Space: O(V) for the in-degree array and queue. Why it works: a node with in-degree 0 has all its dependencies satisfied. Processing it and decrementing neighbor in-degrees simulates “completing” the task. Neighbors whose in-degree reaches 0 now have all their dependencies satisfied and can be processed. The algorithm processes nodes layer by layer, like peeling an onion from the outside.

DFS-Based Topological Sort

DFS approach: perform DFS on the graph. When a node has no more unvisited neighbors (post-order), push it onto a stack. The stack, read from top to bottom, gives the topological order. Steps: (1) For each unvisited node, run DFS. (2) In DFS: mark the node as “visiting” (in progress). Recurse on all unvisited neighbors. After all neighbors are processed, mark as “visited” and push to the result stack. (3) The result stack (reversed) is the topological order. Cycle detection: if during DFS you encounter a node that is “visiting” (in the current recursion stack), you have found a back edge — the graph has a cycle. No valid topological order exists. Three-state DFS: unvisited (white), visiting (gray, in current DFS path), visited (black, fully processed). A gray-to-gray edge is a cycle. Time: O(V + E). Space: O(V) for the visited set, recursion stack, and result. Kahn vs DFS: both produce valid topological orders (there may be multiple valid orders). Kahn is iterative (no recursion stack overflow risk). DFS is recursive (more natural for some problems). Kahn naturally detects cycles (result size < V). DFS detects cycles via back edges.

Course Schedule Problems

Course Schedule I: given N courses and prerequisites, determine if all courses can be finished (no circular dependency). This is cycle detection in a directed graph. Build the graph from prerequisites. Run Kahn algorithm. If the result has fewer than N courses, there is a cycle (impossible). Or run DFS with three-state cycle detection. Course Schedule II: return a valid course order (any topological sort). Run Kahn algorithm and return the result. If result length < N, return empty (impossible). Course Schedule III: given courses with deadlines and durations, find the maximum number of courses you can take. This is a greedy + heap problem (not topological sort): sort by deadline, use a max-heap of durations. If adding a course exceeds its deadline, replace the longest course in the heap if the current course is shorter. Parallel courses: find the minimum number of semesters to complete all courses (you can take multiple courses per semester if dependencies are satisfied). This is the longest path in the DAG — the number of layers in the topological sort (BFS-level count). Kahn algorithm naturally provides this: each BFS level is one semester.

Alien Dictionary

Given a sorted list of words in an alien language, determine the character ordering. This is topological sort on character precedence. Algorithm: (1) Compare adjacent words to extract ordering constraints. For words “wrt” and “wrf”: compare character by character until a difference: t != f, so t comes before f. Add edge t -> f. (2) Build a directed graph from all extracted edges. (3) Topological sort the graph. The result is the character order. Edge cases: if a longer word appears before its prefix (“abc” before “ab”), the ordering is invalid (return empty). If the graph has a cycle, the ordering is invalid. If multiple valid orders exist (disconnected components), return any valid one. This problem combines string comparison with graph construction and topological sort — a multi-step problem that tests both string manipulation and graph algorithm skills.

Build Order and Task Scheduling

Build order: given a list of projects and dependencies, find a valid build order. Direct application of topological sort. If a cycle exists, no valid build order exists. Task scheduling with parallel execution: given tasks with dependencies and execution times, find the minimum completion time. This is the critical path: the longest path in the DAG weighted by task duration. Algorithm: topological sort. For each node in topological order: earliest_start[node] = max(earliest_finish[predecessor]) for all predecessors. earliest_finish[node] = earliest_start[node] + duration[node]. The minimum completion time is max(earliest_finish) across all nodes. Compilation order: a compiler must compile source files in dependency order. Files with no dependencies are compiled first (in parallel). Files whose dependencies are all compiled are compiled next. This is exactly Kahn algorithm: each BFS level represents files that can be compiled in parallel. The number of levels is the minimum number of compilation rounds. Pattern recognition: whenever you see “order items respecting dependencies,” “find if an ordering is possible,” or “minimum steps to complete dependent tasks,” think topological sort. The graph is built from the dependency relationships.

{“@context”:”https://schema.org”,”@type”:”FAQPage”,”mainEntity”:[{“@type”:”Question”,”name”:”How does Kahn algorithm perform topological sort?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Kahn algorithm uses in-degree counting and BFS: (1) Compute in-degree (incoming edge count) for every node. (2) Add all nodes with in-degree 0 to a queue (no unmet dependencies). (3) Dequeue a node, add to result. For each of its neighbors: decrement their in-degree. If in-degree reaches 0, add to queue. (4) Repeat until queue is empty. If result has fewer than V nodes, the graph has a cycle. Time: O(V+E). Why it works: in-degree 0 means all dependencies are satisfied. Processing a node and decrementing neighbor in-degrees simulates completing that task. The algorithm peels layers from the graph — each layer is nodes whose dependencies are all in previous layers. Cycle detection is built in: if a cycle exists, some nodes never reach in-degree 0 and are never added to the result.”}},{“@type”:”Question”,”name”:”How do you solve the Alien Dictionary problem with topological sort?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Given a sorted word list in an alien language, determine character ordering. Steps: (1) Compare adjacent words to extract precedence constraints. For words wrt and wrf: first differing character is t vs f, so t comes before f in the alien alphabet. Add edge t->f. (2) Build a directed graph from all extracted edges. (3) Run topological sort on the graph. The result is the character order. Edge cases: if a longer word appears before its prefix (abc before ab), the input is invalid. If the graph has a cycle, no valid ordering exists. If multiple valid orders exist (disconnected components), return any. This problem combines string comparison (extracting edges) with graph construction and topological sort — testing multiple skills in one question.”}},{“@type”:”Question”,”name”:”How do you detect cycles during topological sort?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Two approaches: Kahn algorithm: after processing, if the result contains fewer than V nodes, the graph has a cycle. Nodes in the cycle never reach in-degree 0, so they are never processed. Simple: just check result.length < V. DFS three-state approach: mark nodes as unvisited (white), visiting (gray, in current DFS path), and visited (black, fully processed). During DFS: when entering a node, mark gray. When all neighbors are processed, mark black. If you encounter a gray node during DFS (a node already in the current path), you found a back edge — the graph has a cycle. This is more explicit than Kahn for identifying WHICH nodes form the cycle (follow the gray chain). Choose Kahn when you need the topological order AND cycle detection. Choose DFS when you need to identify the specific cycle or when the recursive structure is more natural for the problem."}},{"@type":"Question","name":"What problems are solved by topological sort in coding interviews?","acceptedAnswer":{"@type":"Answer","text":"Core problems: (1) Course Schedule I: can all courses be taken? = Is the graph a DAG? (cycle detection). (2) Course Schedule II: find a valid order = topological sort. (3) Alien Dictionary: extract character precedence from sorted words, topological sort. (4) Build Order: order projects respecting dependencies. (5) Parallel Courses: minimum semesters = longest path in DAG = number of BFS levels in Kahn algorithm. (6) Compile Order: order source files by #include dependencies. Recognition signal: whenever a problem says order items respecting dependencies, find if a valid ordering exists, or minimum steps for dependent tasks — think topological sort. Build the directed graph from the dependency relationships. If the graph might have cycles, detect and handle them. If parallel execution is asked, the answer is the critical path (longest chain of dependencies)."}}]}
Scroll to Top