Topological Sort Interview Patterns: Kahn’s Algorithm, DFS Ordering, Cycle Detection

What Is Topological Sort?

A topological sort of a directed acyclic graph (DAG) produces a linear ordering of vertices such that for every directed edge u → v, u appears before v in the ordering. It is only possible when the graph has no cycles (hence “acyclic”). Applications: course scheduling (take prerequisites before a course), build system dependency resolution (compile dependencies before dependents), task scheduling, and package manager dependency ordering (npm, pip).

Kahn’s Algorithm (BFS-based)

Kahn’s algorithm uses in-degree (number of incoming edges) to iteratively remove nodes with no dependencies:

  1. Compute in-degree for each node.
  2. Add all nodes with in-degree = 0 to a queue (no prerequisites).
  3. While queue is not empty: dequeue a node, add to result. For each neighbor of the dequeued node: decrement their in-degree. If any neighbor’s in-degree becomes 0, enqueue it.
  4. If result length equals total nodes: valid topological order. Else: the graph has a cycle.
from collections import deque, defaultdict

def topological_sort(n, edges):
    graph = defaultdict(list)
    in_degree = [0] * n
    for u, v in edges:
        graph[u].append(v)
        in_degree[v] += 1
    queue = deque(i for i in range(n) if in_degree[i] == 0)
    result = []
    while queue:
        node = queue.popleft()
        result.append(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    return result if len(result) == n else []  # empty = cycle detected

Time O(V + E), space O(V + E).

DFS-based Topological Sort

Run DFS on the graph. A node is added to the result stack after all its descendants are visited (post-order). The final result is the reversed stack.

def dfs_topo(n, graph):
    visited = [0] * n  # 0=unvisited, 1=in-progress, 2=done
    result = []
    has_cycle = [False]

    def dfs(node):
        if visited[node] == 1:
            has_cycle[0] = True; return
        if visited[node] == 2: return
        visited[node] = 1
        for neighbor in graph[node]:
            dfs(neighbor)
        visited[node] = 2
        result.append(node)

    for i in range(n):
        if visited[i] == 0:
            dfs(i)
    return [] if has_cycle[0] else result[::-1]

Three states: 0 (not visited), 1 (currently in DFS stack = in progress), 2 (fully explored). A cycle is detected when a node in state 1 is visited again (a back edge).

Classic Interview Problems

Course Schedule (LC 207): given n courses and prerequisites, determine if all courses can be finished (i.e., no cycle in the prerequisite graph). Build directed graph from prerequisites and run Kahn’s — return True if topological sort length = n.

Course Schedule II (LC 210): return the actual order to take courses. Return the topological sort result from Kahn’s (or DFS).

Alien Dictionary (LC 269): given a list of words sorted in an alien language, find the character ordering. For each adjacent pair of words, find the first differing character — that gives a directed edge (char_in_word1 → char_in_word2). Run topological sort on the resulting graph. If a cycle exists, the ordering is invalid. Edge case: if word1 is longer than word2 and word2 is a prefix of word1, the ordering is invalid (e.g., [“abc”, “ab”] is contradictory).

Parallel Courses III (LC 2050): find the minimum number of semesters to complete all courses, where each semester you can take any courses whose prerequisites are complete. This is the longest path in the DAG. Kahn’s with a twist: process layer by layer (like BFS levels). Each layer = one semester. Track the earliest semester each course can be taken.

Find All Possible Recipes (LC 2115): given recipes with ingredients (some of which are other recipes), find which recipes can be made. Model as a directed graph: ingredient → recipe that needs it. Apply topological sort. A recipe is creatable if all its ingredient nodes have been processed (either raw ingredients or creatable recipes).

Cycle Detection in Directed Graphs

Undirected graph cycle detection: union-find or DFS with parent tracking. Directed graph cycle detection (harder): must detect back edges (edges to ancestors in the current DFS path). Three-color DFS (states 0/1/2 as above) correctly identifies directed cycles. Alternative: Kahn’s algorithm — if topological sort doesn’t include all nodes, a cycle exists among the excluded nodes.

Strongly Connected Components (SCCs)

An SCC is a maximal subgraph where every node is reachable from every other. Kosaraju’s algorithm: (1) Run DFS on original graph, push nodes to a stack in post-order (finish order). (2) Transpose the graph (reverse all edges). (3) Pop from stack and run DFS on transposed graph — each DFS traversal from an unvisited node finds one SCC. Time O(V + E). SCCs partition a directed graph — condensation DAG of SCCs is always a DAG, suitable for topological sort.

Interview Patterns

  • “Can all X be completed/ordered?” → topological sort + cycle detection
  • “What order should X be done?” → topological sort (return the order)
  • “How many steps/levels until all done?” → BFS topological sort, count levels
  • “Minimum time if parallel processing allowed?” → longest path in DAG using topological sort order
  • Detecting circular imports/dependencies → directed graph cycle detection

{
“@context”: “https://schema.org”,
“@type”: “FAQPage”,
“mainEntity”: [
{
“@type”: “Question”,
“name”: “How does Kahn’s algorithm detect a cycle in a directed graph?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Kahn’s algorithm detects cycles by comparing the topological sort result length with the total number of nodes. Kahn’s works by iteratively removing nodes with in-degree 0 (no incoming edges). After the algorithm completes, if result length < n (number of nodes), some nodes were never added to the result u2014 these nodes have at least one incoming edge remaining, meaning they are part of a cycle. Nodes in a cycle always have in-degree u2265 1 from other nodes in the cycle; none of them ever reaches in-degree 0, so none are ever added to the queue or result. Example: a 4-node graph where nodes 0 u2192 1 u2192 2 u2192 0 form a cycle, and node 3 has no edges. Kahn's processes node 3 (in-degree 0), adds it to the result (length=1). Nodes 0, 1, 2 never have in-degree 0 (each has one incoming edge from the cycle). Result length = 1 u2260 4 u2192 cycle detected. This approach is simpler than DFS cycle detection (no three-color state needed) and also produces the topological order as a byproduct."
}
},
{
"@type": "Question",
"name": "What is the difference between topological sort and DFS post-order traversal?",
"acceptedAnswer": {
"@type": "Answer",
"text": "Topological sort via DFS uses post-order traversal: a node is added to the result after all of its descendants have been fully explored. This ensures that every node appears in the result only after all nodes it depends on (its successors in the graph). The result is the reversed post-order stack. Relationship to DFS: in standard DFS, we visit children recursively. The post-order completion time of a node (when it's added to the stack) always comes after all nodes reachable from it. Reversing this order gives: nodes with no outgoing edges (leaf nodes / no dependencies) appear last in DFS post-order, so they appear first in the reversed result u2014 which is correct for topological ordering (sinks come last, not first). Key difference from BFS topological sort (Kahn's): DFS naturally handles disconnected components (the outer loop ensures all unvisited nodes are explored). DFS also detects cycles via the three-color state (0=unvisited, 1=in-progress, 2=done): if you visit a node with state=1, you've found a back edge u2192 cycle. Kahn's detects cycles via incomplete result length. Both give valid topological orderings, but the specific ordering may differ."
}
},
{
"@type": "Question",
"name": "How do you find the minimum time to finish all courses when courses can be taken in parallel?",
"acceptedAnswer": {
"@type": "Answer",
"text": "This is the "minimum number of semesters" or "minimum time with parallelism" problem (LC 2050 variant). Model it as finding the longest path in the DAG of course prerequisites u2014 the critical path determines the minimum time even with unlimited parallelism. Using Kahn's BFS: process courses level by level. Level 0 = all courses with no prerequisites (can start immediately). Level 1 = all courses whose only prerequisites are in level 0. Level 2 = all courses whose prerequisites are all in level 0 or 1. Each level represents one time unit (semester). Track the "earliest semester" each course can be taken: when adding a course to the queue (in-degree reaches 0), set earliest_semester[course] = max(earliest_semester[all prerequisites]) + 1. The answer is max(earliest_semester). For the general case with time costs per task: dp[node] = cost[node] + max(dp[prerequisite] for all prerequisites). Process in topological order. This is equivalent to finding the critical path in a project management context (CPM u2014 Critical Path Method), where the minimum project duration = longest path through the dependency graph."
}
}
]
}

  • Snap Interview Guide
  • Databricks Interview Guide
  • Coinbase Interview Guide
  • Atlassian Interview Guide
  • LinkedIn Interview Guide
  • Uber Interview Guide
  • Apple Interview Guide
  • Companies That Ask This

    Scroll to Top