Advanced Heap Interview Patterns

Advanced Heap Interview Patterns

Heaps appear in a surprising number of interview problems beyond the simple “find the Kth largest element.” This guide covers the recurring patterns you need to recognize and implement quickly under pressure.

Two-Heap Median (LC 295)

The classic trick: maintain a max-heap for the lower half and a min-heap for the upper half. The median lives at one or both tops. Balance the heaps so their sizes differ by at most 1.

import heapq

class MedianFinder:
    def __init__(self):
        self.lo = []  # max-heap (store negated values)
        self.hi = []  # min-heap

    def addNum(self, num: int) -> None:
        # Push to max-heap first
        heapq.heappush(self.lo, -num)
        # Balance: ensure every lo element = hi size
        if len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def findMedian(self) -> float:
        if len(self.lo) > len(self.hi):
            return -self.lo[0]
        return (-self.lo[0] + self.hi[0]) / 2.0

Complexity: O(log n) per insert, O(1) for median. The invariant to maintain: len(lo) == len(hi) or len(lo) == len(hi) + 1.

K-Way Merge with Heap (LC 23, LC 786)

Push (value, list_index, element_index) triples into a min-heap. Pop the minimum, output it, then push the next element from the same list. This is the core pattern for merging K sorted lists or finding the Kth smallest across K sorted arrays.

import heapq
from typing import List, Optional

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def mergeKLists(lists: List[Optional[ListNode]]) -> Optional[ListNode]:
    heap = []
    for i, node in enumerate(lists):
        if node:
            heapq.heappush(heap, (node.val, i, node))

    dummy = ListNode(0)
    cur = dummy
    while heap:
        val, i, node = heapq.heappop(heap)
        cur.next = node
        cur = cur.next
        if node.next:
            heapq.heappush(heap, (node.next.val, i, node.next))
    return dummy.next

# LC 786 - Kth Smallest Prime Fraction
# Push (a/b, a_idx, b_idx) for each pair; K-way merge across denominators
def kthSmallestPrimeFraction(arr: List[int], k: int) -> List[int]:
    n = len(arr)
    heap = [(arr[0] / arr[j], 0, j) for j in range(1, n)]
    heapq.heapify(heap)
    for _ in range(k - 1):
        val, i, j = heapq.heappop(heap)
        if i + 1 < j:
            heapq.heappush(heap, (arr[i+1] / arr[j], i+1, j))
    _, i, j = heapq.heappop(heap)
    return [arr[i], arr[j]]

Complexity: O(N log K) where N is total elements and K is number of lists. The index triple avoids comparison issues when values are equal.

Top-K with Custom Comparators (LC 347, LC 973)

Use a min-heap of size K to find the top-K largest elements. When heap size exceeds K, pop the minimum – you are left with the K largest. Python’s heapq only supports min-heaps, so negate values to simulate a max-heap.

import heapq
from collections import Counter
from typing import List

# LC 347 - Top K Frequent Elements
def topKFrequent(nums: List[int], k: int) -> List[int]:
    count = Counter(nums)
    # min-heap of (frequency, element); pop least frequent when size > k
    heap = []
    for num, freq in count.items():
        heapq.heappush(heap, (freq, num))
        if len(heap) > k:
            heapq.heappop(heap)
    return [num for freq, num in heap]

# LC 973 - K Closest Points to Origin
def kClosest(points: List[List[int]], k: int) -> List[List[int]]:
    heap = []
    for x, y in points:
        dist = x*x + y*y
        # Store negative distance for max-heap behavior
        heapq.heappush(heap, (-dist, x, y))
        if len(heap) > k:
            heapq.heappop(heap)
    return [[x, y] for _, x, y in heap]

Key insight: A min-heap of size K maintains the K largest seen so far because any new element smaller than the current minimum gets rejected (it would not be in the top K).

Lazy Deletion

Python’s heapq has no efficient remove() operation. The workaround: mark elements as deleted in a separate set, then skip deleted items when popping. This is widely used in task schedulers where tasks can be cancelled after being enqueued.

import heapq

class LazyHeap:
    def __init__(self):
        self.heap = []
        self.removed = {}  # value -> count of pending removals

    def push(self, val):
        heapq.heappush(self.heap, val)

    def remove(self, val):
        self.removed[val] = self.removed.get(val, 0) + 1

    def pop(self):
        while self.heap:
            val = self.heap[0]
            if self.removed.get(val, 0) > 0:
                heapq.heappop(self.heap)
                self.removed[val] -= 1
                if self.removed[val] == 0:
                    del self.removed[val]
            else:
                return heapq.heappop(self.heap)
        raise IndexError("pop from empty heap")

    def peek(self):
        while self.heap:
            val = self.heap[0]
            if self.removed.get(val, 0) > 0:
                heapq.heappop(self.heap)
                self.removed[val] -= 1
                if self.removed[val] == 0:
                    del self.removed[val]
            else:
                return val
        raise IndexError("peek from empty heap")

Median of a Data Stream with Sliding Window

LC 480 – combine the two-heap approach with lazy deletion to handle eviction of elements that slide out of the window. When an element exits the window, mark it as deleted in whichever heap it belongs to (track via a dictionary mapping value to heap side). Rebalance after each eviction.

import heapq
from collections import defaultdict

class SlidingWindowMedian:
    def __init__(self):
        self.lo = []   # max-heap (negated)
        self.hi = []   # min-heap
        self.lo_size = 0
        self.hi_size = 0
        self.removed = defaultdict(int)

    def _balance(self):
        # ensure lo_size == hi_size or lo_size == hi_size + 1
        while self.lo_size > self.hi_size + 1:
            val = -heapq.heappop(self.lo)
            self.lo_size -= 1
            heapq.heappush(self.hi, val)
            self.hi_size += 1
            self._clean_hi()
        while self.hi_size > self.lo_size:
            val = heapq.heappop(self.hi)
            self.hi_size -= 1
            heapq.heappush(self.lo, -val)
            self.lo_size += 1
            self._clean_lo()

    def _clean_lo(self):
        while self.lo and self.removed[-self.lo[0]] > 0:
            self.removed[-self.lo[0]] -= 1
            heapq.heappop(self.lo)

    def _clean_hi(self):
        while self.hi and self.removed[self.hi[0]] > 0:
            self.removed[self.hi[0]] -= 1
            heapq.heappop(self.hi)

    def add(self, num):
        if not self.lo or num <= -self.lo[0]:
            heapq.heappush(self.lo, -num)
            self.lo_size += 1
        else:
            heapq.heappush(self.hi, num)
            self.hi_size += 1
        self._balance()

    def remove(self, num):
        self.removed[num] += 1
        if self.lo and num  self.hi_size:
            self._clean_lo()
            return float(-self.lo[0])
        self._clean_lo()
        self._clean_hi()
        return (-self.lo[0] + self.hi[0]) / 2.0

Meeting Rooms II (LC 253)

Sort meetings by start time. Use a min-heap of end times. For each new meeting, if the earliest-ending meeting has already finished, reuse that room (pop and push the new end time). Otherwise add a new room. The heap size at the end equals the minimum rooms required.

import heapq
from typing import List

def minMeetingRooms(intervals: List[List[int]]) -> int:
    if not intervals:
        return 0
    intervals.sort(key=lambda x: x[0])
    rooms = []  # min-heap of end times
    for start, end in intervals:
        if rooms and rooms[0] <= start:
            heapq.heapreplace(rooms, end)  # reuse room
        else:
            heapq.heappush(rooms, end)     # need new room
    return len(rooms)

Complexity: O(n log k) where k is the answer (number of rooms). The heap never grows larger than the number of concurrent meetings.

IPO Problem (LC 502) – Two-Heap Unlock Pattern

You have initial capital W and can complete at most K projects. Each project has a capital requirement and a profit. Use a min-heap keyed on capital to track locked projects, and a max-heap keyed on profit to track available projects. Each round: unlock all projects with capital requirement <= current capital into the max-heap; pick the highest-profit available project.

import heapq
from typing import List

def findMaximizedCapital(k: int, w: int, profits: List[int], capital: List[int]) -> int:
    locked = sorted(zip(capital, profits))  # sort by capital requirement
    available = []  # max-heap of profits (negated)
    idx = 0
    for _ in range(k):
        # Unlock all projects we can now afford
        while idx < len(locked) and locked[idx][0] <= w:
            heapq.heappush(available, -locked[idx][1])
            idx += 1
        if not available:
            break  # no affordable project
        w += -heapq.heappop(available)  # pick highest profit
    return w

This pattern – one heap to track eligibility, another to greedily pick the best available – recurs in scheduling and resource allocation problems.

Heap vs Sorted Array vs BST for Dynamic Order Statistics

StructureInsertDeleteMin/MaxKth elementBest for
Binary HeapO(log n)O(log n)*O(1)O(k log n)Priority queue, top-K, median
Sorted ArrayO(n)O(n)O(1)O(1)Static data, binary search
Balanced BSTO(log n)O(log n)O(log n)O(log n)**Dynamic order statistics
SortedList (Python)O(log n)O(log n)O(1)O(log n)When you need both heap and indexing

* Heap deletion requires lazy deletion trick unless you have a direct reference to the node.
** Requires augmented BST storing subtree sizes (Python’s sortedcontainers.SortedList handles this).

Rule of thumb: Use a heap when you only need min/max or top-K. Use a sorted list or BST when you need arbitrary rank queries or range queries. Use two heaps when you need the median of a dynamic set.

{“@context”:”https://schema.org”,”@type”:”FAQPage”,”mainEntity”:[{“@type”:”Question”,”name”:”How do you implement the Median Finder data structure (LC 295)?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Use two heaps: a max-heap (lower half of numbers) and a min-heap (upper half). Invariant: the max-heap size equals or is one more than the min-heap size. addNum: push to max-heap (negate for Python's min-heap), then rebalance by pushing the max-heap top to min-heap if the min-heap top < max-heap top, then rebalance sizes. findMedian: if sizes are equal, return average of both tops; otherwise return the max-heap top (the extra element). Python heapq stores min-heaps, so negate all values for the max-heap. This gives O(log n) addNum and O(1) findMedian. The two-heap trick works because the median is always at the boundary between the two halves.”}},{“@type”:”Question”,”name”:”How does K-way merge with a heap work?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”K-way merge combines K sorted lists into one sorted output. Initialize a min-heap with the first element from each list: push (value, list_index, element_index). Each heap entry tracks which list the value came from. While the heap is not empty: pop the minimum (val, li, ei), output val. If the list li has more elements (ei+1 < len(lists[li])), push (lists[li][ei+1], li, ei+1). Total time: O(N log K) where N is total elements and K is number of lists. This is LC 23 (Merge K Sorted Lists). The same pattern applies to LC 786 (Kth Smallest Prime Fraction) and LC 373 (Find K Pairs with Smallest Sums) with tuples representing coordinates in a virtual sorted matrix.”}},{“@type”:”Question”,”name”:”What is lazy deletion for heaps and when is it used?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Lazy deletion defers the removal of heap elements. Instead of immediately removing an element (expensive in a standard heap), mark it as deleted in a hash map. When popping from the heap, check if the top element is marked as deleted and skip it. This turns O(n) removal into O(log n) amortized. Use cases: (1) Task schedulers where tasks can be cancelled – mark cancelled tasks as deleted in the heap. (2) Sliding window median – when a window element expires, mark it as deleted and skip it when it surfaces. (3) Dijkstra with decreaseKey – instead of updating a node's distance in the heap, push a new (new_dist, node) and ignore outdated entries when popped. The key requirement: deletions must be identifiable (store the exact value/ID to delete).”}},{“@type”:”Question”,”name”:”How do you find the top K frequent elements efficiently (LC 347)?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Two approaches: (1) Min-heap of size K: count frequencies with Counter, iterate (count, element) pairs. Maintain a min-heap of size K: push each pair. If heap size > K, pop the minimum. After processing all elements, the heap contains the K most frequent. Time: O(n log K). Space: O(n + K). (2) Bucket sort: create n+1 buckets where bucket[freq] contains all elements with that frequency. Iterate buckets from highest to lowest, collecting elements until K are found. Time: O(n). Space: O(n). The min-heap of size K approach is the canonical solution. For the general "top-K by any score" problem, maintaining a min-heap of size K is O(n log K) and works for streaming data – you don't need all elements in memory at once.”}},{“@type”:”Question”,”name”:”How does the IPO problem (LC 502) use two heaps?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”IPO: given W initial capital and a list of (profit, capital) pairs, pick at most K projects sequentially to maximize capital. Greedy: among all affordable projects (capital <= current_capital), always pick the one with maximum profit. Two-heap approach: min-heap sorted by capital (to efficiently find newly affordable projects as capital grows), max-heap of profits of currently affordable projects. Algorithm: for each of K rounds: push all projects with capital <= current_capital from the min-heap to the max-heap. If max-heap is not empty: pop the maximum profit and add to current_capital. If max-heap is empty: no affordable projects, stop. Time: O(n log n + K log n). The capital min-heap acts as a sorted list of "locked" projects that unlock as capital grows.”}}]}

Meta coding interviews test heap patterns including two-heap and K-way merge. See common questions for Meta interview: heap and priority queue problems.

LinkedIn engineering interviews include heap-based streaming problems. See patterns for LinkedIn interview: heap and median data stream problems.

Airbnb coding rounds include heap-based scheduling problems. See patterns for Airbnb interview: heap and scheduling problems.

Scroll to Top