Advanced Heap Interview Patterns
Heaps appear in a surprising number of interview problems beyond the simple “find the Kth largest element.” This guide covers the recurring patterns you need to recognize and implement quickly under pressure.
Two-Heap Median (LC 295)
The classic trick: maintain a max-heap for the lower half and a min-heap for the upper half. The median lives at one or both tops. Balance the heaps so their sizes differ by at most 1.
import heapq
class MedianFinder:
def __init__(self):
self.lo = [] # max-heap (store negated values)
self.hi = [] # min-heap
def addNum(self, num: int) -> None:
# Push to max-heap first
heapq.heappush(self.lo, -num)
# Balance: ensure every lo element = hi size
if len(self.hi) > len(self.lo):
heapq.heappush(self.lo, -heapq.heappop(self.hi))
def findMedian(self) -> float:
if len(self.lo) > len(self.hi):
return -self.lo[0]
return (-self.lo[0] + self.hi[0]) / 2.0
Complexity: O(log n) per insert, O(1) for median. The invariant to maintain: len(lo) == len(hi) or len(lo) == len(hi) + 1.
K-Way Merge with Heap (LC 23, LC 786)
Push (value, list_index, element_index) triples into a min-heap. Pop the minimum, output it, then push the next element from the same list. This is the core pattern for merging K sorted lists or finding the Kth smallest across K sorted arrays.
import heapq
from typing import List, Optional
class ListNode:
def __init__(self, val=0, next=None):
self.val = val
self.next = next
def mergeKLists(lists: List[Optional[ListNode]]) -> Optional[ListNode]:
heap = []
for i, node in enumerate(lists):
if node:
heapq.heappush(heap, (node.val, i, node))
dummy = ListNode(0)
cur = dummy
while heap:
val, i, node = heapq.heappop(heap)
cur.next = node
cur = cur.next
if node.next:
heapq.heappush(heap, (node.next.val, i, node.next))
return dummy.next
# LC 786 - Kth Smallest Prime Fraction
# Push (a/b, a_idx, b_idx) for each pair; K-way merge across denominators
def kthSmallestPrimeFraction(arr: List[int], k: int) -> List[int]:
n = len(arr)
heap = [(arr[0] / arr[j], 0, j) for j in range(1, n)]
heapq.heapify(heap)
for _ in range(k - 1):
val, i, j = heapq.heappop(heap)
if i + 1 < j:
heapq.heappush(heap, (arr[i+1] / arr[j], i+1, j))
_, i, j = heapq.heappop(heap)
return [arr[i], arr[j]]
Complexity: O(N log K) where N is total elements and K is number of lists. The index triple avoids comparison issues when values are equal.
Top-K with Custom Comparators (LC 347, LC 973)
Use a min-heap of size K to find the top-K largest elements. When heap size exceeds K, pop the minimum – you are left with the K largest. Python’s heapq only supports min-heaps, so negate values to simulate a max-heap.
import heapq
from collections import Counter
from typing import List
# LC 347 - Top K Frequent Elements
def topKFrequent(nums: List[int], k: int) -> List[int]:
count = Counter(nums)
# min-heap of (frequency, element); pop least frequent when size > k
heap = []
for num, freq in count.items():
heapq.heappush(heap, (freq, num))
if len(heap) > k:
heapq.heappop(heap)
return [num for freq, num in heap]
# LC 973 - K Closest Points to Origin
def kClosest(points: List[List[int]], k: int) -> List[List[int]]:
heap = []
for x, y in points:
dist = x*x + y*y
# Store negative distance for max-heap behavior
heapq.heappush(heap, (-dist, x, y))
if len(heap) > k:
heapq.heappop(heap)
return [[x, y] for _, x, y in heap]
Key insight: A min-heap of size K maintains the K largest seen so far because any new element smaller than the current minimum gets rejected (it would not be in the top K).
Lazy Deletion
Python’s heapq has no efficient remove() operation. The workaround: mark elements as deleted in a separate set, then skip deleted items when popping. This is widely used in task schedulers where tasks can be cancelled after being enqueued.
import heapq
class LazyHeap:
def __init__(self):
self.heap = []
self.removed = {} # value -> count of pending removals
def push(self, val):
heapq.heappush(self.heap, val)
def remove(self, val):
self.removed[val] = self.removed.get(val, 0) + 1
def pop(self):
while self.heap:
val = self.heap[0]
if self.removed.get(val, 0) > 0:
heapq.heappop(self.heap)
self.removed[val] -= 1
if self.removed[val] == 0:
del self.removed[val]
else:
return heapq.heappop(self.heap)
raise IndexError("pop from empty heap")
def peek(self):
while self.heap:
val = self.heap[0]
if self.removed.get(val, 0) > 0:
heapq.heappop(self.heap)
self.removed[val] -= 1
if self.removed[val] == 0:
del self.removed[val]
else:
return val
raise IndexError("peek from empty heap")
Median of a Data Stream with Sliding Window
LC 480 – combine the two-heap approach with lazy deletion to handle eviction of elements that slide out of the window. When an element exits the window, mark it as deleted in whichever heap it belongs to (track via a dictionary mapping value to heap side). Rebalance after each eviction.
import heapq
from collections import defaultdict
class SlidingWindowMedian:
def __init__(self):
self.lo = [] # max-heap (negated)
self.hi = [] # min-heap
self.lo_size = 0
self.hi_size = 0
self.removed = defaultdict(int)
def _balance(self):
# ensure lo_size == hi_size or lo_size == hi_size + 1
while self.lo_size > self.hi_size + 1:
val = -heapq.heappop(self.lo)
self.lo_size -= 1
heapq.heappush(self.hi, val)
self.hi_size += 1
self._clean_hi()
while self.hi_size > self.lo_size:
val = heapq.heappop(self.hi)
self.hi_size -= 1
heapq.heappush(self.lo, -val)
self.lo_size += 1
self._clean_lo()
def _clean_lo(self):
while self.lo and self.removed[-self.lo[0]] > 0:
self.removed[-self.lo[0]] -= 1
heapq.heappop(self.lo)
def _clean_hi(self):
while self.hi and self.removed[self.hi[0]] > 0:
self.removed[self.hi[0]] -= 1
heapq.heappop(self.hi)
def add(self, num):
if not self.lo or num <= -self.lo[0]:
heapq.heappush(self.lo, -num)
self.lo_size += 1
else:
heapq.heappush(self.hi, num)
self.hi_size += 1
self._balance()
def remove(self, num):
self.removed[num] += 1
if self.lo and num self.hi_size:
self._clean_lo()
return float(-self.lo[0])
self._clean_lo()
self._clean_hi()
return (-self.lo[0] + self.hi[0]) / 2.0
Meeting Rooms II (LC 253)
Sort meetings by start time. Use a min-heap of end times. For each new meeting, if the earliest-ending meeting has already finished, reuse that room (pop and push the new end time). Otherwise add a new room. The heap size at the end equals the minimum rooms required.
import heapq
from typing import List
def minMeetingRooms(intervals: List[List[int]]) -> int:
if not intervals:
return 0
intervals.sort(key=lambda x: x[0])
rooms = [] # min-heap of end times
for start, end in intervals:
if rooms and rooms[0] <= start:
heapq.heapreplace(rooms, end) # reuse room
else:
heapq.heappush(rooms, end) # need new room
return len(rooms)
Complexity: O(n log k) where k is the answer (number of rooms). The heap never grows larger than the number of concurrent meetings.
IPO Problem (LC 502) – Two-Heap Unlock Pattern
You have initial capital W and can complete at most K projects. Each project has a capital requirement and a profit. Use a min-heap keyed on capital to track locked projects, and a max-heap keyed on profit to track available projects. Each round: unlock all projects with capital requirement <= current capital into the max-heap; pick the highest-profit available project.
import heapq
from typing import List
def findMaximizedCapital(k: int, w: int, profits: List[int], capital: List[int]) -> int:
locked = sorted(zip(capital, profits)) # sort by capital requirement
available = [] # max-heap of profits (negated)
idx = 0
for _ in range(k):
# Unlock all projects we can now afford
while idx < len(locked) and locked[idx][0] <= w:
heapq.heappush(available, -locked[idx][1])
idx += 1
if not available:
break # no affordable project
w += -heapq.heappop(available) # pick highest profit
return w
This pattern – one heap to track eligibility, another to greedily pick the best available – recurs in scheduling and resource allocation problems.
Heap vs Sorted Array vs BST for Dynamic Order Statistics
| Structure | Insert | Delete | Min/Max | Kth element | Best for |
|---|---|---|---|---|---|
| Binary Heap | O(log n) | O(log n)* | O(1) | O(k log n) | Priority queue, top-K, median |
| Sorted Array | O(n) | O(n) | O(1) | O(1) | Static data, binary search |
| Balanced BST | O(log n) | O(log n) | O(log n) | O(log n)** | Dynamic order statistics |
| SortedList (Python) | O(log n) | O(log n) | O(1) | O(log n) | When you need both heap and indexing |
* Heap deletion requires lazy deletion trick unless you have a direct reference to the node.
** Requires augmented BST storing subtree sizes (Python’s sortedcontainers.SortedList handles this).
Rule of thumb: Use a heap when you only need min/max or top-K. Use a sorted list or BST when you need arbitrary rank queries or range queries. Use two heaps when you need the median of a dynamic set.
Meta coding interviews test heap patterns including two-heap and K-way merge. See common questions for Meta interview: heap and priority queue problems.
LinkedIn engineering interviews include heap-based streaming problems. See patterns for LinkedIn interview: heap and median data stream problems.
Airbnb coding rounds include heap-based scheduling problems. See patterns for Airbnb interview: heap and scheduling problems.