The Two Heaps Pattern
Two heaps maintain the lower and upper halves of a data stream, allowing O(1) access to the median and O(log n) insertion. Use when: “find the median of a data stream,” “sliding window median,” or any problem requiring the middle value of a dynamically changing dataset.
Structure: a max-heap (left) holds the smaller half; a min-heap (right) holds the larger half. Invariants: (1) every element in left ≤ every element in right. (2) |len(left) – len(right)| ≤ 1. Median: if sizes are equal → (max(left) + min(right)) / 2; if left is larger → max(left).
Find Median from Data Stream (LC 295)
import heapq
class MedianFinder:
def __init__(self):
self.small = [] # max-heap (negate values)
self.large = [] # min-heap
def addNum(self, num):
# Always push to small first, then balance
heapq.heappush(self.small, -num)
# Ensure all of small self.large[0]):
heapq.heappush(self.large, -heapq.heappop(self.small))
# Balance sizes: small can be at most 1 larger than large
if len(self.small) > len(self.large) + 1:
heapq.heappush(self.large, -heapq.heappop(self.small))
if len(self.large) > len(self.small):
heapq.heappush(self.small, -heapq.heappop(self.large))
def findMedian(self):
if len(self.small) > len(self.large):
return -self.small[0]
return (-self.small[0] + self.large[0]) / 2
Sliding Window Median (LC 480)
Same two-heap approach, but remove elements as they leave the window. Python’s heapq doesn’t support O(log n) deletion, so use a lazy deletion map: mark deleted elements, skip them when they reach the heap top.
from collections import defaultdict
import heapq
def medianSlidingWindow(nums, k):
small, large = [], [] # max-heap (negated), min-heap
to_remove = defaultdict(int)
def add(num):
if not small or num <= -small[0]:
heapq.heappush(small, -num)
else:
heapq.heappush(large, num)
balance()
def remove(num):
to_remove[num] += 1
if num len(large) + 1:
heapq.heappush(large, -heapq.heappop(small))
clean(large, False)
elif len(large) > len(small):
heapq.heappush(small, -heapq.heappop(large))
clean(small, True)
def get_median():
if len(small) > len(large):
return float(-small[0])
return (-small[0] + large[0]) / 2.0
for num in nums[:k]:
add(num)
result = [get_median()]
for i in range(k, len(nums)):
add(nums[i])
remove(nums[i - k])
clean(small, True); clean(large, False)
result.append(get_median())
return result
IPO (LC 502) — Two Heaps for Greedy Selection
Given projects with capital requirements and profits, select at most k projects to maximize total capital. Greedy: always pick the most profitable project you can currently afford. Two heaps: (1) min-heap of (capital, profit) for unavailable projects. (2) max-heap of profits for available projects. At each step: move all projects with capital ≤ current_capital from the min-heap to the max-heap. Pick the max profit from the max-heap.
def findMaximizedCapital(k, w, profits, capital):
available = [] # max-heap of profits (negate)
unavailable = sorted(zip(capital, profits)) # sort by capital
idx = 0
for _ in range(k):
while idx < len(unavailable) and unavailable[idx][0] <= w:
heapq.heappush(available, -unavailable[idx][1])
idx += 1
if not available: break
w += -heapq.heappop(available)
return w
When to Use Two Heaps
- Median of a data stream — O(log n) insert, O(1) median
- Sliding window median — two heaps + lazy deletion
- Problems requiring the “middle” of a sorted set — the k-th largest/smallest of a dynamic stream
- Greedy problems where you need both the minimum of one group and the maximum of another simultaneously (e.g., meeting rooms, task scheduling with priority)
{“@context”:”https://schema.org”,”@type”:”FAQPage”,”mainEntity”:[{“@type”:”Question”,”name”:”What are the invariants of the two-heap pattern for median finding?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”The two-heap pattern maintains two invariants: (1) Ordering — every element in the max-heap (left, smaller half) must be <= every element in the min-heap (right, larger half). This ensures the heaps hold the true lower and upper halves. (2) Size balance — the max-heap can have at most 1 more element than the min-heap (|left| – |right| <= 1). This ensures the median is always accessible in O(1). The median is: if both heaps have equal size, (max(left) + min(right)) / 2; if left is 1 larger, max(left). On each insertion: always push to left first, then check the ordering invariant (pop from left to right if violated), then check the size invariant (move one element to re-balance). Both rebalancing steps are O(log n).”}},{“@type”:”Question”,”name”:”How does lazy deletion work for sliding window median (LC 480)?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Python's heapq doesn't support O(log n) deletion of arbitrary elements. Lazy deletion: instead of deleting from the heap immediately, mark the element as "to be deleted" in a counter (defaultdict). When popping from a heap (to get the top element), check if the top is in the deletion counter. If yes, skip it (decrement counter, pop and repeat) — this is the "lazy" part. For the sliding window: when the window slides out (removing nums[i-k]), record it in to_remove[nums[i-k]] += 1. When computing the median, the clean() function removes deleted elements from the heap tops. This is amortized O(log n) because each element is pushed and lazily deleted at most once. Total time O(n log k).”}},{“@type”:”Question”,”name”:”What is the IPO problem (LC 502) and how do two heaps solve it?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”IPO: given N projects with capital requirements and profits, and initial capital W, select at most k projects to maximize total capital. Greedy insight: always pick the most profitable project you can currently afford. Two heaps solve this efficiently: (1) Sort all projects by capital requirement. (2) Use a pointer to iterate through projects in order of capital requirement. (3) Maintain a max-heap of profits for available projects (capital <= current W). At each step: advance the pointer, adding all newly affordable projects to the max-heap. Pick the maximum profit from the max-heap, add it to W. Repeat k times. Time O(n log n + k log n). The min-heap for unavailable projects (sorted by capital) ensures we discover newly affordable projects in O(log n) as W increases.”}},{“@type”:”Question”,”name”:”When should you use two heaps instead of a sorted list or segment tree?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Two heaps give O(log n) insert and O(1) median query. Use them when: you only need the median (or the boundary between two halves), not arbitrary rank queries. Sorted list (SortedList from sortedcontainers): O(log n) insert, O(1) index access. Use when you need the k-th element for arbitrary k, not just the median. Segment tree with coordinate compression: O(log n) insert, O(log n) for any rank query. Use when k varies per query. In interviews, two heaps are preferred for median problems because they are simpler to implement from scratch and the invariants are clear. SortedList requires knowing the sortedcontainers library. For sliding window median in production code, SortedList is cleaner. For interview whiteboard, two heaps with lazy deletion is the expected answer.”}},{“@type”:”Question”,”name”:”How do you handle an even vs odd number of elements in the two-heap median?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”With the size invariant that left (max-heap) can be at most 1 larger than right (min-heap): if len(left) == len(right): both halves are equal size — median = (max(left) + min(right)) / 2.0. Return as float for even-count medians. if len(left) == len(right) + 1: left has one extra — median = max(left). Return as int or float as needed. Never let right be larger than left: if right grows larger, move its minimum to left immediately. This is the re-balancing step. Example: {1, 2, 3, 4, 5}: left=[3,2,1], right=[4,5] → left has 3 elements, right has 2 → median = max(left) = 3. For {1,2,3,4}: left=[2,1], right=[3,4] → sizes equal → median = (2+3)/2 = 2.5.”}}]}
Google coding interviews test heap and priority queue patterns. See common questions for Google interview: heap and priority queue problems.
Meta coding interviews test heap and median problems. Review patterns for Meta interview: median and heap data structure problems.
Amazon coding interviews cover heap and data stream problems. See patterns for Amazon interview: heap and median stream problems.