Median of Two Sorted Arrays: Binary Search’s Nastiest Cousin

⏱ 7 min read

Given two sorted arrays of sizes m and n, find the median of the combined sorted array in O(log(m+n)) time.

This is LeetCode problem 4, and it has the reputation of being binary search’s nastiest cousin. The problem is short, the constraints are clean, the answer is a single number — and yet getting from the obvious O(m+n) merge to the required O(log) solution requires a binary search that almost no one writes correctly on the first try, even with the algorithm in front of them. The off-by-ones are vicious, the partition logic is delicate, and the case analysis for parity (combined length even vs odd) doubles the surface area for bugs.

The brute force: O(m+n)

def median_brute(nums1, nums2):
    merged = sorted(nums1 + nums2)
    n = len(merged)
    if n % 2:
        return merged[n // 2]
    return (merged[n // 2 - 1] + merged[n // 2]) / 2

O((m+n) log (m+n)) because of the sort. Trivially correct. Most candidates produce this in the first 30 seconds.

The merge improvement: O(m+n)

def median_merge(nums1, nums2):
    i = j = 0
    merged = []
    while i < len(nums1) and j < len(nums2):
        if nums1[i] <= nums2[j]:
            merged.append(nums1[i]); i += 1
        else:
            merged.append(nums2[j]); j += 1
    merged.extend(nums1[i:])
    merged.extend(nums2[j:])
    n = len(merged)
    if n % 2:
        return merged[n // 2]
    return (merged[n // 2 - 1] + merged[n // 2]) / 2

Standard merge of two sorted arrays. O(m+n) time and space. We do not actually need the full merged array; we only need the median, so we can stop after merging the first (m+n)/2 + 1 elements:

def median_merge_optimal(nums1, nums2):
    total = len(nums1) + len(nums2)
    target = total // 2
    i = j = 0
    prev = curr = 0
    for _ in range(target + 1):
        prev = curr
        if i < len(nums1) and (j >= len(nums2) or nums1[i] <= nums2[j]):
            curr = nums1[i]; i += 1
        else:
            curr = nums2[j]; j += 1
    return curr if total % 2 else (prev + curr) / 2

Same O(m+n) time but O(1) space. This is what most candidates produce in 5–10 minutes. Good enough at most companies. At FAANG-level interviews, the follow-up is “can you do better than O(m+n)?”, and the candidate is expected to know there exists an O(log) solution.

The binary search: O(log(min(m,n)))

The insight: we are looking for a partition of both arrays such that the left half (combined) is exactly the smaller half of the combined sorted result. Specifically, if we split nums1 at index i and nums2 at index j, with the constraint i + j = (m + n + 1) / 2, then the left side has the right number of elements. The partition is correct if every element on the left is ≤ every element on the right — which simplifies to checking nums1[i-1] ≤ nums2[j] and nums2[j-1] ≤ nums1[i].

We binary-search over i in the smaller array. For each candidate i, compute j from the constraint, then check the partition condition.

def median_binary_search(nums1, nums2):
    if len(nums1) > len(nums2):
        nums1, nums2 = nums2, nums1
    m, n = len(nums1), len(nums2)
    total = m + n
    half = (total + 1) // 2

    lo, hi = 0, m
    while lo <= hi:
        i = (lo + hi) // 2
        j = half - i

        left1  = nums1[i-1] if i > 0 else float('-inf')
        right1 = nums1[i]   if i < m else float('inf')
        left2  = nums2[j-1] if j > 0 else float('-inf')
        right2 = nums2[j]   if j < n else float('inf')

        if left1 <= right2 and left2 <= right1:
            if total % 2:
                return max(left1, left2)
            return (max(left1, left2) + min(right1, right2)) / 2
        elif left1 > right2:
            hi = i - 1
        else:
            lo = i + 1

O(log(min(m,n))) time, O(1) space. The algorithm relies on three tricks: ensure nums1 is the smaller array (so the binary search range is small), use ±infinity sentinels at array boundaries, and use (total + 1) // 2 for the half size so even and odd total lengths can share the same partition logic.

Why this is “the nastiest cousin”

Binary search is conceptually simple — split the search space in half, recurse on the half that contains the answer. Most binary search problems involve searching for a target in a sorted structure, where the comparison is straightforward. Median of two sorted arrays inverts the structure: we are not searching for a specific value; we are searching for a partition. The “comparison” (am I too far left or too far right?) requires reasoning about elements at four different positions across two arrays, with three boundary cases each, with even-vs-odd total length handling layered on top.

The result is a problem where the algorithm fits in 20 lines but every line has a subtle correctness condition that is easy to get wrong. The most common bugs in candidate code:

Wrong array as the binary-searched one. If nums1 is the larger array and you binary search over its index, your j can go negative or exceed n. Always swap so nums1 is the smaller array first.
Off-by-one in the half size. (m+n)/2 versus (m+n+1)/2 matters. Use (m+n+1)//2 for the integer half so that the left partition has the median (for odd total) or the larger element below the median (for even total).
Boundary handling for empty partitions. When i=0, there is no left element in nums1; treat it as -∞. When i=m, there is no right element; treat as +∞. Sentinels are cleaner than special cases.
Even-versus-odd handling at the end. For odd total, return max(left1, left2). For even total, return the average of max(left1, left2) and min(right1, right2). Forgetting the latter case gives wrong answers half the time.

What interviewers are looking for

Median of two sorted arrays is rarely asked at the phone screen level — it is usually a senior or staff onsite question, where the interviewer wants to see how the candidate handles a problem that is genuinely hard. The signal layers:

Brute force first. Even strong candidates state the merge solution before attempting the binary search. Skipping straight to binary search often signals memorization.
Recognize the partition framing. The candidate articulates that the problem is about finding a partition, not searching for a value. This insight is the hardest single step.
Manage the binary search invariants. The candidate writes the search loop with correct termination conditions and handles boundaries with sentinels rather than scattered special cases.
Verify on edge cases. One array empty, both arrays of size 1, arrays with duplicate elements, very different array sizes (m=1, n=1000). A polished candidate runs through these out loud.

Most interviewers do not require the candidate to produce the optimal solution from scratch. The bar is usually “can the candidate get the merge solution clean, recognize the partition framing with a hint, and discuss the binary search approach coherently”. A candidate who delivers all three is at the senior bar regardless of whether they finish the optimal code in time.

Variations and follow-ups

K-th smallest in two sorted arrays. The generalization of the median problem. Same partition trick with a parameter k.
Median of N sorted arrays. Becomes a min-heap problem rather than binary search. Different technique.
Median of streaming numbers. The two-heap solution. Different problem entirely; tests whether the candidate distinguishes “static arrays” from “stream”.
Binary search invariant proofs. Some interviewers ask the candidate to argue that the binary search terminates correctly. Tests theoretical depth.

Is it asked in 2026?

Yes, but selectively. Median of two sorted arrays is a standard senior-level FAANG question. It appears in some AI lab interviews, in fintech tier 1, and in quant firms that use coding rounds. It is essentially never asked at the phone screen tier — too long, too many edge cases, too much risk of an unfair partial-credit outcome. When it does appear, the interviewer is calibrating for senior-staff-level depth, not pass/fail.

The problem has not lost its place in the canon despite being well-known, because its difficulty resists pure memorization. A candidate who has watched the YouTube walkthrough still has to write the boundary conditions correctly under live pressure, and that step is hard enough that the question continues to differentiate.

Frequently Asked Questions

What is the time complexity of the optimal solution?

O(log(min(m, n))). Binary searching the smaller array gives the tighter bound.

Can I use the merge solution in an interview?

At a phone screen, often yes. At a FAANG senior-level onsite, the interviewer will likely follow up with “can we do better than O(m+n)?”, and the candidate is expected to know the binary search approach exists, even if they do not finish coding it.

Why use ±infinity for the boundaries?

The boundary conditions (i=0, i=m, j=0, j=n) require special handling in the partition correctness check. Using ±infinity as sentinels means the standard left1 ≤ right2 and left2 ≤ right1 check works uniformly across all cases, eliminating special-case code.

Is this problem still in the LeetCode top 100?

Yes, it is LeetCode 4 and remains one of the most-discussed Hard problems on the platform.

What is a clean way to verify my solution?

Run it on edge cases: empty array, single-element arrays, arrays of very different sizes, arrays with duplicates, arrays where all elements of one are less than all of the other. If all of these pass, the partition logic is solid.