Binary Search
Binary search on the partition point of the smaller array to find the correct split between both arrays. Ensures equal elements on each side for the median. Achieves O(log(min(m,n))).
You are given two sorted arrays nums1 and nums2 of sizes m and n. Find the median of the combined sorted order. The overall run-time complexity must be .
This is LeetCode 4, and the logarithmic requirement is what makes it hard. If the problem only asked for "find the median," you could merge the two arrays in time and pick the middle element. That works, but it does not satisfy the constraint. We need a fundamentally different approach, one built on binary search.
Merging two sorted arrays is a familiar operation. You maintain two pointers, compare elements, and advance the smaller one. After (m + n) / 2 steps you reach the median. The time is and the space is if you only track the last couple of values.
The trouble is that is linear in the total size. When the problem demands , it is asking you to ignore most of the elements entirely. You need to jump to the right place in a single binary search, not crawl there one element at a time.
Imagine cutting both arrays with a vertical line so that the left side contains exactly half the total elements and the right side contains the other half. If every element on the left is less than or equal to every element on the right, then the median sits right at the cut: it is either the maximum of the left side (when the total count is odd) or the average of the maximum of the left side and the minimum of the right side (when the total count is even).
Concretely, suppose you place the cut so that i elements come from A and j elements come from B, where i + j = (m + n + 1) // 2. The left side is A[0..i-1] and B[0..j-1]. The right side is A[i..m-1] and B[j..n-1].
Because both arrays are individually sorted, the only thing that could go wrong is at the boundary between the arrays. Specifically, you need two "cross conditions" to hold:
A[i-1] <= B[j]: the largest element taken from A on the left must not exceed the smallest element taken from B on the right.B[j-1] <= A[i]: the largest element taken from B on the left must not exceed the smallest element taken from A on the right.If both conditions hold, you have a valid partition and can read off the median.
You only have one degree of freedom. Once you choose i (how many elements to take from A), the value j is forced: j = (m + n + 1) // 2 - i. So you binary search on i in the range [0, m].
At each step:
A[i-1] > B[j], then A is contributing too many elements to the left side. Move the cut in A to the left: hi = i - 1.B[j-1] > A[i], then A is contributing too few elements. Move the cut to the right: lo = i + 1.You always binary search on the smaller array. If A is longer than B, swap them. This guarantees that j never goes negative (since i <= m <= n means j >= 0) and that the binary search runs in time.
When i = 0, no elements from A are on the left, so there is no A[i-1] to compare. Similarly, when i = m, no elements from A are on the right, so there is no A[i]. The clean way to handle this is with sentinels: treat A[-1] as negative infinity and A[m] as positive infinity. The same goes for B. This way the cross conditions always have well-defined values and edge partitions are handled without special-case logic.
Total elements: 3 (odd), so the median is the single middle element. We need (3 + 1) // 2 = 2 elements on the left.
A is smaller (m = 2, n = 1), so we binary search on A. lo = 0, hi = 2.
Iteration 1: i = (0 + 2) // 2 = 1, j = 2 - 1 = 1.
Left side: A[0] = 1, B[0] = 2. Right side: A[1] = 3, (B is exhausted, so B[1] = +inf).
Check: A[0] = 1 <= B[1] = +inf, yes. B[0] = 2 <= A[1] = 3, yes.
Valid partition. The maximum of the left side is max(1, 2) = 2. Since the total count is odd, the median is 2.
Total elements: 4 (even), so the median is the average of the 2nd and 3rd elements. We need (4 + 1) // 2 = 2 elements on the left.
A and B are equal length (m = 2, n = 2). Binary search on A. lo = 0, hi = 2.
Iteration 1: i = (0 + 2) // 2 = 1, j = 2 - 1 = 1.
Left side: A[0] = 1, B[0] = 3. Right side: A[1] = 2, B[1] = 4.
Check: A[0] = 1 <= B[1] = 4, yes. B[0] = 3 <= A[1] = 2, no.
B is contributing too much on the left, meaning A is contributing too little. Move right: lo = 2.
Iteration 2: i = (2 + 2) // 2 = 2, j = 2 - 2 = 0.
Left side: A[0] = 1, A[1] = 2. (B contributes nothing on left, so B[-1] = -inf.) Right side: B[0] = 3, B[1] = 4. (A is exhausted, so A[2] = +inf.)
Check: A[1] = 2 <= B[0] = 3, yes. B[-1] = -inf <= A[2] = +inf, yes.
Valid partition. Maximum of left: max(2, -inf) = 2. Minimum of right: min(+inf, 3) = 3. The median is (2 + 3) / 2 = 2.5.
def findMedianSortedArrays(nums1, nums2):
# Always binary search on the smaller array
if len(nums1) > len(nums2):
nums1, nums2 = nums2, nums1
m, n = len(nums1), len(nums2)
lo, hi = 0, m
half = (m + n + 1) // 2
while lo <= hi:
i = (lo + hi) // 2
j = half - i
# Sentinel values for edge partitions
left_A = nums1[i - 1] if i > 0 else float('-inf')
right_A = nums1[i] if i < m else float('inf')
left_B = nums2[j - 1] if j > 0 else float('-inf')
right_B = nums2[j] if j < n else float('inf')
if left_A > right_B:
# Too many from A on the left, move cut left
hi = i - 1
elif left_B > right_A:
# Too few from A on the left, move cut right
lo = i + 1
else:
# Valid partition
max_left = max(left_A, left_B)
if (m + n) % 2 == 1:
return max_left
min_right = min(right_A, right_B)
return (max_left + min_right) / 2The function starts by ensuring nums1 is the smaller array. It then binary searches over the partition index i in nums1. For each candidate, it computes j so that the left side has exactly half the total elements. The sentinels handle the cases where a partition sits at the very start or very end of an array. Once the cross conditions are satisfied, it reads the median directly from the boundary values.
If you binary-searched on the larger array instead, the algorithm would still be correct, but it would run in instead of . Worse, you could run into cases where j becomes negative because the larger array can contribute more than half the elements by itself. Searching on the smaller array avoids this entirely: since i ranges from 0 to m and m <= n, the derived j = half - i is always between 0 and n.
The median is just the special case where you want the element at position (m + n + 1) // 2. The same partition idea generalizes: to find the kth smallest element across both arrays, set half = k instead of (m + n + 1) // 2. You binary search for a partition with exactly k elements on the left side, check the same cross conditions, and return the maximum of the left side. Everything else, including the sentinel logic, the binary search bounds, and the swap to ensure you search the smaller array, remains identical.
The only subtlety is that i can now range up to min(m, k) instead of m, since you cannot take more than k elements from one array. And the lower bound is max(0, k - n), since if k > n you must take at least k - n elements from A.
Time: . The binary search runs on the smaller array, halving the range each step.
Space: . Only a constant number of variables are used regardless of input size.
You're probably looking at the median-of-two-sorted-arrays partition when:
A[i-1] <= B[j] and B[j-1] <= A[i].<= everything on the right."Common templates:
k, not just the middle; same partition logic with a shifted left-count target.Practice Problems (1)