Heap / Priority Queue
Maintain a max-heap for the lower half and a min-heap for the upper half. Balance them to differ by at most 1. The median is the top of the larger heap, or the average of both tops.
LeetCode 295 motivates the two-heap approach. Finding the median of a static sorted array is trivial: pick the middle element. The challenge arises when numbers arrive one at a time from a stream and you need to report the current median after each insertion, without re-sorting the whole collection every time.
The key observation is that you do not need the full sorted order. You only need access to the one or two elements closest to the middle. Two heaps give you exactly that.
Split the stream into a lower half and an upper half:
lo is a max-heap holding the smaller half of all numbers seen so far. Its root is the largest element in the lower half.hi is a min-heap holding the larger half. Its root is the smallest element in the upper half.Maintain two invariants:
lo is less than or equal to every element in hi.len(lo) is either equal to len(hi) or exactly one more.When these invariants hold, the median is always at the boundary. If the heaps are equal in size, the median is the average of the two roots. If lo is larger by one, the median is lo's root.
A clean two-step approach keeps the invariants intact without requiring a long chain of conditionals.
Step 1: Push the new number onto lo. Since lo is a max-heap, this places the number in the lower partition, but lo's new root might now be larger than hi's root, violating invariant 1.
Step 2: Fix invariant 1 by moving lo's root to hi. This guarantees the cross-partition ordering is correct.
After step 2, hi might be larger than lo, violating invariant 2. If so, move hi's root back to lo.
This unconditional push-then-rebalance pattern handles all cases: numbers smaller than the current median, numbers larger, and numbers that fall exactly at the median.
Stream: [5, 15, 1, 3]
Insert 5:
[5], hi: [][], hi: [5][5], hi: []Insert 15:
[15, 5] (max-heap, 15 at root), hi: [][5], hi: [15]Insert 1:
[5, 1], hi: [15][1], hi: [5, 15][5, 1], hi: [15]Insert 3:
[5, 1, 3], hi: [15][3, 1], hi: [5, 15]The sorted sequence at this point is [1, 3, 5, 15]. The true median is (3 + 5) / 2 = 4.0. Correct.
import heapq
class MedianFinder:
def __init__(self):
self.lo = [] # max-heap for lower half (negate values)
self.hi = [] # min-heap for upper half
def addNum(self, num: int) -> None:
# Step 1: push to lower half
heapq.heappush(self.lo, -num)
# Step 2: balance cross-partition order
heapq.heappush(self.hi, -heapq.heappop(self.lo))
# Step 3: rebalance sizes (lo may have one more, never hi)
if len(self.hi) > len(self.lo):
heapq.heappush(self.lo, -heapq.heappop(self.hi))
def findMedian(self) -> float:
if len(self.lo) > len(self.hi):
return float(-self.lo[0])
return (-self.lo[0] + self.hi[0]) / 2.0Python's heapq module is a min-heap only. Simulating a max-heap requires negating all stored values: pushing -num means the most negative value (i.e., the largest original value) sits at the root. When you pop, negate again to recover the original value.
This is why every push to lo uses -num and every read from lo's root uses -self.lo[0]. The hi heap stores values as-is since it is already a min-heap.
Each insertion performs a constant number of heap pushes and pops on heaps of size . Each heap operation is , so insertion is . Reading the median is since it only inspects the roots. Space is to store all elements.
LeetCode 480 adds a twist: a harder variant maintains the running median over a fixed-size sliding window: as the window advances, you must remove the element that falls out while adding the new one. Heaps do not support arbitrary removal efficiently, so the standard solution uses lazy deletion.
Keep a dictionary tracking which elements have been "logically deleted," meaning elements that have left the window but are still physically in a heap. When you pop from a heap, check whether the root is in the deleted set; if so, discard it and pop again. This does not require any upfront restructuring; the deleted entries simply get skipped the next time they bubble to the top.
import heapq
from collections import defaultdict
def medianSlidingWindow(nums: list[int], k: int) -> list[float]:
lo: list[int] = [] # max-heap (negated values)
hi: list[int] = [] # min-heap
deleted: dict[int, int] = defaultdict(int)
def get_valid_top(heap: list, negate: bool) -> float:
while heap:
val = -heap[0] if negate else heap[0]
if deleted[val] > 0:
deleted[val] -= 1
heapq.heappop(heap)
else:
break
return -heap[0] if negate else heap[0]
# seed first window
for num in nums[:k]:
heapq.heappush(lo, -num)
for _ in range(k // 2):
heapq.heappush(hi, -heapq.heappop(lo))
results = []
def current_median() -> float:
lo_top = get_valid_top(lo, negate=True)
if k % 2 == 1:
return float(lo_top)
hi_top = get_valid_top(hi, negate=False)
return (lo_top + hi_top) / 2.0
results.append(current_median())
for i in range(k, len(nums)):
incoming, outgoing = nums[i], nums[i - k]
deleted[outgoing] += 1
# add incoming to the correct side and rebalance
if incoming <= get_valid_top(lo, negate=True):
heapq.heappush(lo, -incoming)
else:
heapq.heappush(hi, incoming)
# rebalance sizes accounting for lazy deletions
while len(lo) - len(hi) > (k + 1) // 2:
heapq.heappush(hi, -heapq.heappop(lo))
while len(hi) > len(lo):
heapq.heappush(lo, -heapq.heappop(hi))
results.append(current_median())
return resultsThe lazy deletion approach preserves overall complexity. Each element is pushed once, lazily deleted once, and eventually popped once: three heap operations of each, over n elements.
You're probably looking at the two-heap pattern when:
Common templates:
|len(lo) - len(hi)| <= 1, median is one or both roots. Example: Find Median from Data Stream.Practice Problems (2)