← back

Heap / Priority Queue

Top-K Elements

Maintain a min-heap of size k. For each new element, push it and pop the smallest if size exceeds k. The heap contains the k largest elements at the end. For k-th smallest, use a max-heap.

Finding Top-K Elements with Heaps

The "top-K" family of problems asks you to efficiently extract the K largest values, the K most frequent elements, or the K closest points from a larger collection. Sorting works, but it costs O(nlogn)O(n \log n) regardless of K. When K is much smaller than n, a heap-based approach cuts that to O(nlogK)O(n \log K), a meaningful win on large inputs.

The Counter-Intuitive Min-Heap Trick

The canonical version, LeetCode 215, asks for the Kth largest element of an array. The natural instinct is to use a max-heap to track the largest elements: keep pulling off the top and you get the biggest values. But that approach requires storing all nn elements and then extracting K of them. The smarter move is a size-K min-heap.

Here is the insight: if you maintain a min-heap of exactly K elements, the root is always the smallest of the K largest seen so far, the threshold that a new element must beat to deserve a spot. For every new element you see, compare it against the root. If the new element is larger, evict the root and push the new element in. At the end, the heap contains exactly the K largest elements, and you never needed more than K slots.

1
2
3
4
5
6
7
8
9
import heapq

def kth_largest(nums: list[int], k: int) -> int:
    heap = []
    for num in nums:
        heapq.heappush(heap, num)
        if len(heap) > k:
            heapq.heappop(heap)   # evict the smallest-of-K
    return heap[0]               # root is the Kth largest

Walk through nums = [3, 1, 5, 12, 2, 11] with k = 3:

  • Push 3 → heap: [3]
  • Push 1 → heap: [1, 3]
  • Push 5 → heap: [1, 3, 5]
  • Push 12, size exceeds 3, pop 1 → heap: [3, 5, 12]
  • Push 2, size exceeds 3, pop 2 → heap: [3, 5, 12] (2 was immediately evicted)
  • Push 11, size exceeds 3, pop 3 → heap: [5, 11, 12]

The root 5 is the 3rd largest. The heap contains {5, 11, 12}, which are indeed the three largest elements.

Complexity

Each of the nn elements is pushed once and popped at most once. Each heap operation on a size-K heap costs O(logK)O(log K). Total: O(nlogK)O(n \log K) time, O(K)O(K) space. When K is small (say, K = 10 out of n = 1,000,000), this is substantially faster than sorting.

Frequency Variant: Top K Frequent Elements

For LeetCode 347, when the problem asks for the K most frequent values, build a frequency map first and then run the same size-K heap logic over the counts.

1
2
3
4
5
6
7
8
9
10
11
12
from collections import Counter
import heapq

def top_k_frequent(nums: list[int], k: int) -> list[int]:
    freq = Counter(nums)
    # heap of (count, element); min-heap evicts least frequent
    heap = []
    for elem, count in freq.items():
        heapq.heappush(heap, (count, elem))
        if len(heap) > k:
            heapq.heappop(heap)
    return [elem for count, elem in heap]

For nums = [1,1,1,2,2,3] with k = 2, the frequency map is {1:3, 2:2, 3:1}. The heap processes all three entries. After all insertions the heap holds the two entries with the highest counts: (2, 2) and (3, 1), having evicted count 1. The result is elements 1 and 2, which are the two most frequent. Correct.

Python's heapq.nlargest(k, freq, key=freq.get) is a one-liner that does the same thing under the hood and is perfectly acceptable in interviews.

K Closest Points to the Origin

LeetCode 973 shows how the same pattern applies to geometric problems. Compute each point's squared distance (no need for the square root; it preserves ordering) and use a max-heap of size K, or equivalently, a min-heap with negated distances.

1
2
3
4
5
6
7
8
9
10
11
import heapq

def k_closest(points: list[list[int]], k: int) -> list[list[int]]:
    # max-heap: store (-distance, point) so the farthest sits at root
    heap = []
    for x, y in points:
        dist = x * x + y * y
        heapq.heappush(heap, (-dist, [x, y]))
        if len(heap) > k:
            heapq.heappop(heap)   # evict the farthest
    return [point for _, point in heap]

Here the heap evicts the farthest point seen so far, leaving only the K closest at the end. Negating the distance turns Python's min-heap into a max-heap for this comparison key.

Python's heapq and the Negation Pattern

Python's heapq module only provides a min-heap. Two strategies handle max-heap needs:

  • Negate values: store -x instead of x; the smallest negative corresponds to the largest original value.
  • Use heapq.nlargest or heapq.nsmallest: these are convenient for one-shot extraction but internally sort, so they are O(nlogK)O(n \log K), the same asymptotic complexity as the manual approach.

For the Kth largest element specifically, there is also a Quickselect solution that achieves O(n)O(n) average time, but the heap approach is simpler to implement under interview pressure and has deterministic O(nlogK)O(n \log K) worst-case behavior.

Kth Smallest in a Sorted Matrix

With LeetCode 378, rows and columns are each sorted, so you can treat the matrix like K sorted lists and merge them with a min-heap. Seed the heap with the first element of every row, then pop the minimum K times. Each pop advances one row's pointer, and you push the next element from that same row.

1
2
3
4
5
6
7
8
9
10
11
12
13
import heapq

def kth_smallest_matrix(matrix: list[list[int]], k: int) -> int:
    n = len(matrix)
    # (value, row, col)
    heap = [(matrix[r][0], r, 0) for r in range(n)]
    heapq.heapify(heap)

    for _ in range(k):
        val, r, c = heapq.heappop(heap)
        if c + 1 < n:
            heapq.heappush(heap, (matrix[r][c + 1], r, c + 1))
    return val

For a 3x3 matrix the heap starts with the leftmost column. Each pop returns the global minimum of all frontier elements, and after K pops you have the Kth smallest. Time is O(Klogn)O(K \log n) where n is the number of rows, since the heap never exceeds size n.

Find K Pairs with Smallest Sums

LeetCode 373 extends this idea: given two sorted arrays nums1 and nums2, find the K pairs (a, b) with the smallest a + b. Seed a min-heap with every (nums1[i] + nums2[0], i, 0), then each time you pop a pair at index (i, j), push (nums1[i] + nums2[j+1], i, j+1) to explore the next candidate.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
import heapq

def k_smallest_pairs(nums1: list[int], nums2: list[int], k: int) -> list[list[int]]:
    if not nums1 or not nums2:
        return []
    heap = [(nums1[i] + nums2[0], i, 0) for i in range(min(k, len(nums1)))]
    heapq.heapify(heap)

    result = []
    while heap and len(result) < k:
        total, i, j = heapq.heappop(heap)
        result.append([nums1[i], nums2[j]])
        if j + 1 < len(nums2):
            heapq.heappush(heap, (nums1[i] + nums2[j + 1], i, j + 1))
    return result

The key insight is that if (i, j) is a valid pair, then (i, j+1) is the next cheapest extension along that row. We never need to push (i+1, j) because every row's j=0 entry is already in the initial seed. This keeps the heap at most size K and runs in O(KlogK)O(K \log K) time.

IPO: Greedy with Two Heaps

Some problems combine a selection constraint with an optimization objective. In the IPO problem (LC 502), you have starting capital w, and each project has a minimum capital requirement and a profit. You can complete at most K projects sequentially, and finishing a project adds its profit to your capital.

Sort projects by capital requirement. Use a pointer to sweep through them: as your capital grows, push newly affordable projects onto a max-heap keyed by profit. Each round, pop the most profitable affordable project.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
import heapq

def find_maximized_capital(k: int, w: int, profits: list[int], capital: list[int]) -> int:
    projects = sorted(zip(capital, profits))
    max_heap = []
    i = 0

    for _ in range(k):
        while i < len(projects) and projects[i][0] <= w:
            heapq.heappush(max_heap, -projects[i][1])
            i += 1
        if not max_heap:
            break
        w += -heapq.heappop(max_heap)
    return w

The sorted order ensures you only consider each project once. The max-heap (via negation) always surfaces the best available profit. Total time is O(nlogn)O(n \log n) for sorting plus O(nlogn)O(n \log n) for heap operations.

Minimum Number of Refueling Stops

Take LeetCode 871: a car starts with some fuel and needs to reach a target distance. Along the way are gas stations, each at a known position with a known fuel amount. Find the minimum number of stops needed to reach the target.

The greedy insight: drive as far as you can. When you run out of fuel, "retroactively" refuel at the best station you already passed. A max-heap stores the fuel amounts of all passed stations, so each refuel grabs the maximum available.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
import heapq

def min_refuel_stops(target: int, start_fuel: int, stations: list[list[int]]) -> int:
    max_heap = []
    fuel = start_fuel
    prev = 0
    stops = 0

    for pos, gas in stations + [[target, 0]]:
        fuel -= (pos - prev)
        while fuel < 0 and max_heap:
            fuel += -heapq.heappop(max_heap)
            stops += 1
        if fuel < 0:
            return -1
        heapq.heappush(max_heap, -gas)
        prev = pos
    return stops

By deferring the refueling decision, we always pick the station that gives the most fuel. This greedy choice is provably optimal. The heap holds at most n station entries, so the algorithm runs in O(nlogn)O(n \log n) time.

Least Number of Unique Integers after K Removals

LeetCode 1481 adds a twist: given an array of integers and a budget K, remove exactly K elements to minimize the number of distinct values remaining. The strategy: count frequencies, then greedily remove elements starting with the least frequent values first. Removing all copies of a rare value eliminates one distinct value at the cheapest cost.

1
2
3
4
5
6
7
8
9
10
11
12
from collections import Counter

def find_least_num_of_unique_ints(arr: list[int], k: int) -> int:
    freq = Counter(arr)
    counts = sorted(freq.values())

    for i, count in enumerate(counts):
        if k >= count:
            k -= count
        else:
            return len(counts) - i
    return 0

Sorting the frequency counts and removing from the bottom is equivalent to using a min-heap on frequencies. Either way, you spend your removal budget where it eliminates the most distinct values. Time is O(nlogn)O(n \log n) for sorting the counts.

Recognizing this pattern

You're probably looking at the top-K pattern when:

  • The problem says "top K", "Kth largest", "Kth smallest", "K most frequent", or "K closest", and K is a small parameter compared to n.
  • You only need the K best elements, not a full sort, and K is bounded enough that a size-K heap fits comfortably in memory.
  • The data may be streamed and you cannot revisit older elements, so you must maintain the top K online.
  • Average O(n)O(n) via quickselect is acceptable when the problem allows in-place partitioning of an array.

Common templates:

Practice Problems (56)

215 Kth Largest Element in an Array
347 Top K Frequent Elements
373 Find K Pairs with Smallest Sums
378 Kth Smallest Element in a Sorted Matrix
414 Third Maximum Number
502 IPO
703 Kth Largest Element in a Stream
786 K-th Smallest Prime Fraction
857 Minimum Cost to Hire K Workers
871 Minimum Number of Refueling Stops
973 K Closest Points to Origin
1046 Last Stone Weight
1337 The K Weakest Rows in a Matrix
1354 Construct Target Array With Multiple Sums
1383 Maximum Performance of a Team
1481 Least Number of Unique Integers after K Removals
1642 Furthest Building You Can Reach
1675 Minimize Deviation in Array
1705 Maximum Number of Eaten Apples
1792 Maximum Average Pass Ratio
1801 Number of Orders in the Backlog
1825 Finding MK Average
1845 Seat Reservation Manager
1912 Design Movie Rental System
1942 The Number of the Smallest Unoccupied Chair
1962 Remove Stones to Minimize the Total
2034 Stock Price Fluctuation
2099 Find Subsequence of Length K With the Largest Sum
2102 Sequentially Ordinal Rank Tracker
2163 Minimum Difference in Sums After Removal of Elements
2208 Minimum Operations to Halve Array Sum
2233 Maximum Product After K Increments
2336 Smallest Number in Infinite Set
2353 Design a Food Rating System
2386 Find the K-Sum of an Array
2462 Total Cost to Hire K Workers
2497 Maximum Star Sum of a Graph
2503 Maximum Number of Points From Grid Queries
2512 Reward Top K Students
2530 Maximal Score After Applying K Operations
2542 Maximum Subsequence Score
2558 Take Gifts From the Richest Pile
3066 Minimum Operations to Exceed Threshold Value II
3075 Maximize Happiness of Selected Children
3080 Mark Elements on Array by Performing Queries
3092 Most Frequent IDs
3170 Lexicographically Minimum String After Removing Stars
3275 K-th Nearest Obstacle Queries
3408 Design Task Manager
3462 Maximum Sum With at Most K Elements
3478 Choose K Elements With Maximum Sum
3572 Maximize Y‑Sum by Picking a Triplet of Distinct X‑Values
3767 Maximize Points After Choosing K Tasks
3781 Maximum Score After Binary Swaps
3815 Design Auction System
3829 Design Ride Sharing System