Prefix Sum
Build a prefix sum array where prefix[i] = sum of elements from index 0 to i. Any range [l, r] can then be answered in O(1) as prefix[r] − prefix[l-1]. Extends to 2D matrices.
Imagine you have an array of numbers and a stream of thousands of queries, each asking: "What is the sum of elements from index l to index r?" If you answer each query by iterating from l to r and adding up the elements, each query costs time. With q queries, the total work is , and for large arrays with many queries, this is far too slow. The prefix sum technique eliminates this bottleneck entirely: preprocessing, then per query, no matter how many queries arrive.
The naive approach is straightforward. For each query (l, r), loop from l to r and accumulate the sum:
def range_sum_brute(nums, l, r):
total = 0
for i in range(l, r + 1):
total += nums[i]
return totalThis is fine for a handful of queries on a small array. But if the array has 100,000 elements and you receive 100,000 queries, you are looking at up to 10 billion operations. We need a way to preprocess the array so that each query can be answered in constant time.
The idea is simple: build an auxiliary array where each entry stores the cumulative sum of all elements from the beginning of the array up to (but not including) that index. Formally, define prefix[i] as the sum of the first i elements. That is, prefix[0] = 0, prefix[1] = nums[0], prefix[2] = nums[0] + nums[1], and so on. Then the sum of any range [l, r] is just prefix[r + 1] - prefix[l].
Why does this work? prefix[r + 1] is the sum of elements from index 0 through index r. prefix[l] is the sum of elements from index 0 through index l - 1. Subtracting the second from the first cancels out the elements before l, leaving exactly the sum from l to r.
Take the array nums = [1, 3, 5, 7, 9]. We build the prefix sum array step by step:
prefix[0] = 0 (the sum of zero elements)prefix[1] = prefix[0] + nums[0] = 0 + 1 = 1prefix[2] = prefix[1] + nums[1] = 1 + 3 = 4prefix[3] = prefix[2] + nums[2] = 4 + 5 = 9prefix[4] = prefix[3] + nums[3] = 9 + 7 = 16prefix[5] = prefix[4] + nums[4] = 16 + 9 = 25So prefix = [0, 1, 4, 9, 16, 25]. Now let us answer some queries:
prefix[4] - prefix[1] = 16 - 1 = 15. Check: 3 + 5 + 7 = 15.prefix[5] - prefix[0] = 25 - 0 = 25. Check: 1 + 3 + 5 + 7 + 9 = 25.prefix[3] - prefix[2] = 9 - 4 = 5. Check: nums[2] = 5.Every query is a single subtraction, which is .
LeetCode 303. Range Sum Query - Immutable is the textbook problem here. You preprocess once and answer many queries.
class NumArray:
def __init__(self, nums):
self.prefix = [0]
for num in nums:
self.prefix.append(self.prefix[-1] + num)
def sumRange(self, l, r):
return self.prefix[r + 1] - self.prefix[l]The constructor runs in time and space. Each call to sumRange runs in time.
You might wonder why we start the prefix array with a zero rather than with nums[0]. The reason is uniformity. If we defined prefix[i] as the sum of elements 0 through i (with no leading zero), then the formula for range_sum(l, r) would be prefix[r] - prefix[l - 1], but when l = 0, we would need prefix[-1], which does not exist. We would have to special-case it: "if l is 0, return prefix[r]; otherwise return prefix[r] - prefix[l-1]." The leading zero eliminates this branch entirely. With prefix[0] = 0, the formula prefix[r + 1] - prefix[l] works for every valid range, including ranges that start at index 0.
This is LeetCode 304. Range Sum Query 2D - Immutable. The same idea extends to two dimensions. Suppose you have a 2D grid (matrix) of numbers and you want to answer queries of the form "what is the sum of all elements in the rectangle from (r1, c1) to (r2, c2)?" The brute force approach iterates over the entire rectangle, costing per query. With a 2D prefix sum, each query is .
Define prefix[i][j] as the sum of all elements in the rectangle from (0, 0) to (i-1, j-1). Building it uses the inclusion-exclusion principle:
prefix[i][j] = matrix[i-1][j-1] + prefix[i-1][j] + prefix[i][j-1] - prefix[i-1][j-1]
We add the top region and the left region, but their overlap (the top-left rectangle) gets counted twice, so we subtract it once.
Consider this 3x3 grid:
We build a 4x4 prefix array (one extra row and column of zeros):
prefix[0][*] = [0, 0, 0, 0]prefix[1][*] = [0, 1, 3, 6] (cumulative sums of row 0)prefix[2][*] = [0, 5, 12, 21] (adding row 1)prefix[3][*] = [0, 12, 27, 45] (adding row 2)To query the sum of the rectangle from (1, 1) to (2, 2), which contains elements 5, 6, 8, 9, we use:
sum = prefix[3][3] - prefix[1][3] - prefix[3][1] + prefix[1][1]
sum = 45 - 6 - 12 + 1 = 28
Check: 5 + 6 + 8 + 9 = 28. The formula subtracts the region above the rectangle and the region to its left, then adds back the top-left corner that was subtracted twice.
class PrefixSum2D:
def __init__(self, matrix):
if not matrix or not matrix[0]:
self.prefix = [[0]]
return
m, n = len(matrix), len(matrix[0])
self.prefix = [[0] * (n + 1) for _ in range(m + 1)]
for i in range(1, m + 1):
for j in range(1, n + 1):
self.prefix[i][j] = (
matrix[i - 1][j - 1]
+ self.prefix[i - 1][j]
+ self.prefix[i][j - 1]
- self.prefix[i - 1][j - 1]
)
def query(self, r1, c1, r2, c2):
return (
self.prefix[r2 + 1][c2 + 1]
- self.prefix[r1][c2 + 1]
- self.prefix[r2 + 1][c1]
+ self.prefix[r1][c1]
)Building the 2D prefix array takes time and space. Each query is .
Prefix sums assume the underlying array is static, meaning it does not change between queries. If the array is mutable (elements can be updated between queries), a prefix sum breaks down because a single point update forces an rebuild of the suffix of the prefix array. This is exactly the setup of LeetCode 307. Range Sum Query - Mutable, and it forces a switch to a more sophisticated structure.
A Fenwick tree (binary indexed tree) supports point updates and prefix-sum queries in each, with space and a remarkably short implementation. A segment tree is more general: it supports arbitrary range queries (sum, min, max, gcd) and, with lazy propagation, range updates as well, all in . The trade-off is that prefix sums give you queries but no updates; Fenwick and segment trees give you on both. For LeetCode 308. Range Sum Query 2D - Mutable, a 2D Fenwick tree achieves per update and per query.
Prefix sums rely on the operation having an inverse. Addition works because subtraction undoes it: to get the sum of a range, we subtract one prefix from another. XOR works too, because XOR is its own inverse: prefix_xor[r+1] XOR prefix_xor[l] gives the XOR of the range [l, r]. But maximum and minimum do not have inverses. Knowing the max of [0, r] and the max of [0, l-1] does not tell you the max of [l, r]. For range max/min queries on a static array, you need a sparse table ( preprocessing, per query) or a segment tree.
For 1D prefix sums: building the prefix array takes time and space. Each range sum query takes time. Answering q queries after preprocessing costs total, compared to for the brute force approach.
For 2D prefix sums: building the prefix matrix takes time and space. Each rectangle sum query takes time. This extends naturally to higher dimensions, though the inclusion-exclusion formula grows combinatorially: a 3D prefix sum query involves 8 terms instead of 4.
The prefix sum technique is one of the most fundamental tools in competitive programming and interview preparation. Whenever you see repeated range queries on a static array, reach for prefix sums first. The pattern is simple, the implementation is short, and the performance improvement, from per query to per query, is dramatic.
You're probably looking at a prefix sum range query when:
Common templates:
prefix[i + 1] = prefix[i] + nums[i], answer in . Example: Range Sum Query - Immutable.prefix[r2+1][c2+1] - prefix[r1][c2+1] - prefix[r2+1][c1] + prefix[r1][c1]. Example: Range Sum Query 2D - Immutable.Practice Problems (50)