← back

Math

Sieve of Eratosthenes

Mark all multiples of each prime as composite up to √n. The remaining unmarked numbers are prime. Efficiently finds all primes up to n in O(n log log n).

Sieve of Eratosthenes

Finding all prime numbers up to some limit n is a foundational task in competitive programming. A naive approach checks each number individually for primality by trial division, running in O(nn)O(n\sqrt{n}) time. The Sieve of Eratosthenes solves the same problem in O(nloglogn)O(n \log \log n), effectively linear for practical purposes, by systematically eliminating composites rather than confirming primes.

The Core Idea

Start by assuming every number from 2 to n is prime. Then, for each prime p you encounter, mark all of its multiples as composite. A number that survives unmarked after processing every prime up to n\sqrt{n} must itself be prime, because it has no prime factor smaller than or equal to its square root.

The key optimization is starting the marking pass for prime p at p * p rather than 2 * p. Every multiple of p smaller than p * p, such as 2p, 3p, 4p, has a prime factor smaller than p and was already marked in an earlier pass. Starting at p2p^2 avoids redundant work and is what makes the algorithm efficient.

Walked Example: Sieving Up to 30

Begin with all numbers from 2 to 30 marked as prime.

p = 2: Mark 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30 as composite. (Start at 4 = 2*2.)

p = 3: 3 is still prime. Mark 9, 12, 15, 18, 21, 24, 27, 30. (Start at 9 = 3*3. Note 6 was already gone.)

p = 5: 5 is still prime. Mark 25, 30. (Start at 25 = 5*5. 10, 15, 20 were already gone.)

Stop: 305.47\sqrt{30} \approx 5.47, so we check no further primes.

Remaining primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29.

The Standard Implementation

1
2
3
4
5
6
7
8
def sieve(n):
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(n**0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i in range(2, n + 1) if is_prime[i]]

The outer loop runs only up to n\sqrt{n}. The inner loop starts at i * i and steps by i. The combined work across all primes pnp \leq \sqrt{n} sums to roughly n/2+n/3+n/5+n/7+n/2 + n/3 + n/5 + n/7 + \ldots, which converges to O(nloglogn)O(n \log \log n) by properties of the sum of reciprocals of primes.

Smallest Prime Factor Sieve

A common extension tracks the smallest prime factor (SPF) of every composite number during the sieve, rather than just a boolean. This allows any number up to n to be fully factorized in O(logn)O(\log n) time with a simple loop.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
def spf_sieve(n):
    spf = list(range(n + 1))      # spf[i] = i initially (treat as own smallest factor)
    for i in range(2, int(n**0.5) + 1):
        if spf[i] == i:           # i is prime
            for j in range(i * i, n + 1, i):
                if spf[j] == j:   # not yet assigned a smaller factor
                    spf[j] = i
    return spf

def factorize(x, spf):
    factors = []
    while x > 1:
        factors.append(spf[x])
        x //= spf[x]
    return factors

If a problem asks you to factorize many different numbers (all up to some limit n), build the SPF sieve once in O(nloglogn)O(n \log \log n), then each factorization query costs only O(logn)O(\log n). This is far superior to running trial division separately for each query.

Segmented Sieve for Large Ranges

The standard sieve allocates an array of size n. When n approaches 10810^8 or 10910^9, memory becomes a constraint. The segmented sieve processes the range [lo, hi] in blocks, using only O(n+block_size)O(\sqrt{n} + \text{block\_size}) memory at any one time.

The approach: first sieve all primes up to hi\sqrt{hi} using the standard sieve. Then, for each block [lo, lo + block_size), create a small boolean array and mark composites using each small prime. Only the small primes array and one block live in memory simultaneously.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
def segmented_sieve(lo, hi):
    limit = int(hi**0.5) + 1
    small_primes = sieve(limit)          # standard sieve for small primes
    block_size = limit
    result = []

    for low in range(lo, hi + 1, block_size):
        high = min(low + block_size - 1, hi)
        is_prime = [True] * (high - low + 1)

        if low == 0:
            if len(is_prime) > 0: is_prime[0] = False
        if low <= 1 <= high:
            is_prime[1 - low] = False

        for p in small_primes:
            start = max(p * p, ((low + p - 1) // p) * p)
            for j in range(start, high + 1, p):
                is_prime[j - low] = False

        for i in range(high - low + 1):
            if is_prime[i]:
                result.append(low + i)

    return result

The segmented sieve is the right tool when you need primes in a range like [1012,1012+106][10^{12}, 10^{12} + 10^6] where building a full sieve up to 101210^{12} would be impossible.

LeetCode Variants

When to Reach for the Sieve

The sieve is appropriate when a problem requires either all primes up to n, repeated primality checks for many different numbers, or fast factorization of many values. If you only need to check a single large number for primality, trial division up to its square root is simpler. If you need to check numbers in the range of 101210^{12} or larger, deterministic Miller-Rabin is the right approach instead.

Complexity Summary

The standard sieve runs in O(nloglogn)O(n \log \log n) time with O(n)O(n) space. The SPF sieve has the same asymptotic bounds but carries a slightly larger constant. The segmented sieve runs in O((hilo+1)logloghi)O((hi - lo + 1) \log \log hi) time for the range and uses O(hi+block_size)O(\sqrt{hi} + \text{block\_size}) space, making it practical for ranges where a full sieve would exhaust memory.

Recognizing this pattern

You're probably looking at the Sieve of Eratosthenes when:

  • You need to count or enumerate primes up to n with n around 10^6 to 10^7.
  • You need the smallest prime factor for every integer in a range so you can factor many numbers quickly.
  • The problem asks about squarefree integers, totient values, or anything that depends on factorizations of many small numbers.
  • You'd otherwise be calling a primality check inside a loop over a wide range: batch the work into one sieve instead.

Common templates:

  • Boolean primality sieve: mark composites by stepping through multiples of each prime. Example: Count Primes.
  • Smallest-prime-factor sieve: record the SPF of each integer to factor any x \leq n in O(log x). Example: Closest Prime Numbers in Range.
  • Squarefree / multiplicative-function sieve: sweep multiples of p^2 to flag non-squarefree, or build totient / Möbius alongside primes. Example: Number of Squareful Arrays.
  • Segmented sieve: sieve a range [lo, hi] using primes up to sqrt(hi) when hi is too large to sieve fully. Example: Prime Pairs With Target Sum.