Math
Mark all multiples of each prime as composite up to √n. The remaining unmarked numbers are prime. Efficiently finds all primes up to n in O(n log log n).
Finding all prime numbers up to some limit n is a foundational task in competitive programming. A naive approach checks each number individually for primality by trial division, running in time. The Sieve of Eratosthenes solves the same problem in , effectively linear for practical purposes, by systematically eliminating composites rather than confirming primes.
Start by assuming every number from 2 to n is prime. Then, for each prime p you encounter, mark all of its multiples as composite. A number that survives unmarked after processing every prime up to must itself be prime, because it has no prime factor smaller than or equal to its square root.
The key optimization is starting the marking pass for prime p at p * p rather than 2 * p. Every multiple of p smaller than p * p, such as 2p, 3p, 4p, has a prime factor smaller than p and was already marked in an earlier pass. Starting at avoids redundant work and is what makes the algorithm efficient.
Begin with all numbers from 2 to 30 marked as prime.
p = 2: Mark 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30 as composite. (Start at 4 = 2*2.)
p = 3: 3 is still prime. Mark 9, 12, 15, 18, 21, 24, 27, 30. (Start at 9 = 3*3. Note 6 was already gone.)
p = 5: 5 is still prime. Mark 25, 30. (Start at 25 = 5*5. 10, 15, 20 were already gone.)
Stop: , so we check no further primes.
Remaining primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29.
def sieve(n):
is_prime = [True] * (n + 1)
is_prime[0] = is_prime[1] = False
for i in range(2, int(n**0.5) + 1):
if is_prime[i]:
for j in range(i * i, n + 1, i):
is_prime[j] = False
return [i for i in range(2, n + 1) if is_prime[i]]The outer loop runs only up to . The inner loop starts at i * i and steps by i. The combined work across all primes sums to roughly , which converges to by properties of the sum of reciprocals of primes.
A common extension tracks the smallest prime factor (SPF) of every composite number during the sieve, rather than just a boolean. This allows any number up to n to be fully factorized in time with a simple loop.
def spf_sieve(n):
spf = list(range(n + 1)) # spf[i] = i initially (treat as own smallest factor)
for i in range(2, int(n**0.5) + 1):
if spf[i] == i: # i is prime
for j in range(i * i, n + 1, i):
if spf[j] == j: # not yet assigned a smaller factor
spf[j] = i
return spf
def factorize(x, spf):
factors = []
while x > 1:
factors.append(spf[x])
x //= spf[x]
return factorsIf a problem asks you to factorize many different numbers (all up to some limit n), build the SPF sieve once in , then each factorization query costs only . This is far superior to running trial division separately for each query.
The standard sieve allocates an array of size n. When n approaches or , memory becomes a constraint. The segmented sieve processes the range [lo, hi] in blocks, using only memory at any one time.
The approach: first sieve all primes up to using the standard sieve. Then, for each block [lo, lo + block_size), create a small boolean array and mark composites using each small prime. Only the small primes array and one block live in memory simultaneously.
def segmented_sieve(lo, hi):
limit = int(hi**0.5) + 1
small_primes = sieve(limit) # standard sieve for small primes
block_size = limit
result = []
for low in range(lo, hi + 1, block_size):
high = min(low + block_size - 1, hi)
is_prime = [True] * (high - low + 1)
if low == 0:
if len(is_prime) > 0: is_prime[0] = False
if low <= 1 <= high:
is_prime[1 - low] = False
for p in small_primes:
start = max(p * p, ((low + p - 1) // p) * p)
for j in range(start, high + 1, p):
is_prime[j - low] = False
for i in range(high - low + 1):
if is_prime[i]:
result.append(low + i)
return resultThe segmented sieve is the right tool when you need primes in a range like where building a full sieve up to would be impossible.
n. Build the boolean sieve, then count survivors.right, then scan for the closest pair of primes in [left, right]. The segmented sieve helps when right is large but the range right - left is small.n, then iterate x from 2 to n/2 checking whether both x and n - x are prime.The sieve is appropriate when a problem requires either all primes up to n, repeated primality checks for many different numbers, or fast factorization of many values. If you only need to check a single large number for primality, trial division up to its square root is simpler. If you need to check numbers in the range of or larger, deterministic Miller-Rabin is the right approach instead.
The standard sieve runs in time with space. The SPF sieve has the same asymptotic bounds but carries a slightly larger constant. The segmented sieve runs in time for the range and uses space, making it practical for ranges where a full sieve would exhaust memory.
You're probably looking at the Sieve of Eratosthenes when:
n with n around 10^6 to 10^7.Common templates:
x \leq n in O(log x). Example: Closest Prime Numbers in Range.p^2 to flag non-squarefree, or build totient / Möbius alongside primes. Example: Number of Squareful Arrays.[lo, hi] using primes up to sqrt(hi) when hi is too large to sieve fully. Example: Prime Pairs With Target Sum.Practice Problems (16)