Prime Numbers: Sieve of Eratosthenes, Trial Division, Miller-Rabin

Prime Numbers: Sieve of Eratosthenes, Trial Division, and the Miller-Rabin Test

Prime number problems appear at FAANG interviews, fintech screens, and competitive programming. The two foundational tasks are: (1) check whether a single number is prime, and (2) find all primes up to N. Each has a classic algorithm — trial division for the first, Sieve of Eratosthenes for the second — and several optimizations and variants. This guide covers the standard approaches with working code, when to use each, and the variations interviewers ask as follow-ups.

Problem 1: Check Whether a Single Number Is Prime

Approach: Trial Division (Standard Answer)

Test divisibility by all integers from 2 up to sqrt(n). If any divides n, n isn’t prime.

def is_prime(n: int) -> bool:
    """O(sqrt(n)) time, O(1) space."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3
    if n % 2 == 0:
        return False
    # Only test odd divisors from 3
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


# Tests
print(is_prime(1))    # False
print(is_prime(2))    # True
print(is_prime(17))   # True
print(is_prime(100))  # False
print(is_prime(997))  # True

Complexity: O(√n) time, O(1) space.

Why √n? If n has a divisor d > √n, then n / d is a divisor < √n. So if no divisor exists below √n, no divisor exists at all — n is prime.

Optimization: 6k ± 1

Every prime > 3 has the form 6k ± 1 (others are divisible by 2 or 3). Testing only these candidates skips two-thirds of the work.

def is_prime_6k(n: int) -> bool:
    """Optimized trial division using 6k ± 1 form."""
    if n < 2:
        return False
    if n < 4:
        return True
    if n % 2 == 0 or n % 3 == 0:
        return False
    i = 5
    while i * i <= n:
        if n % i == 0 or n % (i + 2) == 0:
            return False
        i += 6
    return True

Same complexity, smaller constant factor. Worth knowing for competitive-programming-flavored interviews.

Problem 2: Find All Primes Up to N

Approach: Sieve of Eratosthenes (Standard Answer)

Mark all composites up to N. Start at 2; mark every multiple of 2 as composite. Move to the next unmarked number; mark its multiples; repeat.

def sieve_of_eratosthenes(n: int) -> list[int]:
    """O(n log log n) time, O(n) space."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            # Start marking from i*i (smaller multiples already marked)
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i in range(n + 1) if is_prime[i]]


# Test
print(sieve_of_eratosthenes(30))
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

Complexity: O(n log log n) time, O(n) space.

Why start marking from i² rather than 2i? Multiples 2i, 3i, …, (i-1)i have already been marked by smaller primes. Starting at i² is the first multiple of i not yet marked, saving roughly half the work.

Optimization: Linear Sieve

O(n) time using each composite exactly once. More complex to implement; useful for n > 10⁷ where the log log factor matters.

def linear_sieve(n: int) -> list[int]:
    """O(n) time."""
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    primes = []
    smallest_prime = [0] * (n + 1)
    for i in range(2, n + 1):
        if is_prime[i]:
            primes.append(i)
            smallest_prime[i] = i
        for p in primes:
            if p > smallest_prime[i] or i * p > n:
                break
            is_prime[i * p] = False
            smallest_prime[i * p] = p
    return primes

Each composite is marked by its smallest prime factor exactly once, giving O(n) total marking operations.

Problem 3: Probabilistic Primality (Miller-Rabin)

For very large n (cryptographic sizes — 1024+ bits), trial division is too slow even at O(√n). The Miller-Rabin test runs in O(k log³ n) where k is the number of rounds, with error probability ≤ 4^(-k).

import random

def miller_rabin(n: int, k: int = 20) -> bool:
    """Probabilistic primality test. Error <= 4^(-k)."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False

    # Write n-1 as 2^r × d
    r, d = 0, n - 1
    while d % 2 == 0:
        r += 1
        d //= 2

    for _ in range(k):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

For interview purposes, mention Miller-Rabin if the interviewer asks about cryptographic-scale primes. For typical interview-sized n, trial division or sieve is sufficient.

Common Variations

Count primes less than n

(LeetCode #204) Apply the sieve, count True entries up to n. The sieve is faster than testing each number individually.

Find the nth prime

Sieve up to a generous upper bound (rough estimate: nth prime ≈ n × ln(n) for large n). Index into the list of primes.

Goldbach’s conjecture / sum of two primes

Unproven for all even n > 2, but verified up to extremely large values. For interview-sized inputs, sieve up to n then check pairs.

Prime factorization

Trial-divide by small primes from the sieve until n becomes 1 or until √n. Returns the multiset of prime factors.

def factorize(n: int) -> list[int]:
    factors = []
    i = 2
    while i * i <= n:
        while n % i == 0:
            factors.append(i)
            n //= i
        i += 1
    if n > 1:
        factors.append(n)
    return factors

Twin primes

Pairs (p, p+2) where both are prime. Found via sieve plus pairwise check.

Common Mistakes

  • Forgetting to handle n < 2. 0 and 1 are not prime. Returning True for 1 is a common bug.
  • Using i <= sqrt(n) with floats. Floating-point sqrt can have rounding issues at boundary values. Use i * i <= n with integer arithmetic, or int(n ** 0.5) + 1 with care.
  • Sieve indexing errors. Off-by-one in the array size; should be n + 1 for indices 0..n. Forgetting to mark 0 and 1 as composite.
  • Marking from 2i instead of i². Correct but does redundant work. The optimization to start at i² is standard.
  • Using trial division when sieve is faster. If you need many primes up to N, sieve once. Calling is_prime in a loop is asymptotically slower.
  • Forgetting Miller-Rabin’s probabilistic nature. With enough rounds (k = 20–40), the error probability is negligible, but it’s not zero. For cryptographic guarantees, use proven-deterministic variants (with specific witnesses for bounded n).

Frequently Asked Questions

What’s the expected interview answer?

For a single primality check: trial division to √n, with the 6k±1 optimization if you have time. For finding all primes up to N: Sieve of Eratosthenes. State the complexities. Strong candidates also discuss when to switch to Miller-Rabin (very large n) and the linear sieve (very large N where the log log factor matters).

Why is the sieve faster than calling is_prime in a loop?

Sieve is O(n log log n); a loop of is_prime calls is O(n × √n) = O(n^1.5). For n = 10⁶, sieve is ~3.5M operations, loop is ~10⁹ operations. The sieve reuses work; the loop redoes work. For finding primes up to N, sieve always wins.

How do interviewers test primality knowledge?

Most often: “implement is_prime” or “find all primes up to N.” Less often: probabilistic primality, prime factorization, Goldbach-flavored problems. The expected answers for the basics are well-known; interviewers care that you write them cleanly and discuss complexity correctly.

How does this relate to cryptography?

RSA, ElGamal, and similar use very large primes (1024–4096 bits). Generation is randomized with primality testing via Miller-Rabin. The math behind RSA security is the difficulty of factoring composite numbers; primality testing is fast (polynomial), factoring is hard (sub-exponential at best). Strong candidates for cryptography-flavored roles know this distinction.

What about the AKS primality test?

AKS (2002) is the first deterministic polynomial-time primality test. Asymptotically O(log^6 n) — better than trial division, worse than Miller-Rabin in practice. Interesting theoretically; rarely used in production due to high constant factors. Mention as a footnote if asked about deterministic primality; don’t implement it in interviews.

See also: Compute x^y for Floats and Negative ExponentsCheck if a Number Is a Power of TwoTwo Sum: Find a Pair in an Array

💡Strategies for Solving This Problem

Finding Primes Efficiently

Usually one of these: check if number is prime, find all primes up to n, or find nth prime. Each has different tricks. Got this at Google in 2023.

Approach 1: Check If Prime

Naive: Try dividing by all numbers from 2 to n-1. O(n) - too slow.

Better: Only check up to √n. If n has a factor larger than √n, it must also have one smaller than √n.

Even better: Check 2, then only odd numbers up to √n. O(√n).

Approach 2: Find All Primes Up To N

Sieve of Eratosthenes - the classic algorithm.

  1. Start with all numbers marked as prime
  2. For each prime p, mark all multiples of p as not prime
  3. Continue for all p up to √n

O(n log log n) time, O(n) space. Much faster than checking each number individually.

Why Sieve Works

Every composite number has a prime factor ≤ √n. So if we cross out all multiples of primes up to √n, only primes remain.

Optimizations

Start crossing out from p². All smaller multiples already crossed out by smaller primes.

Only store odd numbers (except 2). Saves half the space.

At Google

They asked "Find all primes up to 1 million." I used Sieve of Eratosthenes. Then asked "How would you optimize for space?" That's when I mentioned segmented sieve for very large ranges.

Also asked about primality testing for very large numbers - mentioned Miller-Rabin probabilistic test.

Scroll to Top