Generate rand7 from rand5: Rejection Sampling for Random Sources

Generate Random Numbers in a Range Using a Smaller Random Source

“Given rand5() that returns a uniform integer 1–5, implement rand7() that returns 1–7.” This LeetCode #470 problem is a classic interview question that tests probability intuition and the rejection sampling pattern. The naive approach (combining two rand5 calls and modding) introduces bias; the correct approach uses rejection sampling to preserve uniformity. This guide covers the standard rand_k → rand_n problem family with working code, the rejection-sampling rationale, and the variations.

The Core Problem

Given a function rand5() that returns a uniformly random integer in {1, 2, 3, 4, 5}, implement rand7() that returns a uniformly random integer in {1, 2, 3, 4, 5, 6, 7}. You may not use any random functions other than rand5.

Approach 1: Combine Two rand5 Calls

Two rand5 calls can produce 5 × 5 = 25 distinct outcomes. Map them to a uniform 1..25 distribution, then carve out the 1..7 range with rejection.

def rand7(rand5) -> int:
    """Generate uniform 1..7 using rand5. Average ~2.38 calls to rand5."""
    while True:
        x = (rand5() - 1) * 5 + rand5()  # uniform 1..25
        if x <= 21:
            return ((x - 1) % 7) + 1


# Demo with a real rand5
import random

def rand5_real():
    return random.randint(1, 5)

# Test distribution
counts = [0] * 8  # 0..7, ignoring index 0
for _ in range(70000):
    counts[rand7(rand5_real)] += 1
print(counts[1:])  # roughly [10000] * 7

Why this works: The expression (rand5() - 1) * 5 + rand5() uses the first call as the “tens digit” (in base 5) and the second call as the “ones digit,” producing all integers 1..25 with equal probability.

Why retry on x > 21: 25 outcomes don’t divide evenly by 7. We need 21 = 3 × 7 outcomes for uniform mapping. The 4 outcomes 22..25 are rejected (retry). 21 outcomes / 7 buckets = 3 each, giving uniform 1..7.

Expected calls: Probability of acceptance = 21/25 = 0.84. Expected calls = 2 × (25/21) ≈ 2.38 to rand5 per call to rand7.

Why Naive Approaches Fail

Naive 1: rand5() + rand5() (sums to 2..10)

The sum is NOT uniform. P(sum = 2) = 1/25; P(sum = 6) = 5/25. Distribution is triangular, not uniform.

Naive 2: rand5() % 7

rand5() returns 1..5. Mod 7 gives 1..5 (each with probability 1/5). Doesn’t cover 6 and 7 at all.

Naive 3: (rand5() * 2 – 1)

Gives 1, 3, 5, 7, 9 each with probability 1/5. Doesn’t cover 2, 4, 6 at all.

The correct approach must produce all 7 outputs with equal probability. Only by combining multiple rand5 calls and using rejection sampling can you achieve true uniformity.

Variant: rand5 from rand7 (Easier Direction)

The reverse problem is much simpler:

def rand5_from_rand7(rand7) -> int:
    """Generate uniform 1..5 from rand7. Just retry on 6, 7."""
    while True:
        x = rand7()
        if x <= 5:
            return x

Expected calls: 7/5 = 1.4. Going from a larger to a smaller random source is always easier (just retry on out-of-range outcomes); going from smaller to larger requires combining calls.

Variant: Generate Uniform 1..N from rand_k

Generalize the technique: combine multiple rand_k calls until you have ≥ N possible outcomes; reject any that exceed the largest multiple of N within the combined range.

def rand_n_from_rand_k(rand_k, k: int, n: int) -> int:
    """Generate uniform 1..n using rand_k."""
    # Find the smallest m such that k^m >= n
    m = 0
    range_size = 1
    while range_size < n:
        m += 1
        range_size *= k
    # range_size is k^m
    accept_threshold = (range_size // n) * n  # largest multiple of n <= range_size

    while True:
        # Combine m calls to rand_k, base k
        x = 0
        for _ in range(m):
            x = x * k + (rand_k() - 1)
        x += 1  # convert 0..k^m - 1 to 1..k^m
        if x <= accept_threshold:
            return ((x - 1) % n) + 1

This generalizes to any k → n, including non-prime / non-power cases.

Variant: Equal-Probability Selection from a List

Given a list of N items and uniform random source, select one item uniformly. Trivially: rand_n() returning index. Demonstrates the abstract mapping.

Variant: Random Without Bias on a Biased Coin (Von Neumann)

Given a coin landing heads with probability p (unknown), simulate a fair coin. Pair consecutive flips; accept HT or TH (mapping HT → heads, TH → tails). Reject HH and TT. Pairs are equally likely, regardless of p, so the result is unbiased.

def fair_from_biased(biased_coin) -> bool:
    """Return True with probability 0.5 from biased coin."""
    while True:
        first = biased_coin()
        second = biased_coin()
        if first != second:
            return first

Common Pitfalls

  • Modular bias. Direct mod after combining always introduces bias unless you reject the leftover range first.
  • Forgetting the rejection loop. Without rejection, “wrap around” introduces bias. The loop ensures uniformity.
  • Inefficient combinations. Using more calls than necessary increases expected runtime. Two calls of rand5 give 25 outcomes (sufficient for rand7); three would give 125 (overkill).
  • Confusing 0-indexed vs 1-indexed conventions. Some rand_n functions return 0..n-1, others 1..n. Adjust the offset arithmetic carefully.
  • Missing the rejection efficiency analysis. Strong candidates state expected calls explicitly. The geometric-distribution math is acceptance probability = (largest multiple) / (total), expected calls = 1 / acceptance.

Frequently Asked Questions

What’s the expected interview answer for rand7 from rand5?

Combine two rand5 calls to get uniform 1..25. Reject 22..25 (retry). Map 1..21 to 1..7 via mod. Expected 2.38 calls. Walk through why the combination is uniform (base-5 digits) and why the rejection preserves uniformity (only multiples of 7 are kept).

Can I do this without a retry loop?

Not exactly. There’s no deterministic mapping from a finite combination of rand5 to a uniform rand7 because no power of 5 is a multiple of 7. Any “always accept” scheme introduces bias. Rejection sampling is the canonical correct approach. There are accept-on-first-try variants that succeed with probability ≤ 1, but the expected outcome is still some number of calls — equivalent to rejection.

How efficient is the rand7 from rand5 implementation?

Expected 2 × (25/21) ≈ 2.38 calls per output. The probability of acceptance (21/25) doesn’t depend on input bias; the algorithm is provably correct and efficient. For applications requiring low-latency randomness (cryptographic key generation, simulations), this constant overhead is acceptable.

What if I need rand_n where n is very large compared to k?

Combine many rand_k calls. For rand7 from rand2, you’d need 3 calls of rand2 (8 outcomes; reject 1; uniform 1..7 from 7 outcomes; expected ~3 × 8/7 ≈ 3.43 calls). For larger N, the number of combination calls grows logarithmically.

Why is generating bigger random from smaller harder than the reverse?

Information content. rand5 has log₂(5) ≈ 2.32 bits of entropy per call; rand7 needs log₂(7) ≈ 2.81 bits. To produce 2.81 bits, you need at least one call of rand5 (insufficient) or two calls (4.64 bits, more than enough). The “more than enough” excess is the source of the rejection — you have spare entropy you discard. Going from larger to smaller, you have more entropy than needed; just discard.

See also: Return True with Probability p/qPower of Two CheckProbability Distribution Function

💡Strategies for Solving This Problem

Rejection Sampling

Classic random number generation problem. Got this at Google. It's about using one random source to create another with different properties.

The Problem

Given rand5() which returns random int 0-4 uniformly, implement rand7() which returns random int 0-6 uniformly.

Why It's Tricky

Can't just do rand5() + rand5() or rand5() % 7 - these don't give uniform distribution.

Key Insight: Generate Larger Range

Call rand5() twice to get random number 0-24:

result = rand5() * 5 + rand5()

This gives 25 equally-likely outcomes. Since 25 = 3×7 + 4, use first 21 outcomes for rand7(), reject 22-24.

Rejection Sampling

  1. Generate candidate from larger space
  2. If in valid range, return it
  3. Otherwise, try again

Ensures uniform distribution because we only accept equally-likely outcomes.

Expected Calls

Probability of acceptance: 21/25 = 0.84

Expected calls to rand5(): 2 / 0.84 ≈ 2.38

General Pattern

To implement randN() using randM():

  1. Find k such that M^k >= N
  2. Generate number 0 to M^k - 1
  3. Take largest multiple of N that fits
  4. Reject and retry if outside
Scroll to Top