Generate rand7 from rand5: Rejection Sampling for Random Sources

Generate Random Numbers in a Range Using a Smaller Random Source

“Given rand5() that returns a uniform integer 1–5, implement rand7() that returns 1–7.” This is a classic interview question, closely related to LeetCode #470 (Implement Rand10() Using Rand7()), that tests probability intuition and the rejection sampling pattern. Naive approaches (summing two rand5 calls, or taking a result mod 7 without rejection) introduce bias; the correct approach uses rejection sampling to preserve uniformity. This guide covers the standard rand_k → rand_n problem family with working code, the rejection-sampling rationale, and the variations.

The Core Problem

Given a function rand5() that returns a uniformly random integer in {1, 2, 3, 4, 5}, implement rand7() that returns a uniformly random integer in {1, 2, 3, 4, 5, 6, 7}. You may not use any random functions other than rand5.

Approach 1: Combine Two rand5 Calls

Two rand5 calls can produce 5 × 5 = 25 distinct outcomes. Map them to a uniform 1..25 distribution, then carve out the 1..7 range with rejection.

def rand7(rand5) -> int:
    """Generate uniform 1..7 using rand5. Average ~2.38 calls to rand5."""
    while True:
        x = (rand5() - 1) * 5 + rand5()  # uniform 1..25
        if x <= 21:
            return ((x - 1) % 7) + 1


# Demo with a real rand5
import random

def rand5_real():
    return random.randint(1, 5)

# Test distribution
counts = [0] * 8  # 0..7, ignoring index 0
for _ in range(70000):
    counts[rand7(rand5_real)] += 1
print(counts[1:])  # roughly [10000] * 7

Why this works: The expression (rand5() - 1) * 5 + rand5() uses the first call as the “tens digit” (in base 5) and the second call as the “ones digit,” producing all integers 1..25 with equal probability.

Why retry on x > 21: 25 outcomes don’t divide evenly by 7. We need 21 = 3 × 7 outcomes for uniform mapping. The 4 outcomes 22..25 are rejected (retry). 21 outcomes / 7 buckets = 3 each, giving uniform 1..7.
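One way to check this mapping without sampling is to enumerate all 25 ordered pairs exactly. This is a minimal sketch; the helper name rand7_outcome_counts is made up for this illustration.

```python
from collections import Counter
from itertools import product


def rand7_outcome_counts() -> Counter:
    """Count how many of the 25 (first, second) pairs map to each output 1..7."""
    counts = Counter()
    for first, second in product(range(1, 6), repeat=2):
        x = (first - 1) * 5 + second  # 1..25, each pair gives a distinct x
        if x <= 21:
            counts[((x - 1) % 7) + 1] += 1
        else:
            counts["rejected"] += 1
    return counts


counts = rand7_outcome_counts()
print(counts)
```

Every output 1..7 is reached by exactly 3 pairs, and the remaining 4 pairs are rejected.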

Expected calls: Probability of acceptance = 21/25 = 0.84. Expected calls = 2 × (25/21) ≈ 2.38 to rand5 per call to rand7.
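The 2.38 figure can also be checked empirically. The sketch below counts rand5 calls with a wrapper; the function name average_rand5_calls, the seed, and the trial count are assumptions chosen for the illustration, and a seeded random.Random stands in for a true rand5.

```python
import random


def average_rand5_calls(trials: int = 20_000, seed: int = 0) -> float:
    """Estimate the mean number of rand5 calls per rand7 output by simulation."""
    rng = random.Random(seed)  # seeded stand-in for a true rand5 source
    calls = 0

    def rand5() -> int:
        nonlocal calls
        calls += 1
        return rng.randint(1, 5)

    for _ in range(trials):
        while True:
            x = (rand5() - 1) * 5 + rand5()  # uniform 1..25
            if x <= 21:
                break  # accepted; the rand7 value would be ((x - 1) % 7) + 1
    return calls / trials


print(average_rand5_calls())  # should land near 50/21, about 2.38
```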

Why Naive Approaches Fail

Naive 1: rand5() + rand5() (sums to 2..10)

The sum is NOT uniform. P(sum = 2) = 1/25; P(sum = 6) = 5/25. Distribution is triangular, not uniform.

Naive 2: rand5() % 7

rand5() returns 1..5. Mod 7 gives 1..5 (each with probability 1/5). Doesn’t cover 6 and 7 at all.

Naive 3: rand5() * 2 - 1

Gives 1, 3, 5, 7, 9 each with probability 1/5. Doesn’t cover 2, 4, 6 at all, and 9 is outside the target range.

The correct approach must produce all 7 outputs with equal probability. Only by combining multiple rand5 calls and using rejection sampling can you achieve true uniformity.
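The three naive attempts above can be checked exactly by enumerating every equally likely outcome. This is a sketch; exact_distribution and the variable names are illustrative, not part of the original problem.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

FACES = range(1, 6)  # the five equally likely rand5 results


def exact_distribution(outcomes) -> dict:
    """Turn a list of equally likely outcomes into exact probabilities."""
    outcomes = list(outcomes)
    counts = Counter(outcomes)
    return {value: Fraction(n, len(outcomes)) for value, n in sorted(counts.items())}


naive_sum = exact_distribution(a + b for a, b in product(FACES, repeat=2))
naive_mod = exact_distribution(a % 7 for a in FACES)
naive_scale = exact_distribution(a * 2 - 1 for a in FACES)

for name, dist in [("sum", naive_sum), ("mod", naive_mod), ("scale", naive_scale)]:
    print(name, {value: str(p) for value, p in dist.items()})
```

The sum peaks at 6 with probability 5/25, the mod version never produces 6 or 7, and the scaled version only produces odd values.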

Variant: rand5 from rand7 (Easier Direction)

The reverse problem is much simpler:

def rand5_from_rand7(rand7) -> int:
    """Generate uniform 1..5 from rand7. Just retry on 6, 7."""
    while True:
        x = rand7()
        if x <= 5:
            return x

Expected calls: 7/5 = 1.4. Going from a larger to a smaller random source is always easier (just retry on out-of-range outcomes); going from smaller to larger requires combining calls.

Variant: Generate Uniform 1..N from rand_k

Generalize the technique: combine multiple rand_k calls until you have ≥ N possible outcomes; reject any that exceed the largest multiple of N within the combined range.

def rand_n_from_rand_k(rand_k, k: int, n: int) -> int:
    """Generate uniform 1..n using rand_k."""
    # Find the smallest m such that k^m >= n
    m = 0
    range_size = 1
    while range_size < n:
        m += 1
        range_size *= k
    # range_size is k^m
    accept_threshold = (range_size // n) * n  # largest multiple of n <= range_size

    while True:
        # Combine m calls to rand_k, base k
        x = 0
        for _ in range(m):
            x = x * k + (rand_k() - 1)
        x += 1  # convert 0..k^m - 1 to 1..k^m
        if x <= accept_threshold:
            return ((x - 1) % n) + 1

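This works for any k ≥ 2 and n ≥ 1, including cases where n is not a power of k. Note that the smallest m minimizes calls per round but not always the expected total: for k = 2, n = 5, three calls give 3 × 8/5 = 4.8 expected calls, while four calls give 4 × 16/15 ≈ 4.27.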

Variant: Equal-Probability Selection from a List

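Given a list of N items and a uniform random source, select one item uniformly. Trivially: draw an index with rand_n() and return the item at that index. The selection is exactly as uniform as the index, which is the abstract mapping behind all of these problems.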
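As a sketch of that mapping, assuming rand_n() returns 1..N like the other functions in this guide (pick_uniform is a hypothetical helper name, and random.randint stands in for a rand_n built from a smaller source):

```python
import random
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def pick_uniform(items: Sequence[T], rand_n: Callable[[], int]) -> T:
    """Return one item, assuming rand_n() is uniform on 1..len(items)."""
    if not items:
        raise ValueError("cannot pick from an empty sequence")
    index = rand_n() - 1  # convert the 1-based draw to a 0-based index
    return items[index]


colors = ["red", "green", "blue"]
# random.randint stands in for a rand_n built from a smaller source.
choice = pick_uniform(colors, lambda: random.randint(1, len(colors)))
print(choice)
```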

Variant: Random Without Bias on a Biased Coin (Von Neumann)

Given a coin landing heads with probability p (unknown, with 0 < p < 1), simulate a fair coin. Pair consecutive flips; accept HT or TH (mapping HT → heads, TH → tails). Reject HH and TT and flip a fresh pair. HT and TH each occur with probability p(1 − p), regardless of p, so the result is unbiased. This assumes the flips are independent.

def fair_from_biased(biased_coin) -> bool:
    """Return True with probability 0.5 from biased coin."""
    while True:
        first = biased_coin()
        second = biased_coin()
        if first != second:
            return first
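The cost of the trick depends on how biased the coin is. A small exact calculation, as a sketch (von_neumann_stats is an illustrative name, and independent flips are assumed):

```python
from fractions import Fraction


def von_neumann_stats(p: Fraction) -> dict:
    """Exact per-pair probabilities for a coin with P(heads) = p, 0 < p < 1."""
    q = 1 - p
    p_ht = p * q          # heads then tails -> output heads
    p_th = q * p          # tails then heads -> output tails
    p_accept = p_ht + p_th
    return {
        "p_output_heads": p_ht / p_accept,   # always 1/2
        "p_accept": p_accept,
        "expected_flips": 2 / p_accept,      # two flips per attempted pair
    }


stats = von_neumann_stats(Fraction(9, 10))
print(stats)
```

For a coin with p = 9/10 the output is still exactly fair, but it takes 100/9 ≈ 11.1 flips on average to produce one fair bit.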

Common Pitfalls

Modular bias. Taking a mod directly after combining introduces bias whenever the combined range is not a multiple of n; reject the leftover range first.
Forgetting the rejection loop. Without rejection, “wrap around” introduces bias. The loop ensures uniformity.
Inefficient combinations. Using more calls than necessary increases expected runtime. Two calls of rand5 give 25 outcomes (sufficient for rand7); three would give 125 (overkill).
Confusing 0-indexed vs 1-indexed conventions. Some rand_n functions return 0..n-1, others 1..n. Adjust the offset arithmetic carefully.
Missing the rejection efficiency analysis. Strong candidates state expected calls explicitly. The number of rounds follows a geometric distribution: acceptance probability = (largest multiple) / (total), expected rounds = 1 / acceptance, and expected calls = (calls per round) / acceptance.
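The efficiency analysis can be written once for any k and n. This sketch assumes the scheme that uses the fewest calls per round; expected_source_calls is a hypothetical helper name.

```python
from fractions import Fraction


def expected_source_calls(k: int, n: int) -> Fraction:
    """Expected rand_k calls per rand_n output when using the fewest calls per round."""
    if k < 2 or n < 1:
        raise ValueError("need k >= 2 and n >= 1")
    calls_per_round, range_size = 0, 1
    while range_size < n:              # smallest power of k that covers n
        calls_per_round += 1
        range_size *= k
    accepted = (range_size // n) * n   # largest multiple of n within the range
    acceptance = Fraction(accepted, range_size)
    return calls_per_round / acceptance  # rounds are geometric: mean 1/acceptance


print(expected_source_calls(5, 7))  # 50/21
print(expected_source_calls(2, 7))  # 24/7
print(expected_source_calls(7, 5))  # 7/5
```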

Frequently Asked Questions

What’s the expected interview answer for rand7 from rand5?

Combine two rand5 calls to get uniform 1..25. Reject 22..25 (retry). Map 1..21 to 1..7 via mod. Expected 2.38 calls. Walk through why the combination is uniform (base-5 digits) and why the rejection preserves uniformity (the 21 = 3 × 7 accepted outcomes are equally likely, and each output 1..7 receives exactly 3 of them).

Can I do this without a retry loop?

Not exactly. No scheme that uses a bounded number of rand5 calls can be exactly uniform: m calls give 5^m equally likely outcomes, and no power of 5 is a multiple of 7, so the outcomes cannot be split into 7 equal groups. Any “always accept” scheme therefore introduces bias. Rejection sampling is the canonical correct approach. Variants that reuse rejected outcomes lower the expected number of calls, but they still need a retry loop with no fixed upper bound.

How efficient is the rand7 from rand5 implementation?

Expected 2 × (25/21) ≈ 2.38 calls to rand5 per output. The acceptance probability is 21/25 on every round, independent of earlier rounds, so the number of rounds is geometric and the loop terminates with probability 1. On average the overhead is a small constant factor, which is acceptable for most uses such as simulations; the running time has no fixed upper bound, though, so the algorithm is not constant-time.

What if I need rand_n where n is very large compared to k?

Combine many rand_k calls. For rand7 from rand2, you’d need 3 calls of rand2 (8 outcomes; reject 1; uniform 1..7 from 7 outcomes; expected ~3 × 8/7 ≈ 3.43 calls). For larger N, the number of combination calls grows logarithmically.
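The rand7-from-rand2 case looks like this as a sketch. It assumes rand2() returns 0 or 1 with equal probability (a convention chosen here for the illustration), and random.randint stands in for the real bit source.

```python
import random
from typing import Callable


def rand7_from_rand2(rand2: Callable[[], int]) -> int:
    """Uniform 1..7 from a source assumed to return 0 or 1 with equal probability."""
    while True:
        x = 0
        for _ in range(3):          # three bits -> uniform 0..7
            x = x * 2 + rand2()
        if x < 7:                   # reject the single leftover outcome, 7
            return x + 1


print(rand7_from_rand2(lambda: random.randint(0, 1)))
```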

Why is generating bigger random from smaller harder than the reverse?

Information content. rand5 has log₂(5) ≈ 2.32 bits of entropy per call; rand7 needs log₂(7) ≈ 2.81 bits. One call of rand5 is not enough; two calls supply 4.64 bits, more than enough. That excess is the source of the rejection: you have spare entropy and discard some of it. Going from larger to smaller, a single call already has more entropy than needed, so you just discard the surplus.
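To put numbers on the entropy argument, here is a rough sketch that compares the bits consumed by the two-call rejection scheme with the bits one rand7 output carries. The "efficiency" ratio is just an illustrative measure, not a standard named quantity from the problem.

```python
import math

bits_per_rand5 = math.log2(5)        # about 2.32
bits_per_rand7 = math.log2(7)        # about 2.81
expected_calls = 2 * 25 / 21         # about 2.38 rand5 calls per rand7 output

bits_consumed = expected_calls * bits_per_rand5
efficiency = bits_per_rand7 / bits_consumed

print(f"consumed {bits_consumed:.2f} bits to emit {bits_per_rand7:.2f} bits")
print(f"entropy efficiency: {efficiency:.0%}")
```

Roughly half of the entropy drawn from rand5 ends up in the output; the rest is discarded by the rejection step and the mod mapping.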

💡Strategies for Solving This Problem

Rejection Sampling

Classic random number generation problem. Got this at Google. It's about using one random source to create another with different properties.

The Problem

Given rand5() which returns random int 0-4 uniformly, implement rand7() which returns random int 0-6 uniformly. (This section uses the 0-indexed convention, unlike the 1-indexed version above.)

Why It's Tricky

Can't just do rand5() + rand5() or rand5() % 7 - these don't give uniform distribution.

Key Insight: Generate Larger Range

Call rand5() twice to get random number 0-24:

result = rand5() * 5 + rand5()

This gives 25 equally-likely outcomes. Since 25 = 3×7 + 4, use the first 21 outcomes (0-20) for rand7(), reject 21-24.

Rejection Sampling

Generate candidate from larger space
If in valid range, return it
Otherwise, try again

Ensures uniform distribution because we only accept equally-likely outcomes.

Expected Calls

Probability of acceptance: 21/25 = 0.84

Expected calls to rand5(): 2 / 0.84 ≈ 2.38

General Pattern

To implement randN() using randM():

Find k such that M^k >= N
Generate number 0 to M^k - 1
Take largest multiple of N that fits
Reject and retry if outside

✅Solution

Solution: Rejection Sampling

function rand5() {
    // Returns random integer 0-4 uniformly
    return Math.floor(Math.random() * 5);
}

function rand7() {
    // Generate number 0-24 using two calls to rand5()
    let num;

    do {
        num = rand5() * 5 + rand5();  // 0-24
    } while (num >= 21);  // Reject 21-24

    // Map 0-20 to 0-6
    return num % 7;
}

// Test distribution
function testDistribution() {
    const counts = Array(7).fill(0);
    const trials = 70000;

    for (let i = 0; i < trials; i++) {
        counts[rand7()]++;
    }

    console.log("Distribution test (should be ~10000 each):");
    counts.forEach((count, i) => {
        console.log(`  ${i}: ${count} (${(count/trials*100).toFixed(1)}%)`);
    });
}

testDistribution();

General Solution: randN from randM

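function makeRandN(randM, M, N) {
    // Find how many randM() calls one round needs: smallest k with M^k >= N.
    // Computed once here rather than on every randN() call.
    let k = 1;
    let range = M;
    while (range < N) {
        k++;
        range *= M;
    }

    // Find largest multiple of N that fits in range
    const maxValid = Math.floor(range / N) * N;

    return function randN() {
        let num;
        do {
            // Build a base-M number from k digits: uniform in 0..range-1
            num = 0;
            for (let i = 0; i < k; i++) {
                num = num * M + randM();
            }
        } while (num >= maxValid);  // Reject the leftover tail

        return num % N;
    };
}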

// Create rand7 from rand5
const rand7FromMake = makeRandN(rand5, 5, 7);
console.log(rand7FromMake());  // Random 0-6

Reverse: rand5 from rand7

function rand7() {
    return Math.floor(Math.random() * 7);
}

function rand5FromRand7() {
    let num;

    do {
        num = rand7();
    } while (num >= 5);  // Reject 5, 6

    return num;
}

// Expected calls: 7/5 = 1.4

More Efficient: Reuse Rejected Values

function rand7Efficient() {
    // Use remainder from rejection
    let num = rand5() * 5 + rand5();  // 0-24

    if (num < 21) {
        return num % 7;
    }

    // num is 21-24 (4 values)
    // Use as start for next attempt
    num = (num - 21) * 5 + rand5();  // 0-19

    if (num < 14) {
        return num % 7;
    }

    // num is 14-19 (6 values)
    num = (num - 14) * 5 + rand5();  // 0-29

    if (num < 28) {
        return num % 7;
    }

    // Very rare, just retry
    return rand7Efficient();
}

// Expected calls: about 2.215 (exactly 1380/623), versus about 2.38 for the plain version
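The expected number of rand5() calls for rand7Efficient can be worked out exactly from the three stages. This is a sketch in Python using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# Probabilities of falling through each stage of rand7Efficient.
p_miss_1 = Fraction(4, 25)   # first candidate lands in 21..24
p_miss_2 = Fraction(6, 20)   # second candidate lands in 14..19
p_miss_3 = Fraction(2, 30)   # third candidate lands in 28..29, full restart

# E = 2 + p1 * (1 + p2 * (1 + p3 * E)); collect the E terms and solve.
constant = 2 + p_miss_1 * (1 + p_miss_2)
coefficient = p_miss_1 * p_miss_2 * p_miss_3
expected_calls = constant / (1 - coefficient)

print(expected_calls, float(expected_calls))  # 1380/623, about 2.215
```

Reusing the rejected values saves roughly 0.17 calls per output compared with 50/21 ≈ 2.38.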

Step-by-Step Example

Call 1: rand5() = 3
Call 2: rand5() = 4
num = 3 * 5 + 4 = 19

Is 19 < 21? Yes
Return 19 % 7 = 5 ✓

Another example:
Call 1: rand5() = 4
Call 2: rand5() = 2
num = 4 * 5 + 2 = 22

Is 22 < 21? No, retry

Call 3: rand5() = 1
Call 4: rand5() = 0
num = 1 * 5 + 0 = 5

Is 5 < 21? Yes
Return 5 % 7 = 5 ✓

Why Uniform?

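Each result 0-6 is produced by exactly 3 of the accepted values 0-20 (listed as result: accepted value):

0: appears at 0
1: appears at 1
...
6: appears at 6
0: appears at 7
...
6: appears at 13
0: appears at 14
...
6: appears at 20

Values 0-6 each appear 3 times in 0-20, and the 21 accepted values are equally likely.
After modulo 7, each of 0-6 therefore has probability 3/21 = 1/7.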

Complexity Analysis

Average calls to rand5(): 2 / (21/25) ≈ 2.38
Worst case: Unbounded (theoretically infinite, practically rare)
Space: O(1)

Common Mistakes

rand5() + rand5(): Not uniform (more likely to get middle values)
rand5() % 7: Not uniform (only ever returns 0-4; 5 and 6 never appear)
Not rejecting: Must reject to ensure uniformity
Wrong rejection range: Must use largest multiple that fits

Probability Distribution Comparison


// WRONG: rand5() + rand5()
// 0 appears only when both return 0 (1/25 chance)
// 4 appears 5 ways: (0,4), (1,3), (2,2), (3,1), (4,0) (5/25 chance)
// NOT UNIFORM!

// CORRECT: rejection sampling
// Each of 0-6 has exactly 3/21 chance
// UNIFORM!

Follow-Up Questions

Q: Generate random float [0, 1) from rand5()?
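A: Treat successive calls as base-5 digits: rand5()/5 + rand5()/25 + rand5()/125 + ... Stopping after d digits gives a uniform value on a grid of 5^d points in [0, 1).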

Q: Generate rand49() from rand7()?
A: rand7() * 7 + rand7() gives 0-48, no rejection needed!

Q: What if rand5() is expensive?
A: Use efficient version that reuses rejected values
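The float follow-up can be sketched as below. It assumes rand5() returns 0..4 (this section's convention) and stops after a fixed number of base-5 digits, so the result is uniform on a finite grid rather than on every float; rand_float and the default of 20 digits are choices made for this illustration.

```python
import random
from typing import Callable


def rand_float(rand5: Callable[[], int], digits: int = 20) -> float:
    """Approximate a uniform float in [0, 1) from `digits` base-5 digits.

    Assumes rand5() returns 0..4, matching this section's convention.
    """
    value = 0
    for _ in range(digits):
        value = value * 5 + rand5()   # append one base-5 digit
    return value / 5 ** digits        # value is at most 5**digits - 1


print(rand_float(lambda: random.randrange(5)))
```

Building the integer first and dividing once avoids accumulating floating-point rounding from many small additions.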
