Return True with Probability p/q: Sampling Without Bias

“Given a function that returns a uniform random value, write a function that returns True with probability p/q.” This is a foundational probability / programming interview question that tests your understanding of uniform random sampling and your ability to convert between random sources. Variants include “given rand10, generate rand7,” “given a biased coin, simulate a fair coin,” and “implement weighted random selection.” This guide covers the core technique, the standard variants with working code, and the common pitfalls.

Problem Statement

Given a function random() that returns a uniform float in [0, 1), implement a function that returns True with probability p/q and False with probability 1 - p/q.

Examples:

  • p = 1, q = 2 → fair coin (50/50)
  • p = 1, q = 4 → True 25% of the time
  • p = 3, q = 7 → True ~42.86% of the time

Approach 1: Direct Comparison

If you have random() returning a uniform value in [0, 1), comparing against the threshold p/q works directly.

import random

def biased_bool(p: int, q: int) -> bool:
    """Return True with probability p/q."""
    return random.random() < p / q


# Test (run many times to verify distribution)
true_count = sum(biased_bool(3, 7) for _ in range(100_000))
print(true_count / 100_000)  # roughly 0.4286

Complexity: O(1) per call. The standard answer when uniform random() is available.

Variant 1: Generate rand_k from rand_n

Given rand7() (returns 1–7 uniformly), implement rand10(). Or vice versa.

Easy direction: smaller from larger

Given rand7 → rand5: just call rand7, retry if > 5.

def rand5_from_rand7(rand7) -> int:
    """Generate uniform 1..5 from rand7."""
    while True:
        x = rand7()  # 1..7
        if x <= 5:
            return x

Expected calls: 7/5 = 1.4 (from geometric distribution). Some calls “waste” 6 and 7 outcomes; the retry loop ensures uniformity.

Hard direction: larger from smaller

Given rand5 → rand7: combine multiple rand5 calls. The standard trick is to compute (rand5() - 1) * 5 + rand5(), which gives a uniform 1..25. Then map 1..21 to 1..7 (each value of rand7 corresponds to 3 values), and retry if the result is 22..25.

def rand7_from_rand5(rand5) -> int:
    """Generate uniform 1..7 from rand5."""
    while True:
        x = (rand5() - 1) * 5 + rand5()  # uniform 1..25
        if x <= 21:
            return ((x - 1) % 7) + 1

Expected calls: 2 × (25/21) ≈ 2.38 calls to rand5 per call to rand7.
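A quick empirical check helps confirm the mapping stays uniform. This sketch simulates rand5 with random.randint (an assumption for the demo; in an interview you'd be handed rand5 as a black box) and repeats the function above so the snippet runs standalone:

```python
import random

def rand5():
    return random.randint(1, 5)  # stand-in for the black-box generator

def rand7_from_rand5(rand5):
    while True:
        x = (rand5() - 1) * 5 + rand5()  # uniform 1..25
        if x <= 21:
            return ((x - 1) % 7) + 1

counts = [0] * 8  # index 0 unused
for _ in range(70_000):
    counts[rand7_from_rand5(rand5)] += 1
print(counts[1:])  # each bucket roughly 10_000
```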

Variant 2: Fair Coin from Biased Coin (Von Neumann)

Given a biased coin that lands heads with probability p (unknown), produce a fair 50/50 outcome.

def fair_from_biased(biased) -> bool:
    """Von Neumann's trick: pair flips, accept HT or TH."""
    while True:
        first = biased()
        second = biased()
        if first != second:
            return first  # HT → heads, TH → tails; each pair has probability p(1-p)

Why this works: Pairs (H,T) and (T,H) each have probability p(1-p) — equal. Pairs (H,H) and (T,T) get rejected (retry). Result: 50/50 fair coin.

Expected calls: 2 / (2p(1-p)). For p = 0.5 (already fair), expected 4 calls. For p near 0 or 1, much higher.
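To see the trick in action, here's a sketch with an assumed 80/20 coin (heads = True):

```python
import random

def biased():
    return random.random() < 0.8  # heads 80% of the time (assumed for the demo)

def fair_from_biased(biased):
    while True:
        first, second = biased(), biased()
        if first != second:
            return first

heads = sum(fair_from_biased(biased) for _ in range(100_000))
print(heads / 100_000)  # roughly 0.5 despite the 80/20 coin
```

With p = 0.8, the acceptance probability per pair is 2 × 0.8 × 0.2 = 0.32, so each fair flip costs about 2/0.32 ≈ 6.25 biased flips on average.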

Variant 3: Weighted Random Selection

Given an array of weights, return index i with probability weights[i] / sum(weights). (LeetCode #528)

import bisect
import random

class Solution:
    def __init__(self, weights: list[int]):
        # Cumulative weights
        self.total = sum(weights)
        self.cumulative = []
        running = 0
        for w in weights:
            running += w
            self.cumulative.append(running)

    def pick_index(self) -> int:
        """O(log n) per call."""
        target = random.random() * self.total
        return bisect.bisect_left(self.cumulative, target)


# Test
sol = Solution([1, 3, 2])  # weights
counts = [0, 0, 0]
for _ in range(100_000):
    counts[sol.pick_index()] += 1
print(counts)  # roughly [16667, 50000, 33333]

Complexity: O(n) preprocessing, O(log n) per call. The standard answer for weighted selection.

Variant 4: Reservoir Sampling

Given a stream of unknown length, select k elements uniformly at random. Standard technique: keep the first k elements; for the i-th element after that (1-indexed position i > k), keep it with probability k/i, replacing a uniformly random member of the reservoir.

import random

def reservoir_sample(stream, k: int) -> list:
    """Sample k elements uniformly from stream of unknown length."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            j = random.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir

Useful when the stream is too large to fit in memory or unknown in length. Each element ends up with probability k/n where n is the (unknown) total length.
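A quick sanity check (repeating the function so the snippet runs standalone): sampling k = 3 from a stream of 10 items, every item should land in the reservoir about 30% of the time.

```python
import random

def reservoir_sample(stream, k):
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            j = random.randint(0, i)  # i+1 equally likely slots
            if j < k:
                reservoir[j] = item
    return reservoir

# Each of 0..9 should appear in the k=3 sample with probability 3/10.
hits = [0] * 10
for _ in range(50_000):
    for x in reservoir_sample(range(10), 3):
        hits[x] += 1
print([h / 50_000 for h in hits])  # each roughly 0.3
```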

Common Mistakes

  • Bias from non-uniform mapping. Mapping rand7 (values 1–7) to {0, 1} via mod 2 gives 3/7 zeros and 4/7 ones — biased, because four of the seven values are odd. Always retry on out-of-range outcomes to maintain uniformity.
  • Forgetting the retry loop. Direct mapping that “wraps around” introduces bias. The retry loop preserves uniformity at the cost of extra calls.
  • Confusing rand_n() = “1 to n” with “0 to n-1”. Conventions vary. Clarify with the interviewer; adjust formulas accordingly.
  • Bad expected-call analysis. “Some calls retry” doesn’t mean infinite loop; the retry probability is bounded, so expected calls are finite. Compute via geometric distribution: expected calls = 1 / acceptance probability.
  • Using floor/ceiling incorrectly. When mapping a uniform [0, 1) value to discrete outcomes, off-by-one errors can shift the distribution. int(random() * n) gives 0..n-1 uniformly; int(random() * n) + 1 gives 1..n uniformly.
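The first two pitfalls are easy to see empirically. This sketch (simulating rand7 with random.randint, an assumption for the demo) compares the naive mod-2 mapping against a retry-based fair bit:

```python
import random

def rand7():
    return random.randint(1, 7)  # stand-in for the black-box generator

# Naive: four of {1..7} are odd, so parity is biased toward 1.
naive_ones = sum(rand7() % 2 for _ in range(70_000))
print(naive_ones / 70_000)  # roughly 4/7 ≈ 0.571 — biased

# Fixed: discard 7, then take the parity of the uniform 1..6.
def fair_bit():
    while True:
        x = rand7()
        if x <= 6:
            return x % 2

fair_ones = sum(fair_bit() for _ in range(70_000))
print(fair_ones / 70_000)  # roughly 0.5
```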

Frequently Asked Questions

What’s the expected interview answer for “rand7 from rand5”?

Compute (rand5() - 1) * 5 + rand5() to get uniform 1..25. Accept if ≤ 21; map to 1..7 via mod. Otherwise retry. Expected ~2.38 calls per output. Walk through the uniformity reasoning: 25 outcomes / 7 buckets = 3 each + 4 leftover that we discard.

Why not just (rand5() – 1) % 7 + 1 to map?

That gives incomplete, biased output. rand5 returns values 1..5, so (rand5() – 1) % 7 is just 0..4 — the outputs 6 and 7 can never occur. Mod operations on inputs that don't cover a full multiple of the target range are always biased. Always use the “scale up, accept-or-retry” pattern.

How does Von Neumann’s fair-coin trick work?

Pair consecutive flips. The pair “HT” has probability p(1-p); the pair “TH” has probability (1-p)p. They’re equal, so accepting one or the other based on which comes first gives 50/50. Pairs “HH” and “TT” get rejected (retry). The bias of the coin doesn’t matter because the trick exploits the symmetry of pair probabilities.

What’s the time complexity of weighted random selection?

O(n) preprocessing to compute cumulative weights, O(log n) per query via binary search. For a small number of weights or rare changes to weights, this is optimal. For frequently-changing weights, consider segment trees or alias methods (constant time per query at the cost of O(n) reconstruction).
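For reference, here is a sketch of the alias method mentioned above (Vose's variant; function names are illustrative). Building the tables is O(n); each draw then costs one array lookup plus one biased coin flip:

```python
import random

def build_alias(weights):
    """Vose's alias method: O(n) preprocessing into (prob, alias) tables."""
    n = len(weights)
    total = sum(weights)
    prob = [w * n / total for w in weights]  # scaled so the average is 1
    alias = [0] * n
    small = [i for i, p in enumerate(prob) if p < 1.0]
    large = [i for i, p in enumerate(prob) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        alias[s] = l                    # s's slack is filled by l
        prob[l] += prob[s] - 1.0        # l donated (1 - prob[s]) of its mass
        (small if prob[l] < 1.0 else large).append(l)
    return prob, alias

def pick(prob, alias):
    """O(1) per draw: pick a column uniformly, then flip one biased coin."""
    i = random.randrange(len(prob))
    return i if random.random() < prob[i] else alias[i]

prob, alias = build_alias([1, 3, 2])
counts = [0, 0, 0]
for _ in range(60_000):
    counts[pick(prob, alias)] += 1
print(counts)  # roughly [10000, 30000, 20000]
```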

Why is reservoir sampling useful?

It samples k elements from a stream without knowing the total length. Memory: O(k). Critical for large datasets that don’t fit in memory or for online algorithms where the stream is infinite. Used in many production data pipelines (log sampling, A/B test cohort selection).

See also: Generate Random Numbers · Two Sum · Power of Two Check

💡Strategies for Solving This Problem

Continued Fractions and Approximation

This problem appeared in a quant-firm interview. It asks you to approximate a decimal as a fraction within a given tolerance, and tests your understanding of numerical methods and algorithmic precision.

The Problem

Convert a decimal to a fraction: 0.25 → 1/4. With tolerance: 0.24 with tolerance 0.01 still gives 1/4 (since |0.25 − 0.24| ≤ 0.01).

Naive Approach: Brute Force

Try all denominators from 1 to some limit. For each, find best numerator. Check if within tolerance.

O(n²) if you scan numerators and denominators, or O(n) if you round x·q to get the best numerator for each denominator q directly. Works, but slow compared with continued fractions.
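A minimal version of the brute force (hypothetical function name; rounding x·q gives the best numerator for each q, so this is the O(n) variant):

```python
def to_fraction_bruteforce(x, tol, max_den=1000):
    """Smallest denominator whose best fraction is within tol of x."""
    for q in range(1, max_den + 1):
        p = round(x * q)  # best numerator for this denominator
        if abs(x - p / q) <= tol:
            return p, q
    return None  # no fraction within tolerance up to max_den

print(to_fraction_bruteforce(0.25, 1e-9))  # (1, 4)
print(to_fraction_bruteforce(0.24, 0.02))  # (1, 4)
```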

Better: Continued Fractions

Use Euclidean algorithm to find best rational approximation. Produces sequence of "convergents" that are optimal approximations.

For π = 3.14159...:

  • 3/1
  • 22/7 (error 0.0013)
  • 333/106
  • 355/113 (error 0.00000027!)

Each convergent is the best approximation for its denominator size.

Algorithm Steps

  1. Start with value v
  2. Integer part: floor(v)
  3. Fractional part: v - floor(v)
  4. Invert fractional part: 1 / frac
  5. Repeat until frac < tolerance or denominator too large
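The steps above can be sketched as follows (a minimal version; the standard library's fractions.Fraction.limit_denominator does something similar):

```python
def to_fraction_cf(x, tol, max_den=10**6):
    """Walk the continued-fraction convergents of x until one is within tol."""
    h_prev, h = 1, int(x)  # convergent numerators   h_{-1}, h_0
    k_prev, k = 0, 1       # convergent denominators k_{-1}, k_0
    frac = x - int(x)
    while abs(x - h / k) > tol and k <= max_den and frac > 1e-12:
        frac = 1.0 / frac  # invert the fractional part
        a = int(frac)      # next continued-fraction term
        frac -= a
        # standard recurrence: c_n = a_n * c_{n-1} + c_{n-2}
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
    return h, k

print(to_fraction_cf(0.25, 1e-9))               # (1, 4)
print(to_fraction_cf(3.141592653589793, 1e-4))  # (333, 106)
```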

Key Insight

Each step of Euclidean algorithm gives next convergent. These are provably optimal approximations.

Scroll to Top