Fair Coin From a Biased Coin: Von Neumann’s Interview-Friendly Trick

You have a biased coin that comes up heads with probability p and tails with probability 1-p, where p is unknown but constant and not 0 or 1. Using only this coin, can you generate a fair coin flip — that is, an event with probability exactly 1/2?

This is Von Neumann’s puzzle, one of the most beautiful interview questions in the probability canon. John von Neumann published the solution in 1951 in a short note titled “Various techniques used in connection with random digits”. The puzzle has been an interview staple at quant firms ever since, and it remains a routine question at Jane Street, Citadel, Two Sigma, Optiver, SIG, and most other quant shops in 2026.

The trick

Flip the biased coin twice. If the result is HT (heads then tails), output “fair heads”. If the result is TH (tails then heads), output “fair tails”. If the result is HH or TT, discard and flip twice again. Repeat until you get HT or TH.

That is the entire algorithm. Three lines. The reason it works is that HT and TH are equally likely for any bias p, even though heads and tails individually are not.
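
A minimal Python sketch of the procedure (the helper names `biased_flip` and `fair_flip` are mine; note that the algorithm takes `p` only to drive the simulated coin and never uses it in its logic):

```python
import random

def biased_flip(p: float) -> str:
    """Simulated biased coin: 'H' with probability p, 'T' otherwise."""
    return "H" if random.random() < p else "T"

def fair_flip(p: float) -> str:
    """Von Neumann's trick: flip pairs until the two flips differ."""
    while True:
        first, second = biased_flip(p), biased_flip(p)
        if first != second:
            return first  # HT -> fair heads, TH -> fair tails
```

Wrapped around a physical coin whose bias is unknown, the loop body would be identical: flip twice, keep the first flip if the pair is discordant, otherwise retry.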

The proof

The probability of HT in two independent flips is P(H) × P(T) = p(1-p). The probability of TH is P(T) × P(H) = (1-p)p. These are exactly equal: p(1-p) = (1-p)p. So conditional on the outcome being HT or TH, each occurs with probability 1/2.

The HH and TT outcomes are not equally likely (they have probabilities p² and (1-p)² respectively, which differ unless p = 1/2), but we discard them. By discarding the asymmetric outcomes and keeping only the symmetric pair, we extract a fair flip from a biased source.

The trick is essentially a rejection sampler that exploits the symmetry of the off-diagonal joint outcomes.

Expected number of flips

For each pair of biased flips, we keep the result with probability 2p(1-p) (HT or TH) and reject with probability p² + (1-p)² = 1 - 2p(1-p). This is a geometric process; the expected number of pairs to one accept is 1 / (2p(1-p)), so the expected number of biased flips per fair flip is:

E[flips per fair coin] = 2 / (2p(1-p)) = 1 / (p(1-p))

For p = 1/2, this is 4 — a fair coin takes 4 expected flips to produce a fair flip via Von Neumann’s method, which is wasteful given the input is already fair. For p = 0.1 or p = 0.9, it is 1 / 0.09 ≈ 11.1 flips per fair flip. As p approaches 0 or 1, the expected number of flips diverges, since most paired outcomes will be HH or TT.
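
The expectation is easy to check empirically. A short simulation (helper name `flips_to_fair` is mine) comparing the measured flip count against 1/(p(1-p)):

```python
import random

def flips_to_fair(p: float, rng: random.Random) -> int:
    """Count the biased flips consumed to produce one fair flip."""
    flips = 0
    while True:
        a, b = rng.random() < p, rng.random() < p
        flips += 2
        if a != b:
            return flips

rng = random.Random(42)
for p in (0.5, 0.9):
    trials = 50000
    avg = sum(flips_to_fair(p, rng) for _ in range(trials)) / trials
    print(f"p={p}: measured {avg:.2f}, predicted {1 / (p * (1 - p)):.2f}")
```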

What the question tests

Quant interviewers ask this puzzle for three reasons:

  1. Recognition of symmetry. The HT/TH equivalence is not obvious if you have not seen the trick. A candidate who derives it under interview pressure has demonstrated probabilistic insight.
  2. Independence reasoning. The proof requires recognizing that the probability of a pair outcome factors into the product of the marginal probabilities, because the flips are independent. A candidate who fumbles this step is fumbling a fundamental skill.
  3. Rejection sampling intuition. The trick is the simplest example of a rejection sampler. Understanding it is a stepping stone to MCMC, importance sampling, and many other Monte Carlo techniques used in quant practice.

Variations interviewers ask

  • “Generate a fair k-sided die from a biased coin.” The standard answer uses Von Neumann pairs to extract bits, then groups bits into binary representations of die outcomes (with rejection for over-large values). Tests composition of techniques.
  • “Generate a biased coin with probability q from a fair coin.” The standard answer flips fair coins to extract a binary expansion of q and accepts/rejects based on the comparison. Inverse direction; same machinery.
  • “What if you do not know p?” The Von Neumann method does not require knowing p. This is part of why it is elegant — the algorithm produces a fair flip without ever estimating the underlying bias.
  • “Can you do better than 1/(p(1-p)) flips per fair flip?” Yes — the optimal entropy extraction approaches 1 / H(p) flips per fair flip, where H(p) is the binary entropy. For p close to 0.5, this is a small improvement; for very biased coins it is enormous. The optimal algorithms (Peres’ method, advanced extractors) are research-grade and rarely required in interviews.
  • “Generate a uniform random number in [0, 1] from a biased coin.” Generate fair bits via Von Neumann, then construct the binary expansion of a number in [0, 1]. Tests that the candidate sees the bridge from coin flips to continuous distributions.
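
The k-sided die variation composes directly with the basic trick. A hedged sketch (function names are mine): extract fair bits via von Neumann pairs, group ⌈log₂ k⌉ of them into an integer, and reject values that land at or above k:

```python
import math
import random

def fair_bit(p: float, rng: random.Random) -> int:
    """One fair bit from a p-biased coin via the von Neumann pair trick."""
    while True:
        a, b = rng.random() < p, rng.random() < p
        if a != b:
            return int(a)

def fair_die(k: int, p: float, rng: random.Random) -> int:
    """Uniform value in {0, ..., k-1}: group fair bits, reject overflow."""
    nbits = math.ceil(math.log2(k))
    while True:
        x = 0
        for _ in range(nbits):
            x = (x << 1) | fair_bit(p, rng)
        if x < k:
            return x
```

For k = 6 this uses 3 fair bits per attempt and rejects the values 6 and 7, so each attempt succeeds with probability 6/8.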

The Peres improvement

Yuval Peres in 1992 showed that Von Neumann’s method is suboptimal in terms of bits extracted per flip. Peres’ iterative algorithm extracts more fair bits from the same input by recursively processing the discarded HH and TT outcomes (which still contain entropy, just not in the original symmetry). For a biased coin, Peres’ method approaches the information-theoretic limit of H(p) fair bits per flip.
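
A compact recursive rendering of Peres’ extractor (my own sketch of the published recursion, not production code): the von Neumann bits come out at each level, while the XOR stream and the first bits of the discarded concordant pairs are fed back in for further extraction:

```python
import random

def peres(bits: list[int]) -> list[int]:
    """Peres' recursive extractor: fair bits from a biased 0/1 sequence.

    Level one is plain von Neumann; the XOR stream and the
    concordant-pair stream are then recycled recursively.
    """
    if len(bits) < 2:
        return []
    out, xors, concordant = [], [], []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:
            out.append(a)          # von Neumann output bit
        else:
            concordant.append(a)   # HH/TT pairs still carry entropy
        xors.append(a ^ b)
    return out + peres(xors) + peres(concordant)
```

On a long p = 0.7 input this yields noticeably more fair bits than plain von Neumann, approaching H(0.7) ≈ 0.88 bits per flip as the input grows.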

In an interview, mentioning Peres unprompted signals that the candidate has read deeply into the topic. Most interviewers do not require it; a clean Von Neumann derivation is the bar.

What a clean answer looks like

For a quant interview, the polished answer takes about two minutes:

  1. “Flip the coin twice. If the outcome is HT, output ‘heads’. If TH, output ‘tails’. If HH or TT, discard and try again.”
  2. “This works because P(HT) = P(H)P(T) = p(1-p), and P(TH) = P(T)P(H) = (1-p)p. These are equal regardless of p, so conditional on a non-discarded outcome, each is 1/2.”
  3. “The expected number of biased flips per fair flip is 1 / (p(1-p)). This is bounded for any p in (0, 1) but diverges as p approaches 0 or 1.”
  4. “The method does not require knowing p, which is why it is robust. There are improvements like Peres’ algorithm that extract more bits per flip, but Von Neumann’s is the canonical and simplest.”

Is it asked in 2026?

Yes, regularly at quant firms and probability-heavy data science interviews. The Von Neumann trick is well-known but the variations (k-sided die, biased coin from fair coin, optimal extraction) give the interviewer plenty of follow-up material. A candidate who has seen Von Neumann but not Peres will eventually hit a follow-up they cannot answer, which preserves the question’s signal even at quant firms where every candidate has heard of the basic version.

Tech interviews rarely ask this specifically. ML engineering roles occasionally have a related question about sampling from non-uniform distributions, but the framing is usually applied (Gumbel softmax, importance sampling) rather than the pure coin-flip puzzle.

Frequently Asked Questions

What if p = 0.5? Does the trick still work?

Yes, but it wastes flips. For a fair coin, Von Neumann’s method takes 4 expected flips to produce one fair flip, when you could just use the coin directly. The method is unnecessary but not incorrect when p = 0.5.

What if p is unknown?

This is the elegance of the method — it does not require knowing p. The HT/TH symmetry holds for any p in (0, 1), so the algorithm produces a fair flip without estimation.

What is the information-theoretic limit?

For a biased coin with bias p, the entropy is H(p) = -p log₂ p - (1-p) log₂(1-p) bits per flip. The optimal extractor produces H(p) fair bits per flip on average. Von Neumann’s method achieves p(1-p) fair bits per flip (one bit per accepted pair, two flips per pair), which is strictly less than H(p) for every p; even at p = 0.5 it extracts only 1/4 bit per flip. Peres’ algorithm approaches the optimum.
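
The gap is easy to tabulate. A quick comparison (assuming base-2 logarithms, the convention for bits; function names are mine):

```python
import math

def binary_entropy(p: float) -> float:
    """H(p) in bits: the rate an optimal extractor approaches."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def von_neumann_rate(p: float) -> float:
    """Fair bits per biased flip for von Neumann: one bit per accepted
    pair, pairs accepted with probability 2p(1-p), two flips per pair."""
    return p * (1 - p)

for p in (0.5, 0.9, 0.99):
    print(f"p={p}: H(p)={binary_entropy(p):.3f}, "
          f"von Neumann rate={von_neumann_rate(p):.3f}")
```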

Can you generate a fair coin from a deterministic biased coin?

No. The deterministic case is degenerate: if every flip is heads (or every flip is tails), there is no entropy to extract. Von Neumann’s method requires p strictly between 0 and 1.

Is this the simplest example of a rejection sampler?

Yes, in the discrete case. The structure — propose, accept on a symmetry condition, reject and retry — is the prototype for more general rejection sampling and Monte Carlo techniques. The puzzle is often used as the motivating example in graduate-level probability courses.
