Boys and Girls

In a country in which people only want boys, every family continues to have children until they have a boy. If they have a girl, they have another child. If they have a boy, they stop. What is the proportion of boys to girls in the country?


Solution

Pretty simple. Half the couples have boys first, and stop. The rest have a girl. Of those, half have a boy second, and so on.

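So suppose there are N couples. Each couple eventually has exactly one boy, so there will be N boys. The expected number of girls is an “infinite” sum: N/2 couples have at least one girl, N/4 have at least two, and so on, giving N/2 + N/4 + N/8 + … As any college math student knows, this sum adds up to N. Therefore, the expected proportion of boys to girls is 1:1. In practice no couple can keep having children forever, but cutting the series short also leaves some couples without a boy, so the expected counts stay equal and a real population will be pretty close to 1:1, apart from random fluctuation.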
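For readers who want the sum spelled out, here is the geometric series behind that step. The symbol m (the number of terms kept in a partial sum) is introduced here for illustration and does not appear in the original solution.

```latex
\[
\sum_{k=1}^{\infty} \frac{N}{2^{k}}
  = N \sum_{k=1}^{\infty} \left(\tfrac{1}{2}\right)^{k}
  = N \cdot \frac{1/2}{1 - 1/2}
  = N,
\qquad
\sum_{k=1}^{m} \frac{N}{2^{k}} = N\left(1 - 2^{-m}\right).
\]
```

The partial sum approaches N very quickly: after m = 10 terms it is already within 0.1% of N.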

Solved by Paul Brinkley

2026 Update: Sampling Bias — The ML Engineer’s Version

The Boys and Girls puzzle illustrates that a stopping rule changes what you observe (here, family sizes) without changing the underlying probability. It is a close cousin of selection bias and survivorship bias, which every ML engineer must internalize, because training data is almost always collected with an implicit stopping rule.

The answer: 50% girls. The stopping rule (stop after first boy) changes family sizes but not the birth probability. At the population level, every birth is still 50/50 regardless of family history.
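The claim that the stopping rule cannot shift the birth ratio can be checked exactly rather than by sampling. The sketch below assumes a hypothetical cap on family size (`max_children`, not part of the original puzzle) and uses exact fractions to compute the expected number of boys and girls per family. The function name `expected_counts` is likewise made up for this example.

```python
from fractions import Fraction


def expected_counts(max_children):
    """Exact expected (boys, girls) per family when a family stops at its
    first boy or after max_children births, whichever comes first."""
    half = Fraction(1, 2)
    exp_boys = Fraction(0)
    exp_girls = Fraction(0)
    p_reaches_birth = Fraction(1)  # probability the family has this birth at all
    for _ in range(max_children):
        exp_boys += p_reaches_birth * half   # this birth is a boy
        exp_girls += p_reaches_birth * half  # this birth is a girl
        p_reaches_birth *= half              # only a girl leads to another birth
    return exp_boys, exp_girls


for cap in (1, 2, 5, 20):
    boys, girls = expected_counts(cap)
    print(cap, boys, girls, boys == girls)
```

Every birth adds the same expected amount to both totals, so the two expectations match for any cap. The stopping rule only controls how many births happen, not their outcomes.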

The ML parallel that interviewers care about in 2026:

  • Survivorship bias: If you train only on completed orders (not abandoned carts), the model never sees the failures, so you systematically underestimate how often a cart is abandoned
  • Length bias: If you sample active users (not churned ones), long-lived users are overrepresented and your model overestimates engagement
  • Feedback loop bias: Your recommender model’s predictions determine what users see, which determines the next training batch. The stopping rule is the model itself

A quick simulation of the puzzle’s stopping rule:

import random

def simulate_family_strategy(num_families=100_000):
    """Stop after first boy. What fraction of children are girls?"""
    boys = girls = 0
    for _ in range(num_families):
        while True:
            if random.random() < 0.5:  # Boy born
                boys += 1
                break
            girls += 1  # Girl born, continue
    return girls / (boys + girls)

print(f"Fraction girls: {simulate_family_strategy():.4f}")  # ~0.5000
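The length-bias bullet above can be illustrated the same way. This is a minimal sketch under stated assumptions, not a model of any real product: users sign up uniformly over a time window, lifetimes are exponentially distributed, and "active" means still subscribed at a single snapshot time. All names and parameter values (`snapshot_lifetime_bias`, `horizon`, `snapshot_time`, and so on) are hypothetical.

```python
import random


def snapshot_lifetime_bias(num_users=100_000, mean_lifetime=10.0,
                           horizon=1_000.0, snapshot_time=900.0, seed=0):
    """Compare the mean lifetime of all users with the mean lifetime of
    users who happen to be active at one snapshot time."""
    rng = random.Random(seed)  # seeded so repeated runs agree
    all_lifetimes = []
    active_lifetimes = []
    for _ in range(num_users):
        signup = rng.uniform(0.0, horizon)
        lifetime = rng.expovariate(1.0 / mean_lifetime)
        all_lifetimes.append(lifetime)
        # A user is in the snapshot only if the snapshot falls inside
        # their lifetime, which favors users with long lifetimes.
        if signup <= snapshot_time < signup + lifetime:
            active_lifetimes.append(lifetime)
    overall_mean = sum(all_lifetimes) / len(all_lifetimes)
    snapshot_mean = sum(active_lifetimes) / len(active_lifetimes)
    return overall_mean, snapshot_mean


overall, snapshot = snapshot_lifetime_bias()
print(f"Mean lifetime, all users:       {overall:.2f}")
print(f"Mean lifetime, snapshot sample: {snapshot:.2f}")
```

Under these assumptions the snapshot sample should show a mean lifetime of roughly twice the true mean, because a user's chance of being caught in the snapshot is proportional to how long they stay. Unlike the family puzzle, here the sampling rule does distort the estimate, which is why the distinction matters.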

Still asked at (2026): Google, Meta, Airbnb (data scientist roles), and any company where ML engineers need to demonstrate awareness of data collection bias.
