String to Integer (atoi): The Edge-Case Minefield


Implement atoi: a function that converts a string to a signed 32-bit integer.

This is LeetCode 8, and it is famous for one reason: it looks trivial and is not. The actual specification is a minefield of edge cases that almost nobody handles correctly on the first try. Whitespace, sign characters, leading zeros, overflow, non-numeric trailing characters, empty input, sign-only input, integer boundary values — the function has to handle all of them, and getting any one wrong is a wrong-answer in a unit test. The atoi question is the canonical “specification reading” problem in the interview canon.

The full specification

An atoi function should:

Skip leading whitespace. LeetCode 8 counts only the space character; C’s atoi also skips tabs, newlines, and the other isspace characters.
Read an optional sign character. A single ‘+’ or ‘-’ before the digits. Multiple sign characters are invalid.
Read consecutive digits. 0–9 only. Stop at the first non-digit.
Stop at non-digit characters. Trailing letters, decimal points, anything not 0–9 ends the number.
Clamp to 32-bit signed integer range. Result must be in [−2³¹, 2³¹ − 1]. Anything outside clamps to the nearest boundary.
Handle the empty / sign-only / no-digit cases. Empty input or input with no parseable digits returns 0.

Each of these has a canonical input that people get wrong. “ 42” should return 42 (whitespace skipped). “+-12” should return 0 (multiple signs is invalid). “4193 with words” should return 4193 (trailing words ignored). “-91283472332” should clamp to −2147483648 (below INT_MIN). “ ” should return 0 (no digits). “+” should return 0 (sign only).
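One way to check your reading of the rules before writing the loop is to state them as a single pattern. The sketch below is an illustration, not the canonical solution: the name spec_atoi and the regex are assumptions made for this example, and it treats only the space character as whitespace, as LeetCode 8 does.

```python
import re

INT_MAX = 2**31 - 1
INT_MIN = -2**31

# Leading spaces, at most one sign, then one or more ASCII digits.
# [0-9] is spelled out so that no Unicode digit can match.
_NUMBER = re.compile(r" *([+-]?[0-9]+)")


def spec_atoi(s: str) -> int:
    """Hypothetical reference version: the spec written as a regex."""
    m = _NUMBER.match(s)  # match() anchors at the start of the string
    if m is None:
        return 0  # empty, sign-only, or no digits where they must appear
    # Python ints are unbounded, so clamping after conversion is safe here.
    return max(INT_MIN, min(INT_MAX, int(m.group(1))))


for text in [" 42", "+-12", "4193 with words", "-91283472332", " ", "+"]:
    print(repr(text), "->", spec_atoi(text))
```

A pattern like this is useful as an oracle for testing a hand-written loop, since the two should agree on every input.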

The canonical implementation

def my_atoi(s):
    INT_MAX = 2**31 - 1
    INT_MIN = -2**31

    i, n = 0, len(s)

    # Skip leading whitespace (LeetCode 8 counts only the space character)
    while i < n and s[i] == ' ':
        i += 1

    # Optional sign
    sign = 1
    if i < n and s[i] in ('+', '-'):
        if s[i] == '-':
            sign = -1
        i += 1

    # Read digits
    result = 0
    # Compare against the ASCII range: str.isdigit() also accepts Unicode digits
    while i < n and '0' <= s[i] <= '9':
        result = result * 10 + (ord(s[i]) - ord('0'))
        i += 1

    result *= sign

    # Clamp to 32-bit range
    return max(INT_MIN, min(INT_MAX, result))

About twenty lines, not counting blanks. Looks straightforward. Almost every candidate produces something that looks like this on first attempt. And almost every first attempt has at least one bug.

The bugs candidates introduce

Forgetting to skip whitespace before the sign. The whitespace skip must come first. Some candidates do them in the wrong order or skip whitespace twice.
Allowing multiple signs. “+-12” or “--12”. Without explicit handling (for example, a loop that consumes every sign character it sees), candidates accidentally parse these as valid.
Overflow during accumulation. In languages without arbitrary-precision integers, result * 10 + digit can overflow before the clamp. The overflow check has to happen inside the loop, before the arithmetic. Python avoids this; C++ and Java do not.
Wrong clamp direction for negative overflow. Clamping to INT_MAX for negative overflow gives a positive answer when a negative one is required. Apply the sign before clamping, or use a sign-aware clamp.
Treating decimal point as a digit. “12.5” should return 12, not interpret the ‘.’ as part of the number.
Whitespace after sign. “+ 12” should return 0 (the space breaks parseability after the sign), not 12.
Not handling empty string or sign-only. “” or “+” should return 0; some implementations crash or return weird values.
Locale-specific digit characters. Some implementations use isdigit functions that match Unicode digits beyond 0–9. The spec usually requires 0–9 only.
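The last item is easy to reproduce in Python, where str.isdigit() accepts more than 0–9. The helper name is_ascii_digit below is an assumption for this sketch; the two sample characters are ARABIC-INDIC DIGIT THREE (U+0663) and SUPERSCRIPT TWO (U+00B2).

```python
def is_ascii_digit(ch: str) -> bool:
    """Hypothetical helper: True only for a single character in '0'..'9'."""
    return len(ch) == 1 and "0" <= ch <= "9"


arabic_three = "\u0663"     # ARABIC-INDIC DIGIT THREE
superscript_two = "\u00b2"  # SUPERSCRIPT TWO

for ch in ["7", arabic_three, superscript_two, "."]:
    # str.isdigit() is True for the first three; is_ascii_digit only for "7".
    print(repr(ch), ch.isdigit(), is_ascii_digit(ch))
```

The superscript case is the nastier one: isdigit() accepts it, but int() rejects it with a ValueError, so a loop that pairs isdigit() with int() can crash instead of returning a wrong number.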

What interviewers grade

The atoi question is rarely about the algorithm — there is no algorithm here, just a state machine. The signal layers are:

Did the candidate ask clarifying questions about the spec? A polished candidate, given a vague “implement atoi”, asks about whitespace, signs, overflow handling, and edge cases before writing code. The questions themselves are part of the signal.
Did they explicitly write down the spec they are implementing? Senior candidates often write a comment block listing the cases they will handle, then implement against that list. Junior candidates dive into code and discover edge cases as they hit bugs.
Did they handle each edge case in code? The full set of cases (whitespace, sign, overflow, trailing chars, empty input) should all be addressed.
Did they test on edge cases without prompting? Running through “ 42”, “-2147483649”, “+1234abc”, “ +0 1234” out loud demonstrates rigor.
How did they handle the overflow check? In a language with bounded integers, the overflow check before arithmetic is the cleanest approach. In Python, post-clamp is acceptable.

The bar at FAANG is “produces correct code for all the edge cases the interviewer throws at it within 30 minutes”. Most candidates produce something that handles 70% of the edge cases and then iterate as the interviewer probes for bugs. Iterating cleanly without losing the original structure is part of the signal.

Why this is in the canon

atoi is famous because it is the cleanest example in the interview canon of a problem where reading the specification carefully is the actual skill being tested. There is no clever algorithm; the difficulty is entirely in the case analysis. This makes it a useful filter for engineers who can produce production-quality code (handling edge cases) versus engineers who can solve LeetCode-style problems but cannot ship reliable software.

The inverse of atoi is itoa (integer to string), which is similarly underrated. Edge cases there: zero, negative numbers, INT_MIN (whose absolute value cannot be represented as a positive int in two’s complement), large numbers requiring buffer sizing.
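For comparison, here is a minimal itoa sketch in Python. The name my_itoa is an assumption for this example. Python integers are unbounded, so the INT_MIN hazard does not occur here; the comments mark where a C or Java version would need extra care.

```python
def my_itoa(n: int) -> str:
    """Hypothetical sketch: convert an integer to its decimal string."""
    if n == 0:
        return "0"  # the digit loop below would otherwise emit nothing

    negative = n < 0
    # Safe in Python even for -2**31. With 32-bit ints, negating INT_MIN
    # overflows, so a bounded-integer version would widen to 64 bits first
    # or pull digits out of the negative value directly.
    n = abs(n)

    digits = []
    while n > 0:
        n, d = divmod(n, 10)              # peel off the lowest digit
        digits.append(chr(ord("0") + d))  # digits arrive least significant first
    if negative:
        digits.append("-")
    return "".join(reversed(digits))
```

In a fixed-buffer language, the buffer for a 32-bit value needs room for ten digits, a sign, and a terminator.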

The Joel Spolsky connection

Joel Spolsky’s classic 2006 essay “The Guerrilla Guide to Interviewing (version 3.0)” specifically called out atoi as a question he liked because it tested production-quality code. Spolsky’s argument: any candidate can solve the algorithm; only good engineers handle every edge case. The atoi question filters on the latter.

Spolsky’s framing has been influential. Many tech-interview rubrics still include atoi or atoi-style questions specifically to test edge-case handling, separate from the algorithmic rounds that test pure problem-solving.

Is it asked in 2026?

Sometimes, but less than it was in the 2010s. Modern interview rubrics tend to use newer questions — string-to-integer with specific format requirements, or “validate this URL”, or “parse this CSV row” — that test the same underlying skill (careful spec reading) without using the most-rehearsed question. The bare atoi appears mostly in second-tier tech firms or as a phone screen warmup at FAANG.

The skill the question tests, however, is permanently relevant. Engineers who handle edge cases well ship more reliable software, and any interview format that tests this skill — whether through atoi, parsers, or live debugging — is in active use across the industry.

Frequently Asked Questions

What is the time complexity?

O(n) where n is the length of the input string. Each character is examined at most a constant number of times.

Why is overflow handling so tricky?

In languages with bounded integers (C++, Java), the multiplication result * 10 can overflow before you have a chance to compare against INT_MAX. The cleanest fix is to check the predicate result > INT_MAX / 10 || (result == INT_MAX / 10 && digit > 7) before the multiplication.
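The predicate can be sketched in Python by refusing to let the accumulator exceed INT_MAX, which simulates a 32-bit language. The function name accumulate_clamped and its signature are assumptions for this illustration; it expects a string that already contains only ASCII digits.

```python
INT_MAX = 2**31 - 1
INT_MIN = -2**31


def accumulate_clamped(digits: str, sign: int = 1) -> int:
    """Hypothetical sketch: digits is ASCII 0-9 only, sign is +1 or -1."""
    result = 0
    for ch in digits:
        digit = ord(ch) - ord("0")
        # Check BEFORE the arithmetic, so result * 10 + digit is only
        # computed when it is known to fit in a signed 32-bit integer.
        if result > INT_MAX // 10 or (result == INT_MAX // 10 and digit > 7):
            return INT_MAX if sign == 1 else INT_MIN
        result = result * 10 + digit
    return sign * result
```

The digit > 7 test also covers the negative side: the only magnitude it rejects that a negative number could legally hold is 2147483648, and returning INT_MIN for it gives the correct value anyway.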

Should atoi handle hexadecimal or octal?

The standard LeetCode spec says decimal only. The C standard library atoi behaves the same way. Other library functions (strtol with base 0) handle multiple bases.

How do you handle locale-specific digits?

The spec usually requires ASCII 0–9 only. Use an explicit ASCII range check rather than a locale-aware or Unicode-aware isdigit to be safe.

Is this still a useful interview question?

Yes for the underlying skill (careful spec reading + edge case handling), though the specific question is over-rehearsed. Modern variants test the same skill with newer surface problems.