In an AI-permitted coding interview, the rubric shifts. The interviewer is not grading the code (the AI can produce code); the interviewer is grading the candidate’s interaction with the AI. The interview becomes a prompt engineering test more than a coding test, and the candidate who has spent thousands of hours coding without an AI is not necessarily ahead of the candidate who has spent hundreds of hours pair-programming with one.
This piece is a guide to what interviewers are actually looking for in those interviews. Use it as a study aid before AI-permitted interviews; the prompting habits described here can be deliberately practiced.
The four-part rubric
Across the companies that have written down explicit AI-collaborative coding rubrics, four dimensions show up consistently:
- Prompt clarity. Did the candidate articulate the problem to the AI in a way that produced useful output? Did they iterate the prompt when the first attempt was off?
- Verification rigor. Did the candidate read the AI’s output critically? Did they catch the bugs, or accept the output uncritically?
- Task decomposition. Did the candidate break the problem into AI-sized chunks — small enough to verify, large enough to make meaningful progress?
- Integration and judgment. Did the candidate compose the AI’s output into a coherent solution? Did they decide when to use the AI’s suggestion vs override it with their own approach?
The candidate who scores well on all four does not have to be a prompt engineering virtuoso. They have to be someone who is reliably effective at directing a fast-but-fallible junior collaborator. That description has not changed since the 1990s; what changed is who the junior is.
Prompt clarity in practice
The bad prompt: “Write me a function to sort an array.”
The good prompt: “Write a Python function called sort_intervals that takes a list of (start, end) tuples and returns the same list sorted first by start time, then by end time descending. Handle the edge case of equal start times by preferring the longer interval. Use only the standard library. Add type hints.”
The good prompt is specific about input shape, expected output, edge cases, constraints, and code style. It does not assume the AI knows what you mean. Interviewers grade clarity not because verbose prompts are inherently better (they often are not), but because clear specification is hard, and the ability to do it under pressure correlates broadly with engineering judgment.
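To make the specification concrete, here is one implementation that satisfies the good prompt above. The function name and behavior come from the prompt itself; the implementation is an illustrative sketch, not a canonical answer the AI would necessarily produce:

```python
from typing import List, Tuple

def sort_intervals(intervals: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Sort intervals by start time ascending, then by end time descending.

    For equal start times, end-descending order puts the longer interval
    first -- exactly the edge-case behavior the prompt asked for.
    """
    return sorted(intervals, key=lambda iv: (iv[0], -iv[1]))

# Equal starts (1, 3) and (1, 5): the longer interval comes first.
print(sort_intervals([(1, 3), (1, 5), (0, 2)]))  # [(0, 2), (1, 5), (1, 3)]
```

Notice how every clause of the prompt maps to a line of code. That mapping is what makes the output easy to verify, which is the point of specifying carefully in the first place.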
Verification rigor in practice
This is the dimension where most candidates fail in AI-permitted interviews. The AI produces a function, the candidate accepts it, the function has a subtle bug (an off-by-one, a wrong default, a missing case), and the candidate moves on. The interviewer has watched this happen with AI tools daily for two years; they know exactly what the failure mode looks like.
Strong verification habits to demonstrate visibly:
- Read the AI’s output line by line, narrating what each part does.
- Trace through the logic on a sample input — do not just run it and trust the output.
- Ask the AI explicitly about edge cases (“What happens if the list is empty? What if there are duplicate start times?”) and verify the AI’s claims rather than taking them at face value.
- Run the code with at least one input you constructed yourself, not just the AI’s example inputs. The AI tends to test against the inputs it was thinking of when generating the code.
Senior candidates often verbalize the verification step explicitly: “Let me trace through this with a sample input to make sure the logic is right before we move on.” This is not performative; it shows the interviewer the verification skill in action.
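A minimal version of that habit in code: check the AI's output against inputs you constructed yourself, including the edge cases it did not volunteer. The implementation below is one plausible version of the sort_intervals function from the earlier prompt, assumed here as the code under test:

```python
# The function under test: one plausible implementation of the
# sort_intervals spec from the earlier prompt (assumed for illustration).
def sort_intervals(intervals):
    return sorted(intervals, key=lambda iv: (iv[0], -iv[1]))

# Inputs the candidate constructed -- not the AI's own examples.
assert sort_intervals([]) == []                      # empty list
assert sort_intervals([(2, 4)]) == [(2, 4)]          # single element
# Equal start times: the longer interval (larger end) should come first.
assert sort_intervals([(1, 2), (1, 9)]) == [(1, 9), (1, 2)]
# Duplicate intervals should survive untouched.
assert sort_intervals([(3, 5), (3, 5)]) == [(3, 5), (3, 5)]
```

Writing four assertions takes under a minute and covers exactly the categories listed above: the empty case, the trivial case, the prompt's stated edge case, and a case the prompt never mentioned.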
Task decomposition in practice
The mistake: asking the AI for the entire problem at once. “Write me a complete URL shortener service.” The AI produces 200 lines of code, the candidate cannot effectively verify all of it, and the interviewer cannot tell whether the candidate would have written it themselves.
The right pattern: decompose first, prompt for chunks. “Let’s start with the data model. We need a mapping from short codes to long URLs, with a way to detect collisions. Can you sketch the data structures we’d use?” Then, after that is right: “Now let’s implement the encode function — given a long URL, generate a short code.” And so on.
Each chunk is verifiable in a minute or two. The candidate stays in control. The interviewer can see the candidate’s architectural thinking by which chunks they ask for and in what order.
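The first two chunks of the URL-shortener conversation might come back looking something like this. Everything here is an illustrative sketch: the class name, code length, and hash-based collision strategy are assumptions, not part of the interview problem:

```python
import hashlib

class UrlStore:
    """Chunk 1: the data model -- a dict keyed by short code,
    so code-to-URL lookup is a single dict access."""

    CODE_LEN = 7  # assumed length; not specified by the problem

    def __init__(self) -> None:
        self.code_to_url: dict[str, str] = {}

    def encode(self, long_url: str) -> str:
        """Chunk 2: generate a short code for a long URL,
        re-salting the hash on collision with a different URL."""
        salt = 0
        while True:
            digest = hashlib.sha256(f"{salt}:{long_url}".encode()).hexdigest()
            code = digest[: self.CODE_LEN]
            owner = self.code_to_url.get(code)
            if owner is None or owner == long_url:
                self.code_to_url[code] = long_url
                return code
            salt += 1  # collision with a different URL: retry

    def decode(self, code: str) -> str:
        return self.code_to_url[code]
```

Each chunk is small enough to trace by hand: the data model is one dict, and the collision loop can be verified by asking "what happens when two different URLs hash to the same prefix?" and walking through one iteration.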
Integration and judgment in practice
This dimension shows up most in the moments when the candidate disagrees with the AI. The AI suggests an approach; the candidate has a reason to prefer a different one. The strong candidate articulates that reasoning explicitly:
“The AI suggested using a hash map keyed on the URL, but we want short-code-to-URL lookups to be fast, so I’d flip that. Let me redo this.”
The weak candidate accepts whatever the AI says. The interviewer can tell because the candidate’s solution looks like an AI’s solution — competent but undifferentiated. The strong candidate’s solution looks like an engineer’s solution that happened to be written with an AI’s help.
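The reasoning in that quote can be made concrete with a toy sketch. The data and names below are invented for illustration; the point is only why the keying direction matters:

```python
# Keyed on the long URL (the suggested shape being overridden above):
# answering "what does code abc123 point to?" requires a linear scan.
url_to_code = {"https://example.com/some/long/path": "abc123"}

def resolve_slow(code: str) -> str:
    for url, c in url_to_code.items():  # O(n) over all stored URLs
        if c == code:
            return url
    raise KeyError(code)

# Flipped: keyed on the short code, so the hot path (serving a
# redirect) is a single O(1) dict lookup.
code_to_url = {"abc123": "https://example.com/some/long/path"}

def resolve_fast(code: str) -> str:
    return code_to_url[code]
```

Both shapes store the same information; the judgment call is recognizing which lookup direction the workload actually exercises, and saying so out loud.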
Common mistakes
- Vague prompts. “Help me solve this problem” gives the AI nothing to work with and burns time on iterations.
- No verification. Accepting AI output uncritically. Interviewers read this as “candidate is not actually engineering.”
- Over-decomposition. Asking the AI for ten lines at a time when the natural chunk is fifty lines. Wastes time, signals lack of confidence.
- Under-decomposition. Asking for the whole solution at once. Loses control of the verification.
- Performative AI use. Using the AI to look modern, when the actual problem is small enough that direct coding would be faster. Use the AI when it helps; use yourself when it does not.
- Hostile prompting. Prompts that read as adversarial (“don’t write any extra code, don’t add comments, don’t suggest improvements”) often produce worse output. Friendly, clear, specific prompts work best.
- No self-narration. The interviewer cannot see your thinking through the AI; they need you to narrate why you are asking what you are asking.
How to practice
Three habits to build over the weeks before an AI-permitted interview:
- Pair-program with an AI tool on real engineering work. Pick a project, set a one-hour timer, and try to make meaningful progress entirely through AI direction. Notice where you fumble — that is where to practice.
- Verify aggressively. For one week, do not accept any AI output without tracing through it on a sample input. The verification habit becomes automatic with practice.
- Record yourself. Some candidates film themselves doing AI-collaborative coding and watch the playback. The recording reveals how you sound when you narrate, where your prompts get sloppy, and which habits you have picked up without noticing.
Frequently Asked Questions
Should I use voice prompting or typed prompting?
Typed in 2026 — voice prompting tools are still emerging and most interviewers will not have set up the tooling for voice. Type your prompts in a chat window where the interviewer can see them.
What if the AI gives me code I don’t fully understand?
Slow down. Ask the AI to explain it line by line. If you do not understand the explanation, that is a sign you are using the AI to skip a step you needed to do yourself. Step back, ask the AI to explain the underlying concept first, then revisit the code.
Should I show my prompts to the interviewer or hide them?
Show them. The interviewer is grading the prompt as part of the work. Hiding it would defeat the rubric.
What if the AI is wrong and I do not catch it?
Acknowledge the miss when you find it. “I missed that earlier — the AI’s solution had a bug, and I should have caught it. Let me redo this.” Honest acknowledgment recovers ground; pretending you noticed all along reads as defensive.
How is this different from regular pair programming?
Mostly the same, with two differences: the AI is faster than a human partner, and the AI is more confidently wrong. The first means you can iterate faster; the second means verification matters more.