AI Broke Technical Interviews. Here's How We're Fixing Them.

You are reviewing a candidate's submission.
The README is polished. The code is modular. The tests pass. Then you look closer: the average calculation divides by the wrong variable. The Dockerfile mounts node_modules at runtime. The API returns full database records to the frontend. The README mentions a Dockerfile.test that doesn't exist.
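To make this concrete, here is a hedged sketch of the kind of flaw that survives an unreviewed submission. The names and structure are hypothetical, but the two bugs mirror the ones above: an average divided by the wrong variable, and an endpoint that returns full database records to the frontend.

```typescript
// Hypothetical excerpt from an AI-scaffolded submission (illustrative names).
// The shape is clean and the surrounding tests pass; the flaws are in the details.

interface User {
  id: string;
  email: string;
  passwordHash: string; // internal field that should never reach the client
  scores: number[];
}

// Flaw 1: divides by the number of users instead of the number of scores,
// so the "average" is silently wrong whenever there is more than one user.
function averageScore(user: User, allUsers: User[]): number {
  const total = user.scores.reduce((sum, s) => sum + s, 0);
  return total / allUsers.length; // should be user.scores.length
}

// Flaw 2: returns the full record to the frontend instead of a trimmed
// response object, leaking passwordHash along with everything else.
function toApiResponse(user: User) {
  return user; // should map to { id, email } only
}
```

Neither bug trips a type checker or a superficial test suite; both are obvious the moment a reviewer actually reads the code.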
This is the new normal. AI can scaffold a passing take-home in 15 minutes. Your interview process was not designed for this.
The problem isn't that candidates are using AI
For 20 years, technical interviews had a predictable shape. LeetCode screens tested pattern recognition. Take-homes revealed how someone approached a problem from scratch — the structure they chose, the edge cases they caught, the quality of the README they wrote unprompted.
That ecosystem still exists. But it's no longer measuring what it was designed to measure.
The problem isn't that candidates are using AI. The problem is that they're not reviewing what the AI produces. And the deeper problem is that your evaluation framework can't tell the difference between:
- A candidate who used AI as a tool and reviewed everything carefully
- A candidate who pasted prompts and submitted whatever came back
- A candidate who didn't use AI at all
When all three submit similar-looking code, what exactly are you grading?
The doubts you're now carrying
If you're running technical interviews, you're probably wrestling with some version of these:
| Doubt | Underlying Problem |
|---|---|
| Is the assessment still valid? | If two candidates submit similar code — one who spent three hours thinking and one who spent fifteen minutes prompting — and the code quality is comparable, what exactly is being graded? |
| How much depth is there? | A candidate who can explain every design decision, defend the trade-offs, and spot the flaw the interviewer planted is demonstrating something real. A candidate who cannot explain why the code is structured the way it is has demonstrated nothing. |
| Did they even check this? | A bug that survives because the code was never run reveals blind trust in the AI; a bug that requires domain expertise to catch reveals the limits of the candidate's knowledge. |
| How much of this translates to actual work? | A candidate who can steer an AI to produce a passing submission may or may not be able to do the same thing six months into a role, on a codebase they did not scaffold themselves, with requirements that change mid-sprint. |
| How do I grade this fairly? | Candidates who used AI and reviewed carefully end up indistinguishable from candidates who coded carefully without it. Candidates who used AI carelessly are easy to identify. The middle is murky. |
These questions have no clean answers under the old model.
The open-book exam insight
Here's the mental model that makes this solvable:
Open-book exams divide a class into two groups. The first hears "open book" and thinks: I should know where everything is, practice problems, understand the material well enough to apply it under pressure. The second hears "open book" and thinks: I don't need to prepare at all.
The exam result is usually unambiguous. Open-book tests are harder than closed-book ones — because the questions require judgment, not recall.
AI-assisted interviewing has the same structure.
In an AI-assisted environment, syntax and boilerplate are commodities. If a candidate can pass your take-home by pasting the prompt into a model, the test was measuring the tool, not the candidate.
The goal is not to catch people using AI. It's to design evaluation conditions where AI fluency is necessary but not sufficient — where the candidate who has both judgment and tools is distinguishable from the candidate who has only tools.
AI readiness is tool leverage plus judgment, not tool dependence.
What to measure instead
The shift is from "what code did they produce" to "how did they produce it, and do they own it."
Here's what that looks like at different levels:
| Level | Key Question | Evidence of Competence |
|---|---|---|
| Junior | Do they understand and validate what the AI produced? | They read, run, and test the output, and can explain what it does and why it's structured that way. |
| Mid-level | Did they design the architecture before generating the code? | They plan the structure first, keep decisions consistent across the codebase, and know where to override the AI. |
| Senior | Are they making strategic trade-offs or delegating them to the model? | They can articulate the trade-offs made and alternatives considered; the architecture reflects their thinking, not the AI's defaults. |
The five competencies that matter
When AI is part of the interview, you need to evaluate five things alongside traditional technical skills:
| Competency | Junior | Mid-level | Senior |
|---|---|---|---|
| Task Framing | Articulate what they need before prompting, not just "solve this" | Plan the structure before generating code | Decompose the problem into a coherent sequence of tasks |
| Prompting and Steerability | Iterate when the first output is wrong | Steer with intent rather than brute-force retrying | Know when to reset versus when to refine |
| Output Evaluation | Read and test what the AI produced | Catch logical errors and edge cases | Evaluate security, performance, and maintainability |
| Design Continuity | Files import correctly, functions connect, no orphaned code | Consistency in decisions across the codebase | Architecture reflects their thinking, not the AI's defaults |
| Decision Ownership | Explain what the code does and why it's structured that way | Identify where to override or redirect the AI | Articulate trade-offs made and alternatives considered |
The choice you’re making
AI has changed what technical interviews measure. It hasn't changed the underlying need: to identify who can reason, design, debug, and own decisions under constraint.
You have three options:
1. Ban AI and pretend it doesn't exist. This selects for candidates who can pass your specific format, not candidates who can work in the real world.
2. Allow AI and keep evaluating the same way. This selects for whoever has the best prompt-writing skills, not the best engineering judgment.
3. Redesign your evaluation to measure what matters when AI is available. This selects for candidates who have both the foundation and the fluency.
The third option is harder. It requires rethinking your rubrics, retraining your interviewers, and building evaluation infrastructure that can measure these five competencies systematically.
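As one illustration of what that infrastructure could look like, here is a minimal sketch of a rubric record covering the five competencies, assuming a simple 1-4 scale and hypothetical field names (this is not Fairground's actual schema).

```typescript
// Hypothetical rubric shape for scoring the five competencies (illustrative only).

type Competency =
  | "taskFraming"
  | "promptingAndSteerability"
  | "outputEvaluation"
  | "designContinuity"
  | "decisionOwnership";

interface CompetencyScore {
  score: 1 | 2 | 3 | 4; // e.g. 1 = absent, 4 = consistently demonstrated
  evidence: string;     // interviewer note tied to a concrete moment in the session
}

interface InterviewRubric {
  candidateId: string;
  targetLevel: "junior" | "mid" | "senior";
  competencies: Record<Competency, CompetencyScore>;
  traditionalSignals: CompetencyScore; // coding, debugging, communication, etc.
}

// Scoring the competencies separately from traditional signals keeps
// AI fluency from blurring into engineering judgment.
const example: InterviewRubric = {
  candidateId: "cand-042",
  targetLevel: "mid",
  competencies: {
    taskFraming: { score: 3, evidence: "Wrote out constraints before the first prompt." },
    promptingAndSteerability: { score: 4, evidence: "Reset the session after a bad scaffold instead of patching it." },
    outputEvaluation: { score: 2, evidence: "Missed the wrong divisor in the average calculation." },
    designContinuity: { score: 3, evidence: "Kept one error-handling pattern across modules." },
    decisionOwnership: { score: 2, evidence: "Could not justify the chosen data model over alternatives." },
  },
  traditionalSignals: { score: 3, evidence: "Solid debugging under time pressure." },
};
```

The specific fields matter less than the shape: each competency gets its own score and its own piece of evidence, so "used AI well" and "owned the decisions" are never collapsed into a single number.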
That's why we built Fairground
Fairground is designed around this framework. Every interview evaluates the five competencies alongside traditional technical skills. Candidates code with AI enabled. Interviewers get structured rubrics that separate AI fluency from AI dependency. The scoring layer tells you who has both the foundation and the tools.
If you want to hire engineers who can work in the world as it is becoming, you need assessments where AI helps, but judgment decides.

Get started with Fairground in just a few minutes.
Plug and Play. Works well with your existing ATS.
100 Free Credits


