The Verification Gap

Last updated: 2026-07-02 · 3 min read

The verification gap is the measured distance between distrust and practice in AI coding: 96% of developers do not fully trust that AI-generated code is functionally correct, yet only 48% always check it before committing (Sonar, State of Code survey 2026). Roughly half of all developers therefore merge code they do not fully trust without always checking it first.

What the survey actually found

Sonar’s State of Code Developer Survey drew an unusually sharp picture because it asked both sides of the same question: how much do you trust AI code, and what do you actually do about it. The answers do not line up - and that mismatch, not any single number, is the finding. Developers are not naive about AI output; they are under-resourced to act on their own skepticism.

Number	What it measures	Source
96%	Developers who do not fully trust AI-generated code to be functionally correct	Sonar, State of Code 2026
48%	Developers who always check AI-assisted code before committing	Sonar, State of Code 2026
61%	Report AI code that 'looks correct but isn't reliable'	Sonar, State of Code 2026
42%	Share of committed code already written by AI (expected: 65% by 2027)	Sonar, State of Code 2026
38%	Say reviewing AI code takes more effort than reviewing a colleague's	Sonar, State of Code 2026
82%	Agree AI helps them code faster	Sonar, State of Code 2026
+98% / +91%	More merged PRs / longer review time per PR in high-AI teams	Faros AI telemetry, 2026
3.1% → 5.7%	Two-week code churn drift 2020–2024 across 211M changed lines	GitClear, 2025
−44%	Fewer AI-code-caused outages in teams with systematic verification	Sonar, State of Code 2026

The key numbers behind the verification gap, each with its source and year - a reference table meant to be quoted with attribution.
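The "roughly half" reading of the first two rows can be checked with a short calculation. The sketch below assumes both percentages describe the same respondent population, which the page implies but does not state. The survey does not publish the overlap itself, so the function `overlap_bounds` (a hypothetical name) only brackets it using inclusion-exclusion.

```python
def overlap_bounds(distrust_pct: float, always_check_pct: float) -> tuple[float, float]:
    """Bounds on the share that distrusts AI code AND does not always check it.

    Assumption: both percentages refer to the same set of respondents.
    """
    not_always_check = 100.0 - always_check_pct
    # Inclusion-exclusion: the two groups must overlap by at least this much.
    lower = max(0.0, distrust_pct + not_always_check - 100.0)
    # The overlap cannot exceed the smaller of the two groups.
    upper = min(distrust_pct, not_always_check)
    return lower, upper


low, high = overlap_bounds(96.0, 48.0)
print(f"Distrusting developers who do not always check: {low:.0f}% to {high:.0f}%")
```

With the survey figures, the share lies between 48% and 52%, which is where "roughly half" comes from.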

Why distrust doesn't translate into checking

The gap is a capacity story. The same developers who distrust the output also report that verifying it is disproportionately hard: reviewing AI code takes more effort than reviewing a colleague’s (38%), and Faros AI telemetry shows review time per PR rising 91% while merge volume nearly doubles. Skepticism without capacity degrades into resignation - the code ships anyway, unchecked, and every such merge adds to the team’s verification debt.
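A back-of-the-envelope calculation shows why capacity gives out. The sketch below assumes that the two telemetry figures (+98% merged PRs, +91% review time per PR) apply to the same teams and compound multiplicatively. That is an illustration, not a number reported by either study, and `review_load_multiplier` is a hypothetical helper.

```python
def review_load_multiplier(pr_increase_pct: float, review_time_increase_pct: float) -> float:
    """Total review effort relative to baseline, if both increases compound.

    Assumption: more PRs and longer review per PR occur on the same team.
    """
    pr_factor = 1 + pr_increase_pct / 100
    time_factor = 1 + review_time_increase_pct / 100
    return pr_factor * time_factor


multiplier = review_load_multiplier(98, 91)
print(f"Total review load: about {multiplier:.2f}x the baseline")
```

Under these assumptions, a team would need nearly four times its previous review effort to keep checking everything, with no matching increase in reviewers.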

The outcome side makes the gap expensive: 61% have seen AI code that looked correct but was not reliable - the failure mode that slips past exactly the kind of quick review the gap produces. And the same dataset contains the comparison: teams with a systematic verification process report 44% fewer AI-code-caused outages.

How teams close the gap

Closing the gap does not mean reviewing harder - it means changing what arrives at review. The working levers, each covered in depth in the methods section:

Written intent per task – machine-checkable specifications give verification a reference the model cannot influence.
A check instead of a feeling – the spec-vs-implementation check turns “looks right” into criterion-by-criterion yes/no.
Measured progress – four metrics show whether the gap actually narrows, starting with the unverified-merge rate.
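To make the second lever concrete, here is a minimal sketch of a criterion-by-criterion check. Everything in it is hypothetical: the `Criterion` type, the `run_checks` helper, and the `slugify` function standing in for AI-written code. It is not the format used by any particular tool. It only shows how written criteria turn "looks right" into a list of yes/no answers.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Criterion:
    description: str            # one written acceptance criterion
    check: Callable[[], bool]   # returns True if the implementation meets it


def run_checks(criteria: list[Criterion]) -> dict[str, bool]:
    """Evaluate each criterion independently and record a yes/no result."""
    return {c.description: bool(c.check()) for c in criteria}


# Hypothetical implementation under review: plausible at a glance.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())


criteria = [
    Criterion("lowercases the title", lambda: slugify("Hello") == "hello"),
    Criterion("joins words with hyphens", lambda: slugify("a b") == "a-b"),
    Criterion("strips punctuation", lambda: slugify("Hi, there!") == "hi-there"),
]

report = run_checks(criteria)
for description, passed in report.items():
    answer = "yes" if passed else "no"
    print(f"{answer:>3}  {description}")
```

The function passes the first two criteria and fails the third. A quick read would likely miss that result, but a written criterion catches it.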

Where Reality Graph fits

Reality Graph exists because of exactly this gap: it makes the check cheap enough that distrust can turn into verification instead of resignation - written task intent, boundary and criteria checks per run, and an evidence report that records what was actually verified.

These numbers tell you

The gap is measured, recent, and large - not anecdote
Capacity, not carelessness, drives the skipping
Verification correlates with 44% fewer AI-code outages
The volume side (42% → 65%) keeps growing

They do not tell you

Your team's own gap - measure it locally
That any single tool closes it - process does
That AI code is worse than human code per se
Anything about surveys after 2026 - check for new waves
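Measuring the gap locally can start small. The sketch below assumes a team already records, per merge, whether the change was AI-assisted and whether it was verified before merging. The `Merge` record, its fields, and the definition of the unverified-merge rate as "unverified AI-assisted merges divided by all AI-assisted merges" are assumptions made for illustration, not definitions from the survey.

```python
from dataclasses import dataclass


@dataclass
class Merge:
    pr_id: int
    ai_assisted: bool   # assumption: the team labels AI-assisted changes
    verified: bool      # assumption: a check was recorded before merge


def unverified_merge_rate(merges: list[Merge]) -> float:
    """Share of AI-assisted merges that went in without a recorded check."""
    ai_merges = [m for m in merges if m.ai_assisted]
    if not ai_merges:
        return 0.0  # no AI-assisted merges, so nothing to measure
    unverified = sum(1 for m in ai_merges if not m.verified)
    return unverified / len(ai_merges)


# Made-up sample data, for illustration only.
sample = [
    Merge(1, ai_assisted=True, verified=True),
    Merge(2, ai_assisted=True, verified=False),
    Merge(3, ai_assisted=False, verified=False),
    Merge(4, ai_assisted=True, verified=False),
    Merge(5, ai_assisted=True, verified=True),
]

rate = unverified_merge_rate(sample)
print(f"Unverified-merge rate: {rate:.0%}")
```

Tracked over time, a falling rate is the local signal that the gap is narrowing, independent of what the next survey wave reports.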


FAQ

How many developers actually verify AI code?

According to Sonar's 2026 State of Code survey, only 48% of developers say they always check AI-assisted code before committing - while 96% say they do not fully trust that AI-generated code is functionally correct. That distance between distrust and verification practice is the verification gap.

How big is the verification gap exactly?

Numerically: 96% distrust against 48% consistent verification leaves roughly half of all developers merging code they do not fully trust without always checking it. On top of that, AI already accounts for about 42% of committed code in the same survey - so the unchecked share applies to a large and growing volume.

Where do these numbers come from?

Primarily from Sonar's State of Code Developer Survey, published in 2026 - a large developer survey accompanied by a press release and a full report. The figures on this page are quoted with their source and year; where a number comes from a different study (Faros AI telemetry, GitClear repository analysis), that source is named inline.

Why do developers skip verification even though they distrust the code?

The survey points to capacity, not laziness: 38% report that reviewing AI code takes more effort than reviewing a colleague's, and telemetry studies show review time per PR rising sharply as AI volume grows. When generation doubles and verification capacity stays flat, skipping becomes the path of least resistance - that mechanism is what verification debt describes.

Does the gap actually cause harm, or is it theoretical?

It shows up in outcome data: 61% of developers report AI code that looks correct but is not reliable, and Sonar's comparison found teams with a systematic verification process were 44% less likely to experience outages caused by AI-generated code. The gap is not an opinion problem - it correlates with production incidents.

Is the gap closing or widening?

The pressure side is growing: developers in the survey expect AI's share of committed code to rise from about 42% toward 65% by 2027. Whether the verification side keeps pace depends on process - the survey shows practices, not destiny. This page states the 2026 numbers with their sources and will be updated when new survey waves are published.
