AI Coding Agents: Notes on the Verification Loop
Fast generation is useful. Blind acceptance is expensive.
I now assume AI output is a draft until proven otherwise.
Minimal loop I rely on
- Generate patch
- Ask "what can break?"
- Test contracts + edge paths
- Add observability where failure would hurt
- Merge only when behavior is explainable
No loop, no trust.
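The loop above can be sketched as a small merge check. This is a hypothetical sketch, not real tooling: the `Patch` fields and `ready_to_merge` helper are invented names standing in for whatever your pipeline actually records.

```python
from dataclasses import dataclass, field

@dataclass
class Patch:
    diff: str
    risks: list = field(default_factory=list)  # answers to "what can break?"
    tests_passed: bool = False                 # contracts + edge paths exercised
    instrumented: bool = False                 # observability added where failure would hurt

def ready_to_merge(patch: Patch) -> bool:
    """Merge only when every step of the loop has actually happened."""
    return bool(patch.risks) and patch.tests_passed and patch.instrumented

# A patch with no risk analysis is still a draft, not a merge candidate.
draft = Patch(diff="...", tests_passed=True, instrumented=True)
assert not ready_to_merge(draft)
```

The point of the sketch is that merge-readiness is a conjunction: skipping any one step of the loop means the answer is no.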
What usually slips through
- happy-path-only logic
- subtle boundary/contract mismatch
- thin error handling
- retries/timeouts not thought through
- "looks right" code that is hard to operate
PR gate that keeps quality up
Before merge, author should answer:
- What invariant must hold?
- Which test checks it?
- What is still untested?
- How will production tell us it is broken?
If these are unclear, the patch is not done.
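The gate can be made mechanical. A minimal sketch, assuming the four answers live in a PR-description dict; the question keys and `gate_passes` function are hypothetical, not part of any real CI system:

```python
# The four pre-merge questions, as required fields of a PR description.
GATE_QUESTIONS = (
    "invariant",    # What invariant must hold?
    "test",         # Which test checks it?
    "untested",     # What is still untested?
    "prod_signal",  # How will production tell us it is broken?
)

def gate_passes(answers: dict) -> bool:
    """The patch is done only when every question has a non-empty answer."""
    return all(answers.get(q, "").strip() for q in GATE_QUESTIONS)

# Two answers out of four: the patch is not done.
assert not gate_passes({"invariant": "ids stay unique", "test": "test_dedup"})
```

Even as a manual checklist, the value is the same: an unanswered question is a blocked merge, not a judgment call.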
Trend signals behind this note
- OpenAI expanded its agent tooling on March 11, 2025, with the "New tools for building agents" announcement.
- The Stack Overflow 2025 Developer Survey's AI section shows heavy AI coding usage alongside persistent concerns about trust and accuracy.
Sticky takeaway
Use AI coding agents inside a verification system, not as a replacement for one.