How Over-Reliance on AI Hurts Young Engineers

Vibe coding can create false velocity and quietly hurt young engineers.

Dec 01, 2025


AI-assisted coding does not usually fail loudly. It fails politely, with code that compiles, passes a couple of happy-path checks, and looks plausible in review. That looks-right surface area is exactly what creates false velocity: teams feel faster while their systems quietly get harder to change, harder to secure, and harder to reason about.

This is the part of the AI debate that matters most. Over-reliance hits early-career engineers first, because they are still building the internal compass that tells them when code is correct, when it is merely not broken yet, and where real risk hides.

[Image: person coding using AI]


TL;DR

  • Security is the most dangerous version of "looks right": Veracode found 45 percent of AI-generated samples introduced OWASP Top 10 vulnerabilities; Java had a 72 percent security failure rate; and XSS defenses failed in 86 percent of relevant samples [1].
  • At the organizational level, speed gains are repaid downstream: Harness reports 45 percent of AI-linked deployments lead to problems, and 72 percent of organizations have had at least one production incident caused by AI-generated code [2].
  • Young engineers do not just lose time. They lose skill acquisition opportunities such as debugging reps, architectural intuition, and security reflexes, and that debt compounds.
  • In a mid-2025 randomized controlled trial, experienced developers were 19 percent slower when using AI tools, yet believed they had been about 20 percent faster and had forecast a roughly 24 percent speedup [3].
  • The 2025 Stack Overflow Developer Survey shows the dominant pain point is "almost right" AI output: 66 percent cite it, and 45.2 percent say debugging AI-generated code is more time-consuming [4][5].

What is actually happening: draft speed rises, decision speed falls

AI massively increases draft throughput. You can go from nothing to working-ish code in seconds. That feels like progress.

But production software is not drafting. It is deciding (the sketch after this list makes the gap concrete):

  • What invariants must hold
  • What happens on failure paths
  • How this behaves under load, concurrency, weird inputs, and partial outages
  • What the threat model is
  • Whether this fits the system architecture
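
A minimal sketch of that gap, in Python. Every name here is hypothetical, invented for illustration; the point is that the second version encodes decisions the first one never made:

```python
from dataclasses import dataclass

@dataclass
class Account:
    balance: int = 0

# Draft-speed version: what an assistant typically produces first.
def apply_credit(account: Account, amount: int) -> None:
    account.balance += amount

# Decision-speed version: invariants and failure paths are explicit.
def apply_credit_safe(account: Account, amount: int,
                      request_id: str, seen: set) -> int:
    if amount <= 0:                     # invariant: credits are positive
        raise ValueError(f"credit must be positive, got {amount}")
    if request_id in seen:              # failure path: a retried request
        return account.balance          # must not double-apply (idempotency)
    seen.add(request_id)
    account.balance += amount
    return account.balance

if __name__ == "__main__":
    acct, seen = Account(), set()
    apply_credit_safe(acct, 100, "req-1", seen)
    apply_credit_safe(acct, 100, "req-1", seen)  # retry is a no-op
    assert acct.balance == 100
```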

When the draft arrives instantly, teams often skip the hardest step: building the reasoning chain that makes code safe to operate. This is why the METR result matters so much. Even experienced developers were misled by the sensation of speed [3].


Why this hits young engineers harder

Early-career engineering is a long apprenticeship in:

  • forming mental models
  • spotting smells
  • learning which shortcuts are fake
  • building debugging stamina

Over-reliance on AI steals those reps. The trap is subtle. Juniors do not think they are avoiding learning. They think:

  • I shipped
  • It works
  • The pull request was approved
  • I am fast now

But speed without understanding is not competence. It is borrowed capability, and borrowed capability gets called in during incidents. The Stack Overflow data reinforces this. Alongside frustration with almost-right code, 20 percent of developers report reduced confidence in their own problem-solving, and 16.3 percent say it is hard to understand how or why AI-generated code works [5]. This marks a shift in professional identity: from builder to operator of a code vending machine.


The looks-right taxonomy: where quality collapses without obvious failure

1. Happy-path correctness, failure-path chaos

AI tends to generate code that demonstrates success quickly. Handling retries, timeouts, partial failures, idempotency, backpressure, and observability is far less template-driven, so it is often thin. The result is software that passes demos and fails in production.
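
A hedged sketch of the difference, using only Python's standard library (the retry count, timeout, and backoff policy are illustrative, not a recommendation):

```python
import time
import urllib.error
import urllib.request

# Happy-path version: passes the demo, hangs or crashes in production.
def fetch(url: str) -> bytes:
    return urllib.request.urlopen(url).read()  # no timeout, no retry

# Failure-path version: explicit timeout, bounded retries, backoff.
def fetch_resilient(url: str, attempts: int = 3, timeout: float = 2.0) -> bytes:
    last_error = None
    for attempt in range(attempts):
        try:
            return urllib.request.urlopen(url, timeout=timeout).read()
        except (urllib.error.URLError, TimeoutError) as exc:
            last_error = exc
            time.sleep(0.1 * 2 ** attempt)  # exponential backoff
    raise RuntimeError(f"{url} failed after {attempts} attempts") from last_error
```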

2. Security that "works"

The most dangerous code succeeds functionally while failing defensively. Veracode’s 2025 GenAI findings are blunt [1]:

  • 45 percent of samples introduced OWASP Top 10 vulnerabilities
  • Java had a 72 percent security failure rate
  • XSS defenses failed in 86 percent of relevant samples

None of these failures are obvious if review focuses only on whether the code runs.
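
A compact Python illustration of how this slips through (the rendering helpers are hypothetical; the same pattern applies in any templating layer):

```python
import html

# Functionally correct, defensively broken: user input is interpolated
# straight into markup, so a <script> payload executes in the browser.
def render_comment_unsafe(author: str, body: str) -> str:
    return f"<p><b>{author}</b>: {body}</p>"

# Same output for honest input, but hostile input is neutralized.
def render_comment(author: str, body: str) -> str:
    return f"<p><b>{html.escape(author)}</b>: {html.escape(body)}</p>"

payload = "<script>steal(document.cookie)</script>"
assert "<script>" in render_comment_unsafe("eve", payload)  # exploitable
assert "<script>" not in render_comment("eve", payload)     # escaped
```

Both functions "run"; only one survives a hostile input, and nothing in a demo distinguishes them.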

3. Architecture rot through duplication and low refactoring

Even when each pull request looks fine, AI nudges codebases toward copy-and-paste sprawl.

GitClear’s analysis of 211 million changed lines between 2020 and 2024 shows refactoring-associated changes fell from roughly 25 percent in 2021 to under 10 percent in 2024, while cloned or duplicated code rose from 8.3 percent to 12.3 percent [6].

This is how working code becomes a system that cannot be safely changed.
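
A toy illustration of the sprawl, with hypothetical handlers: each generated endpoint carries its own copy of the same rule, while the refactored version gives it one home:

```python
# Cloned pattern: each AI-generated handler re-validates inline.
def create_user(payload: dict) -> dict:
    if "@" not in payload.get("email", ""):
        raise ValueError("invalid email")
    return {"action": "create", **payload}

def update_user(payload: dict) -> dict:
    if "@" not in payload.get("email", ""):  # same rule, second copy
        raise ValueError("invalid email")
    return {"action": "update", **payload}

# Refactored: one validator, one place to change when the rule evolves.
def require_email(payload: dict) -> dict:
    if "@" not in payload.get("email", ""):
        raise ValueError("invalid email")
    return payload

def create_user_v2(payload: dict) -> dict:
    return {"action": "create", **require_email(payload)}
```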


The organizational symptom: ship faster, bleed time later

Harness describes a pattern seen across teams: AI speeds up creation, but the bottleneck shifts into integration, QA, security, and incident response, where work is slower, more expensive, and senior-heavy [2].

Governance has not caught up. Cycode reports that 100 percent of surveyed organizations have AI-generated code in production, yet 81 percent lack visibility into where and how AI is used across the SDLC, and 30 percent say AI generates the majority of their code [7].

The result is a perfect storm:

  • more code entering systems
  • less visibility into its origin
  • insufficient verification capacity

DORA data aligns with this pattern. Higher AI adoption correlates with reduced delivery stability, even as throughput increases [8][9].


How this negatively affects young engineers in practice

When vibe coding becomes the default, several patterns emerge:

  1. Output is confused with skill
    Shipping becomes the primary feedback loop, and understanding becomes optional.

  2. Debugging reflexes do not form
    When AI code fails in non-obvious ways, engineers lack the mental models to localize the issue.

  3. Architecture is not learned by doing
    Engineers can add endpoints, but cannot explain why the system is structured as it is or how to change it safely.

  4. Review blindness develops
    AI code reads fluently. Fluency is not correctness. Over time, "looks right" becomes the heuristic.

  5. The career ladder steepens
    If entry-level work becomes generating code, entry-level roles increasingly demand what used to be mid-level judgment: test strategy, threat modeling, and system thinking.


What worked: turning AI from a crutch into a training partner

1. Treat AI output as a draft, not a decision

A simple pull request rule changes behavior. No merge unless the author can explain:

  • invariants
  • top failure modes
  • which tests cover those failure modes
  • why the design fits the system

This forces understanding back into the loop.

2. Make tests and threats the primary prompt

Instead of prompting to write the feature, prompt for:

  • edge cases and failure modes
  • tests that would catch them
  • security risks

Then generate the smallest patch that passes.
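
For instance, prompting for the failure modes of a hypothetical parse_amount helper might yield tests like these first (pytest shown; all names invented for illustration), with the implementation generated only afterward as the smallest patch that passes:

```python
import pytest

# The tests encode the edge cases the prompt asked for; the implementation
# below is the smallest patch that makes them pass.
def parse_amount(raw: str) -> int:
    value = int(raw.strip())  # raises ValueError on garbage input
    if value < 0:
        raise ValueError("amount must be non-negative")
    return value

def test_rejects_negative():
    with pytest.raises(ValueError):
        parse_amount("-5")

def test_rejects_garbage():
    with pytest.raises(ValueError):
        parse_amount("12abc")

def test_accepts_whitespace():
    assert parse_amount(" 42 ") == 42
```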

3. Add guardrails for high-risk domains

For authentication, permissions, payments, and data handling:

  • require a short threat model
  • require a second reviewer
  • require security scanning in CI

These requirements exist because functional correctness is not enough in these areas [1].
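
One lightweight way to enforce the scanning requirement is a small CI gate script. A sketch, assuming the open-source Bandit scanner is installed and the code lives under src/ (both assumptions, not prescriptions):

```python
# ci_security_gate.py: block the merge when the scanner reports findings.
# Assumes `pip install bandit`; Bandit exits non-zero when it finds issues.
import subprocess
import sys

result = subprocess.run(["bandit", "-r", "src/"])
if result.returncode != 0:
    print("security gate: scanner reported findings, blocking merge")
    sys.exit(1)
print("security gate: clean")
```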

4. Run weekly manual-mode drills for juniors

One hour per week:

  • no AI
  • solve a real bug or implement a small change
  • compare with an AI-assisted solution afterward

This preserves the reps that create long-term independence.

5. Measure the right thing: outcome velocity

Track:

  • time from ticket start to production stability
  • AI-related rework
  • incident and rollback rates
  • how often senior engineers are pulled into AI code cleanup

If upstream speed rises while stability drops, velocity did not increase. Costs were simply deferred [9].
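
A toy sketch of the bookkeeping, assuming a simple deployment log (the field names and records are invented for illustration):

```python
from statistics import mean

# Each record: did the change involve AI-generated code, was it rolled
# back, and how many follow-up fix commits did it need?
deployments = [
    {"ai_assisted": True,  "rolled_back": False, "fix_commits": 2},
    {"ai_assisted": True,  "rolled_back": True,  "fix_commits": 4},
    {"ai_assisted": False, "rolled_back": False, "fix_commits": 0},
]

ai = [d for d in deployments if d["ai_assisted"]]
rollback_rate = sum(d["rolled_back"] for d in ai) / len(ai)
avg_rework = mean(d["fix_commits"] for d in ai)
print(f"AI-linked rollback rate: {rollback_rate:.0%}, "
      f"avg rework commits: {avg_rework:.1f}")
```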

The real risk is not that AI writes bad code. It is that AI writes code that feels safe to ship, while early-career engineers lack the calibration to know when it is not. Used well, AI can compress the boring parts and expand learning. Used blindly, it creates a generation of engineers who can produce code but cannot own it.

In production, ownership is the job.


References

  1. Veracode 2025 GenAI Code Security Report
  2. Harness State of AI in Software Engineering
  3. METR AI Productivity Study (2025)
  4. Stack Overflow Developer Survey 2025 – AI
  5. Stack Overflow Developer Survey 2025 – Full Report
  6. GitClear AI Code Quality Research 2025
  7. Cycode Shadow AI Crisis Report
  8. Google Cloud DORA Report 2024 Announcement
  9. DORA Report: Impact of Generative AI in Software Development

