Vibe Coding Is Not Secure by Default: What a New Study Tells Us About AI‑Generated Code

January 15, 2026
Insights

Over the last 18 months, “vibe coding” has shifted from online trend to widespread practice. Developers drop a natural‑language request into tools like Cursor, Claude Code, or GitHub Copilot Agents and let an LLM agent implement a feature end‑to‑end.

Surveys now show that about 75% of developers already vibe code, and most are happy with the results.[1] But a new study from Carnegie Mellon University and collaborators asks a simple question:

When AI agents implement “real” features in “real” repositories, how often is the resulting code actually secure?

The answer is uncomfortable: even when the code “works,” it is usually vulnerable.[1]

In this article, we unpack the study's main findings, translate them into practical takeaways for engineering and security teams, and explain how they shape the way we think about guardrails at Symbiotic Security.[1]

1. What the study actually measures

Most AI‑security benchmarks look at toy snippets: a single function or file, one‑shot generated by a model.

This study is different. It focuses on vibe‑coding agents working on real‑world repositories: large codebases, multi‑file changes, real test suites, and security evaluated alongside functional correctness.

Crucially, the benchmark used in the study covers 77 different CWE types, far more than previous benchmarks.[1]

Takeaway #1

The study finally evaluates AI coding agents in settings that look like your real codebase: big repos, multi‑file edits, complex tests, and subtle security bugs.

2. The headline: “Works” does not mean “secure”

The authors evaluate multiple agent frameworks (SWE‑Agent, OpenHands, Claude Code) on top of several frontier models (Claude 4 Sonnet, Kimi K2, Gemini 2.5 Pro). The number that matters most:

Roughly 8 out of 10 “functionally correct” agent‑generated patches are still vulnerable.

And this is not a single‑model issue. Across agents and models, the pattern repeats: functional success is much higher than secure success.

Takeaway #2

If your acceptance criterion is “tests pass” or “it seems to work in staging,” you are almost certainly shipping vulnerabilities when you rely on vibe coding for feature implementation.

3. The types of failures: subtle, realistic, and impactful

The paper includes several case studies that look very familiar to anyone doing AppSec in real systems:

  1. Timing side channel in password verification (Django)
    • The agent re‑implements a helper like verify_password.
    • The insecure version returns early for None or unusable passwords, creating a measurable timing gap between “user exists” and “user does not exist”.
    • Result: user enumeration becomes feasible (see the sketch below).
  2. CRLF injection in HTTP redirects (Buildbot)
    • Redirect URLs are used directly in the Location header with no sanitization.
    • An attacker injects \r\n sequences to add forged headers (e.g., cookies).
    • Result: header injection, cache poisoning, or session fixation.
  3. Unbounded session lifetime (aiohttp_session)
    • Session data is always restored if decryptable.
    • No check that created is within max_age.
    • Result: expired sessions remain valid, defeating session timeout as a control.
  4. Unvalidated external links in CMS (Wagtail)
    • Draft.js link entities become <a href="..."> without URL scheme validation.
    • Result: stored XSS via javascript: URLs.

These are not “weird” synthetic bugs. They are the same classes of issues that red teams and bug bounty hunters exploit in high‑value systems.
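
To make the first two cases concrete, here is a minimal sketch in Python. It is not the code from the paper: the helper names (verify_password_constant_time, safe_redirect_location), the demo hasher, and the static salt are illustrative assumptions, but the contrast between the early‑return pattern and the constant‑time version is exactly the kind of invariant at stake.

```python
# Illustrative sketch only: simplified stand-ins for the Django and Buildbot
# cases above, not the code evaluated in the study.
import hashlib
import hmac
from typing import Optional

def hash_password(password: str, salt: bytes = b"demo-salt") -> bytes:
    # Demo hasher; a real system would use a tuned KDF (argon2, scrypt, bcrypt)
    # with a per-user salt.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

DUMMY_HASH = hash_password("dummy")  # used to equalize work for unknown users

def verify_password_insecure(password: str, stored_hash: Optional[bytes]) -> bool:
    if stored_hash is None:   # early return: "user not found" answers faster,
        return False          # leaking user existence through response timing
    return hash_password(password) == stored_hash  # non-constant-time compare

def verify_password_constant_time(password: str, stored_hash: Optional[bytes]) -> bool:
    # Always run the expensive hash, even for unknown users, and compare in
    # constant time so both branches cost roughly the same.
    candidate = hash_password(password)
    expected = stored_hash if stored_hash is not None else DUMMY_HASH
    return hmac.compare_digest(candidate, expected) and stored_hash is not None

def safe_redirect_location(url: str) -> str:
    # Reject CR/LF before the value ever reaches the Location header, so an
    # attacker cannot smuggle extra headers into the HTTP response.
    if "\r" in url or "\n" in url:
        raise ValueError("invalid characters in redirect target")
    return url
```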

Takeaway #3

AI agents are very good at “making it work” and very bad at respecting the deep invariants that underpin security: constant‑time checks, safe URL handling, strict session lifetime, and so on.
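
The same point applies to the session‑lifetime and link‑validation cases (3 and 4 above). A minimal sketch of those two invariants, again with hypothetical helper names and the assumption that the decrypted session payload carries a created timestamp:

```python
# Illustrative sketch only: the invariants from cases 3 and 4 above,
# not the aiohttp_session or Wagtail code itself.
import time
from typing import Optional
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https", "mailto"}

def is_safe_link(href: str) -> bool:
    # Allowlist URL schemes before rendering <a href="...">, so that
    # javascript: (and other dangerous schemes) never reach the page.
    scheme = urlparse(href).scheme.lower()
    return scheme == "" or scheme in ALLOWED_SCHEMES  # "" keeps relative links

def restore_session(decrypted: dict, max_age: Optional[int]) -> Optional[dict]:
    # "Decryptable" is not enough: the session must also still be within
    # max_age, otherwise expired sessions stay valid forever.
    created = decrypted.get("created")
    if max_age is not None:
        if created is None or time.time() - created > max_age:
            return None  # treat expired or undated sessions as invalid
    return decrypted.get("session", {})
```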

4. “Let’s just prompt it for security” doesn’t work

The authors also test several prompt‑based mitigation strategies, essentially different ways of telling the agent, up front, to write secure code.

Intuitively, you might expect this to help. In practice, it only goes so far, and the reason is important: the agent is being asked to balance functionality against security entirely on its own, with nothing outside the model checking the result.

Takeaway #4

Security “by prompt engineering” hits a ceiling fast. LLMs cannot reliably trade off functionality and security just by being told to “be secure.”

You need external guardrails and checks, not just nicer prompts.

5. Agents and models have different security “blind spots”

Another interesting finding: different LLMs and agent frameworks have different security blind spots. A vulnerability class that one combination of model and scaffold handles reasonably well can slip straight past another.

This suggests that the choice of model and agent framework is itself a security‑relevant decision, not just a productivity one.

Takeaway #5

Security behavior is non‑uniform across models and agents. A single AI coding setup may be particularly blind to certain vulnerability classes in your stack.

6. What this means for engineering leaders

The study is not saying “never use vibe coding.” It is saying:

Using vibe coding as‑is in production, without guardrails, is a security‑incident generator.

Concretely, if you let agents implement features end‑to‑end and merge the result because the tests pass, then you should assume that a meaningful share of those “working” changes carry vulnerabilities like the ones described above.

Takeaway #6

AI‑assisted coding needs system‑level safety, not just “model alignment” or better UX.

You need policies, guardrails, and automated checks around the model and the agent.

7. How this maps to guardrails and MCP‑based workflows

The study focuses on code security, but the implications extend to agentic workflows more broadly, especially with MCP‑style agents that can call tools, touch multiple repositories and services, and act well beyond a single file.

From our point of view at Symbiotic Security, the study reinforces several design principles we already believe in:

  1. Guardrails wrap the whole workflow, not just the model
    • Restrict which repos, branches, services, and tools an agent can touch.
    • Enforce policy checks on planned actions before they are executed.
  2. Security checks must be independent of the agent
    • Detection engines that run outside the LLM.
    • Tests and scans that do not rely on the agent “remembering” to be secure (see the sketch after this list).
  3. Deep remediation is a separate capability
    • Once detection finds issues (in human or AI‑generated code), you need multi‑file, end‑to‑end remediation—again evaluated by independent checks.
  4. Developers remain the final control point
    • Agents can propose patches and plans, but humans must:
      • See the full diff
      • Understand risk signals
      • Approve or reject with context
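
As one way to implement the second principle, here is a minimal sketch of a pre‑merge gate that scans only the files an agent's patch touched. Bandit is used purely as an example of an external detection engine, and the base ref and wiring are assumptions rather than a prescription; any independent SAST or policy check fits the same pattern.

```python
# Hypothetical CI step: run an external scanner on the files an agent changed,
# independently of anything the agent itself claims about security.
import subprocess
import sys

def changed_python_files(base_ref: str = "origin/main") -> list[str]:
    # List Python files modified relative to the base branch.
    out = subprocess.run(
        ["git", "diff", "--name-only", base_ref, "--", "*.py"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in out.stdout.splitlines() if line.strip()]

def scan_changed_files(files: list[str]) -> int:
    if not files:
        return 0
    # Bandit is one example; the point is that the check runs outside the LLM
    # and gates the merge regardless of what the agent "intended".
    return subprocess.run(["bandit", "-q", *files]).returncode

if __name__ == "__main__":
    sys.exit(scan_changed_files(changed_python_files()))
```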

Takeaway #7

The right mental model is: “We do not trust individual AI outputs; we trust the guarded system that surrounds them.”

8. From this study to a secure AI‑era SDLC

This study delivers a clear message to teams scaling AI copilots and coding agents: functional correctness is not security, and better prompting alone will not close the gap.

To move from “experiments” to safe scale, organizations will need guardrails around their agents, security checks that run independently of the model, deep remediation for the issues those checks surface, and developers equipped to review AI output critically.

This is exactly the journey we are working on at Symbiotic Security: bringing guardrails, MCP‑aware checks, deep remediation, and developer‑centric training into a single workflow, so teams can benefit from vibe coding without inheriting its security liabilities.

About the author
Jérôme Robert
CEO, Symbiotic Security
With over 20 years of experience in cybersecurity and 15 years as a CxO, Jérôme has a proven track record of driving successful outcomes. He has been instrumental in five successful exits, including Lexsi (acquired by Orange in 2016) and Alsid (acquired by Tenable in 2021).