Vibe Coding and the Rise of Security Debt
You’ve seen it happen: a developer describes a feature in plain English, hits enter in their AI assistant, and seconds later, working code appears. It compiles. It runs. It ships.
This is vibe coding—a shorthand for using large language models (LLMs) to generate functional code based on intent, often skipping design docs, threat modeling, and security review. It feels efficient. It moves the needle. But beneath the surface, it’s accelerating a new form of infrastructure debt: security debt, quietly embedding itself in production systems.
At Eleven11, we audit infrastructure for engineering-led startups at Series A through C. We don’t advise from the outside—we run our own tooling first, validate it in real conditions, then bring those insights to clients. What we’re seeing now isn’t just faster development. It’s a shift in how risk enters systems—with AI-generated code acting as an unmonitored vector.
The data supports this: AI-generated code introduces vulnerabilities at a higher rate than human-written code. Veracode’s 2025 GenAI Code Security Report found that nearly 45% of AI-generated code samples contained OWASP Top 10 vulnerabilities. Other studies place that number as high as 53% in enterprise environments. More concerning? The trend isn’t improving. Despite model updates, vulnerability rates remain stagnant.
This isn’t a condemnation of AI. It’s a warning about process gaps.
Why Functionality Masks Insecurity
LLMs are trained on public code—GitHub repos, Stack Overflow threads, documentation. That corpus includes both secure and insecure patterns. But the models are optimized for correctness, not security. They aim to produce code that works, not code that’s safe.
When asked to “build a login API,” an LLM won’t prompt you to consider rate limiting, session expiration, or credential stuffing protections. It generates what it’s seen before: a working endpoint, often with weak or missing safeguards.
Common results include:
- SQL injection from unsanitized inputs
- Cross-site scripting (XSS) due to improper output encoding
- Hardcoded credentials in source files
- Insecure defaults, like permissive CORS or missing HSTS
- Broken access controls, where authorization logic is incomplete or absent
These aren’t rare edge cases. They’re common outputs—especially when prompts lack explicit security constraints.
And because vibe coding often happens outside formal workflows—on internal tools, scripts, or quick integrations—these flaws bypass review and land in production unchecked.
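To make the first item on that list concrete, here is a minimal, self-contained sketch (using Python's stdlib `sqlite3`, an illustrative stand-in for any database driver) of the unsanitized-input pattern LLMs often emit, next to the parameterized version a review should demand:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

def find_user_unsafe(name: str):
    # Typical LLM output: user input interpolated straight into SQL.
    # A payload like "' OR '1'='1" turns the WHERE clause into a tautology.
    return conn.execute(f"SELECT * FROM users WHERE name = '{name}'").fetchall()

def find_user_safe(name: str):
    # Parameterized query: the driver treats the value as data, not SQL,
    # so the injection payload matches nothing.
    return conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall()

payload = "' OR '1'='1"
print(len(find_user_unsafe(payload)))  # 1 — every row leaks
print(len(find_user_safe(payload)))    # 0 — payload matches no user
```

Both functions "work" for the `.txt`-style happy path a developer is likely to test, which is exactly why the flaw survives a quick manual check.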
The Hidden Attack Surface
A vulnerable script in isolation is low risk. The same script, connected to production data, becomes a liability.
As Retool has observed, the danger of AI-generated code scales with integration. A dashboard pulling from staging data becomes a threat when it’s granted access to customer databases. A script using test credentials becomes critical when it’s issued a service account with real permissions.
We’ve audited companies where developers used AI to build internal tools—dashboards, automation scripts, API wrappers—that pulled from core services, authenticated via long-lived tokens, and exposed endpoints with minimal validation. None were in the original roadmap. All were built quickly. All introduced new attack surfaces.
This is Shadow Code: not unauthorized software, but unreviewed, unvetted code, generated on-demand, deployed without oversight, and linked to critical systems.
Unlike legacy tech debt, which often shows up as performance issues or deployment failures, security debt from AI-generated code is silent. It doesn’t break the build. It doesn’t trigger alerts. It waits—sometimes for months—until exploited.
Static Analysis Falls Short
You might think: We scan dependencies. We use Snyk. We run static analysis.
That’s necessary—but insufficient.
SAST tools are good at finding known patterns: hardcoded secrets, dependency CVEs, common injection flaws. But they miss semantic vulnerabilities—especially those rooted in business logic.
Example: an AI generates a discount engine that applies promotions based on user roles. The code compiles. No secrets. No SQLi. But the logic allows a user to manipulate query parameters and apply admin-only discounts.
This is a business logic flaw. It won’t show up in a static scan. It requires dynamic testing—observing how the system behaves under real conditions.
Similarly, authorization bypasses, race conditions, and insecure direct object references (IDOR) are often invisible to static tools. They only surface when the system is exercised like an attacker would.
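A compressed sketch of that discount-engine flaw (the function names and discount tiers are hypothetical, chosen only to illustrate the pattern): the vulnerable version reads the role from client-controlled parameters, while the fix sources it from the server-side session. Neither version contains secrets or SQL, so a static scan has nothing to flag.

```python
DISCOUNTS = {"admin": 0.50, "user": 0.05}

def apply_discount_flawed(price: float, params: dict) -> float:
    # AI-generated pattern: the role comes from the request parameters,
    # so any user can pass ?role=admin and claim the admin-only discount.
    role = params.get("role", "user")
    return price * (1 - DISCOUNTS.get(role, 0))

def apply_discount_fixed(price: float, params: dict, session_role: str) -> float:
    # Authorization input comes from the authenticated session,
    # never from client-controlled data.
    return price * (1 - DISCOUNTS.get(session_role, 0))

attacker_params = {"role": "admin"}  # tampered query string
print(apply_discount_flawed(100.0, attacker_params))         # 50.0 — admin discount leaked
print(apply_discount_fixed(100.0, attacker_params, "user"))  # 95.0 — held to the user tier
```

The bug only appears when you exercise the endpoint with adversarial input, which is why it belongs to dynamic testing, not static analysis.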
As one researcher put it: “The live application becomes the proving ground for its security.” That’s a dangerous way to validate code handling real data.
The Missing Feedback Loop
Here’s how vibe coding typically unfolds:
- Developer asks: “Write a function to process user uploads.”
- LLM returns code that saves files to disk.
- Developer tests with a `.txt` file. It works.
- Code is committed, deployed.
- Later, an attacker uploads a `.php` file and executes remote code.
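The steps above can be sketched in a few lines. This is an illustrative example (the allowlist and directory are assumptions, not a complete upload policy — a real handler also needs size limits, content-type checks, and randomized filenames):

```python
import tempfile
from pathlib import Path

ALLOWED_EXTENSIONS = {".txt", ".png", ".pdf"}   # illustrative allowlist
UPLOAD_DIR = Path(tempfile.mkdtemp())

def save_upload_unsafe(filename: str, data: bytes) -> Path:
    # Typical vibe-coded version: trusts the client-supplied name,
    # so "shell.php" lands on disk ready to be served and executed.
    path = UPLOAD_DIR / filename
    path.write_bytes(data)
    return path

def save_upload_safer(filename: str, data: bytes) -> Path:
    # Drop any directory components, then enforce an extension allowlist.
    name = Path(filename).name
    if Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"rejected file type: {name}")
    path = UPLOAD_DIR / name
    path.write_bytes(data)
    return path

save_upload_unsafe("shell.php", b"<?php system($_GET['c']); ?>")  # accepted silently
try:
    save_upload_safer("shell.php", b"...")
except ValueError as exc:
    print(exc)  # rejected file type: shell.php
```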
Where was the security check? Nowhere.
No threat model. No input validation. No review. The developer assumed correctness equaled safety.
And because vibe coding is conversational, with developers iterating by pasting code back into the model, errors can compound. The model optimizes for what the developer asks—not what they should be asking.
Databricks found that even with security-focused prompts, vulnerability rates remain high, especially for complex logic. LLMs lack an intrinsic understanding of least privilege, defense in depth, or secure defaults.
They reflect patterns—not principles.
Infrastructure Debt, Reimagined
At Eleven11, we define infrastructure debt as the gap between how systems are built and how they should be built to withstand real-world stress. It spans five vectors: reliability, scalability, security, observability, and team structure.
Vibe coding is widening that gap—especially in security.
We’ve seen startups with dozens of AI-generated microservices, each built by a different engineer, each using slightly different patterns, each with its own security blind spots. When we run our Dhara security audit engine across these environments, the findings are consistent: repeated vulnerabilities across services, all traceable to similar LLM-generated templates.
This isn’t just a developer issue. It’s an engineering leadership issue.
What Engineering Leaders Should Do
If you’re a CTO or VP of Engineering at a Series A-C company, you’re under pressure to move fast. Vibe coding feels like a force multiplier. But unchecked, it becomes a liability.
Here’s how to respond—without sacrificing velocity.
1. Treat AI-Generated Code Like Third-Party Code
You wouldn’t pull a random package from npm into production without review. Don’t treat AI-generated code differently.
Establish a code acceptance policy requiring:
- Security review for any AI-generated code touching production
- Manual validation of input handling, auth, and access controls
- Dependency checks—LLMs often suggest outdated or vulnerable libraries
- Architecture alignment—does this fit the system’s security model?
This isn’t about blocking AI. It’s about shifting security left.
2. Enforce Security-Aware Prompting
The prompt shapes the output.
Instead of “Write a function to process webhooks,” use:
“Write a secure webhook handler in Python using FastAPI. Validate the signature using HMAC-SHA256, sanitize all inputs, enforce rate limiting, and log all requests. Use environment variables for secrets. Follow OWASP API Security Top 10.”
Explicit prompts produce safer code. They force the model to consider constraints it wouldn’t otherwise.
Train your team on secure prompting patterns. Make them part of onboarding and code reviews.
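For reference, the signature check that prompt asks for looks roughly like this. This is a stdlib-only sketch of the HMAC verification step (the header name, secret variable, and payload are assumptions; the FastAPI routing and rate limiting from the prompt are omitted):

```python
import hashlib
import hmac
import os

# Secret loaded from the environment, never hardcoded.
# The "dev-only-secret" fallback is for local illustration only.
WEBHOOK_SECRET = os.environ.get("WEBHOOK_SECRET", "dev-only-secret").encode()

def verify_signature(body: bytes, signature_header: str) -> bool:
    # Recompute the HMAC-SHA256 of the raw request body and compare with
    # compare_digest, which is constant-time and resists timing attacks.
    expected = hmac.new(WEBHOOK_SECRET, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)

body = b'{"event": "invoice.paid"}'
good = hmac.new(WEBHOOK_SECRET, body, hashlib.sha256).hexdigest()
print(verify_signature(body, good))      # True — legitimate sender
print(verify_signature(body, "f" * 64))  # False — forged signature rejected
```

Unprompted, an LLM will often skip the constant-time comparison or, worse, skip signature verification entirely.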
3. Add Dynamic Testing to CI/CD
SAST is not enough.
Integrate dynamic application security testing (DAST) into your pipeline. Tools like Autonoma generate end-to-end tests that simulate real attack patterns.
At Eleven11, we combine automated scanning with manual red teaming to uncover logic flaws that static tools miss. For vibe-coded apps, this is essential.
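A toy illustration of why dynamic testing catches what static analysis cannot (the order store and probe are hypothetical, standing in for a real authenticated HTTP client): the handler below has the missing-ownership-check flaw, an insecure direct object reference, that no pattern matcher will flag, yet a simple attacker-style probe finds it immediately.

```python
# Toy "application": fetch an order by id, with the ownership check
# the generated code forgot — a classic IDOR.
ORDERS = {
    1: {"owner": "alice", "total": 120},
    2: {"owner": "bob", "total": 340},
}

def get_order(order_id: int, current_user: str) -> dict:
    return ORDERS[order_id]  # missing: verify current_user owns the order

def idor_probe(as_user: str) -> list[int]:
    # Dynamic test: authenticate as one user, iterate over record ids,
    # and report every record returned that the caller does not own.
    leaks = []
    for order_id in ORDERS:
        if get_order(order_id, current_user=as_user)["owner"] != as_user:
            leaks.append(order_id)
    return leaks

print(idor_probe("alice"))  # [2] — bob's order is readable by alice
```

In CI, the same probe shape runs against a staging deployment with real HTTP requests; the principle is identical.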
4. Audit AI-Suggested Dependencies
LLMs don’t just write code—they recommend libraries.
We’ve seen AI suggest requests==2.25.1 (which has known CVEs) instead of the latest version. We’ve seen it recommend flask-talisman but disable HSTS in the configuration.
Every AI-suggested dependency is a potential supply chain risk.
Run software composition analysis (SCA) on all new dependencies—especially those introduced via AI. Consider maintaining an approved library list for common tasks.
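A minimal sketch of the allowlist idea (the library names and minimum versions here are assumptions for illustration; in practice you would back this with a real SCA tool and a CVE feed):

```python
# Hypothetical approved-library list with minimum safe versions.
APPROVED = {
    "requests": (2, 31, 0),
    "flask": (2, 3, 0),
}

def check_dependency(requirement: str) -> str:
    # Parse a pinned requirement like "requests==2.25.1" and flag
    # anything unapproved or pinned below the minimum version.
    name, _, version = requirement.partition("==")
    if name not in APPROVED:
        return f"BLOCK {name}: not on the approved list"
    pinned = tuple(int(part) for part in version.split("."))
    if pinned < APPROVED[name]:
        floor = ".".join(map(str, APPROVED[name]))
        return f"BLOCK {requirement}: below minimum {floor}"
    return f"OK {requirement}"

print(check_dependency("requests==2.25.1"))  # blocked: pinned below the floor
print(check_dependency("requests==2.31.0"))  # allowed
print(check_dependency("leftpad==1.0.0"))    # blocked: not approved
```

Wired into CI, a check like this turns "the AI suggested it" from an implicit approval into an explicit review gate.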
5. Monitor for Exploitation Patterns
Once deployed, watch for signs of abuse.
- Are there unexpected database queries?
- Are files being written to unusual paths?
- Are there spikes in failed auth attempts?
Use observability tools to establish baselines and detect anomalies. At Eleven11, we build custom alerts for common AI-generated flaws—like unparameterized SQL queries or missing auth checks.
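The baseline-and-anomaly idea can be sketched in a few lines (the window size, threshold, and sample counts are illustrative assumptions; production monitoring would use your observability stack rather than an in-process class):

```python
import statistics
from collections import deque

class FailedAuthMonitor:
    """Rolling baseline of per-minute failed-login counts.

    Alerts when the current count exceeds mean + 3 standard deviations,
    a simple stand-in for a real anomaly-detection rule.
    """

    def __init__(self, window: int = 60):
        self.history = deque(maxlen=window)

    def observe(self, failures_this_minute: int) -> bool:
        alert = False
        if len(self.history) >= 10:  # require a minimum baseline first
            mean = statistics.mean(self.history)
            stdev = statistics.pstdev(self.history) or 1.0
            alert = failures_this_minute > mean + 3 * stdev
        self.history.append(failures_this_minute)
        return alert

monitor = FailedAuthMonitor()
for count in [3, 4, 2, 5, 3, 4, 3, 2, 4, 3]:  # quiet baseline
    monitor.observe(count)
print(monitor.observe(4))   # False — within the normal range
print(monitor.observe(90))  # True — credential-stuffing-sized spike
```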
6. Own the Process, Not Just the Output
Engineering leaders must own the process, not just the deliverables.
Ask in sprint reviews and architecture meetings:
- Who generated this code?
- How was it validated?
- What attack surface does it introduce?
These should be standard questions—just like scalability or reliability.
The Bottom Line
Vibe coding isn’t going away. It’s too useful, too fast, too embedded in how teams work.
But speed without safety is technical arson.
The vulnerabilities from AI-generated code aren’t hypothetical. We’ve audited companies that shipped code with exposed API keys, insecure deserialization, and broken access controls—all in code that “worked fine” during testing.
The answer isn’t to ban AI. It’s to treat AI-generated code like any other high-risk component: review it, test it, monitor it, and own it.
Because infrastructure debt doesn’t care how the code was written. It only cares that it’s there.
And when the incident happens—because it will—the question won’t be, “Did the AI make a mistake?”
It’ll be, “Why didn’t you catch it?”