
Securing AI-Generated Code Through Runtime Verification

  • Writer: Nestria AI Research Team
  • Sep 19, 2025
  • 2 min read

Overview

AI-driven code generation tools such as GitHub Copilot and Claude Code are transforming software development. They enable faster prototyping, accelerate development cycles, and broaden access to programming. However, this innovation also brings a challenge: trustability, the assurance that AI-generated code meets expectations for reliability, security, and compliance.

The Problem

Despite their benefits, AI coding assistants produce code that often lacks provenance and may embed insecure patterns, and the assistants themselves can be manipulated through prompt injection and other adversarial attacks. Traditional safeguards fall short:

  • Static analysis struggles with unconventional AI-generated structures.

  • Dynamic testing cannot anticipate the full range of execution paths.

These limitations create blind spots, particularly concerning in sensitive and regulated industries such as finance, healthcare, and critical infrastructure.

The Scale of the Challenge

In our research analyzing 1,247 AI-generated programs across multiple safety-critical domains, we identified more than 800 vulnerabilities, including memory errors, input validation flaws, and timing side-channels. Notably, many of these vulnerabilities bypassed existing verification methods, and developers often accepted the AI-generated suggestions with misplaced confidence, compounding the risk.

Our Solution: Runtime Verification

To address this gap, Nestria AI developed Copilot-RV, a runtime verification framework that continuously checks programs against formally defined safety, liveness, and security properties. Unlike static approaches, runtime verification ensures violations are caught during actual execution.
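To make the idea concrete, the sketch below shows one minimal way a runtime monitor can enforce a declared property on an AI-generated function. The decorator, the property format, and the example function are illustrative assumptions for exposition, not the Copilot-RV API.

```python
# Illustrative runtime-verification monitor; the decorator and property
# format below are assumptions for exposition, not the Copilot-RV API.
from functools import wraps

class PropertyViolation(Exception):
    """Raised the moment an executing program breaks a declared property."""

def monitor(*properties):
    """Wrap a function so each (name, predicate) pair is checked on every call."""
    def decorate(fn):
        @wraps(fn)
        def checked(*args, **kwargs):
            result = fn(*args, **kwargs)
            for name, predicate in properties:
                if not predicate(args, kwargs, result):
                    raise PropertyViolation(f"{fn.__name__}: property '{name}' violated")
            return result
        return checked
    return decorate

# Safety property: the returned index must fall inside the input string.
in_bounds = ("index_in_bounds",
             lambda args, kwargs, result: 0 <= result < len(args[0]))

@monitor(in_bounds)
def find_delimiter(text: str, delimiter: str = ",") -> int:
    # Stand-in for an AI-generated helper; it returns -1 on a miss,
    # which silently violates the bounds property.
    return text.find(delimiter)

find_delimiter("a,b")       # 1 is inside the input: passes
find_delimiter("no comma")  # -1 escapes the bounds: raises PropertyViolation
```

In a full framework, such properties would cover memory safety, input validation, and timing behavior, and the monitors would be derived from formal specifications rather than written by hand.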

Key outcomes from our evaluation include:

  • 94.3% detection of injected vulnerabilities

  • Minimal runtime overhead of 3.7%

  • Sub-millisecond detection latency for most flaws

  • 73% reduction in manual verification effort through a hybrid approach combining static, runtime, and selective formal checks (a workflow sketch follows this list)
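
As an illustration of how such a hybrid workflow can be staged (the rules, snippet names, and routing below are assumptions for exposition, not Copilot-RV internals): inexpensive static checks screen every generated snippet first, snippets that pass run under runtime monitors, and only monitor violations are escalated to the heavier formal checks.

```python
# Illustrative hybrid triage pipeline; the stages, rules, and snippet names
# are assumptions for exposition, not Copilot-RV internals.
import ast

def static_check(source: str) -> list[str]:
    """Cheap first pass: flag obviously unsafe constructs in generated code."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and getattr(node.func, "id", "") in {"eval", "exec"}:
            findings.append(f"line {node.lineno}: dynamic code execution")
    return findings

def triage(snippets: dict[str, str]) -> dict[str, str]:
    """Reject snippets with static findings; everything else runs under runtime
    monitors, and any monitor violation would escalate to formal review."""
    decisions = {}
    for name, source in snippets.items():
        findings = static_check(source)
        decisions[name] = (
            "reject: " + "; ".join(findings) if findings
            else "deploy under runtime monitors"
        )
    return decisions

print(triage({
    "parse_config": "import json\ndef parse_config(s):\n    return json.loads(s)",
    "run_expr": "def run_expr(s):\n    return eval(s)",
}))
```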

[Figure: Copilot-RV, Nestria AI's runtime verification framework]

Why It Matters

Runtime verification bridges the gap between the speed of AI-assisted development and the assurance required by enterprises. It strengthens trustability, enabling organizations to harness AI productivity while meeting security and compliance demands.

Recognition

We are proud that this research has been peer-reviewed and accepted for presentation at the International Conference on Artificial Intelligence and Cybersecurity (Osaka, November 2025). This recognition affirms the importance of runtime verification in enabling enterprises to confidently embrace AI-driven software development.
