Code verification for AI agents
Run AI-generated code in isolated Docker containers before it reaches your users.
Catch runtime errors, security issues, and AI hallucinations — locally, for free.

```bash
npx proof-ai check ./generated.py
# ✓ Verification passed — 1/1 blocks, no issues
```

Every AI coding tool (Cursor, Copilot, Devin, ChatGPT) generates code that looks right but might:
- Not run: missing imports, wrong syntax, hallucinated packages
- Be insecure: hardcoded API keys, `eval()`, SQL injection
- Be incomplete: `pass`, `// TODO`, placeholder values
You shouldn't ship code you haven't tested. Proof tests it for you.
| | Proof | Guardrails AI | Manual review |
|---|---|---|---|
| Runs the code | Yes (Docker sandbox) | No | Sometimes |
| Catches runtime errors | Yes | No | Maybe |
| Free & local | Yes (just Docker) | Cloud API | Yes |
| Zero API keys | Yes | Requires key | Yes |
| CI-native | Exit code 0/1 | SDK only | Manual |
| AI-specific rules | 10 built-in | Structured output | None |
- Node.js >= 18
- Docker (for sandbox execution) — Install Docker
Docker is optional. Without it, Proof still runs 10 built-in rules + syntax checks. Docker adds actual code execution in isolated containers.

```bash
npm install proof-ai
```

```bash
# Verify a file
npx proof-ai check ./script.py

# Verify a code string
npx proof-ai verify --code 'print("hello")' --language python

# Verify an LLM response (all code blocks)
npx proof-ai verify --text "$(cat response.md)"

# Pipe from stdin
echo 'console.log("hi")' | npx proof-ai verify --stdin --language javascript

# JSON output (for CI)
npx proof-ai check ./script.py --json

# Skip sandbox (rules + syntax only)
npx proof-ai check ./script.py --sandbox none

# Only security rules
npx proof-ai check ./script.py --rules security
```

```typescript
import { verify } from "proof-ai";

// Verify a code string
const result = await verify({
  code: 'import pandas as pd\ndf = pd.read_csv("data.csv")',
  language: "python",
});

if (!result.passed) {
  for (const issue of result.issues) {
    console.log(`[${issue.severity}] ${issue.message}`);
  }
}
```

```typescript
// Verify all code blocks in an LLM response
const result = await verify({
  text: llmResponse,
  sandbox: "docker",
  rules: "all",
});

console.log(`${result.stats.passedBlocks}/${result.stats.totalBlocks} blocks passed`);
```

```typescript
// Verify a file
const result = await verify({ file: "./generated.py" });
```

| Option | Type | Default | Description |
|---|---|---|---|
| `code` | `string` | — | Code string to verify |
| `text` | `string` | — | Markdown/text containing fenced code blocks |
| `file` | `string` | — | Path to a file to verify |
| `language` | `string` | auto-detect | `python`, `javascript`, `typescript` |
| `sandbox` | `boolean \| "docker" \| "e2b"` | `true` | Sandbox provider (auto-detect, force Docker/E2B, or disable) |
| `rules` | `string \| Rule[] \| false` | `"all"` | `"all"`, `"security"`, `"ai-mistakes"`, `"code-quality"`, custom array, or `false` |
| `install` | `string[]` | — | Packages to install in sandbox (e.g., `["pandas"]`) |
| `env` | `Record<string, string>` | — | Environment variables for sandbox |
| `timeout` | `number` | `30` | Sandbox timeout in seconds |
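
To make the defaults in the options table concrete, here is a minimal sketch of how they could be applied. The type and function names (`VerifyOptionsSketch`, `withDefaults`, `inputKind`) are hypothetical and are not part of the proof-ai API. Custom rule arrays are left out for brevity, and the "exactly one input" check is an assumption of this sketch rather than documented behavior.

```typescript
// Hypothetical types modelled on the options table; not the library's real declarations.
type SandboxOption = boolean | "docker" | "e2b";
type RulesOption = "all" | "security" | "ai-mistakes" | "code-quality" | false;

interface VerifyOptionsSketch {
  code?: string;
  text?: string;
  file?: string;
  language?: "python" | "javascript" | "typescript";
  sandbox?: SandboxOption;
  rules?: RulesOption;
  install?: string[];
  env?: Record<string, string>;
  timeout?: number; // seconds
}

// Fill in the defaults listed in the table: sandbox true, rules "all", timeout 30.
function withDefaults(options: VerifyOptionsSketch) {
  return {
    ...options,
    sandbox: options.sandbox ?? true,
    rules: options.rules ?? "all",
    timeout: options.timeout ?? 30,
  };
}

// Assumption of this sketch: exactly one of code, text, or file is the input.
function inputKind(options: VerifyOptionsSketch): "code" | "text" | "file" {
  const kinds = (["code", "text", "file"] as const).filter((k) => options[k] !== undefined);
  if (kinds.length !== 1) throw new Error("Provide exactly one of code, text, or file");
  return kinds[0];
}

// Only `timeout` is overridden here; `sandbox` and `rules` fall back to their defaults.
const resolved = withDefaults({ code: 'print("hello")', language: "python", timeout: 10 });
```

Using `??` rather than `||` keeps an explicit `rules: false` or `sandbox: false` intact, since only `undefined` and `null` trigger the fallback.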
Proof ships with 10 built-in rules across 3 categories:

Security rules:

| Rule | What it catches |
|---|---|
| `security/no-hardcoded-secrets` | API keys, tokens, passwords in code |
| `security/no-dangerous-operations` | `eval()`, `exec()`, `os.system()`, `rm -rf` |
| `security/no-sql-injection` | SQL built with string concatenation |

AI-mistake rules:

| Rule | What it catches |
|---|---|
| `ai-mistakes/no-placeholder-values` | `YOUR_API_KEY`, `REPLACE_ME`, `example.com` |
| `ai-mistakes/no-incomplete-code` | `# rest of implementation`, standalone `...`, `pass` |
| `ai-mistakes/no-mixed-syntax` | Python `def` in JS, `const` in Python, `console.log` in Python |
| `ai-mistakes/no-hallucinated-imports` | Commonly hallucinated package names |

Code-quality rules:

| Rule | What it catches |
|---|---|
| `code-quality/balanced-brackets` | Unclosed `()`, `[]`, `{}` |
| `code-quality/no-unused-imports` | Imported names not used in code |
| `code-quality/no-empty-blocks` | Empty function bodies, empty catch blocks |
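
As a rough illustration of what a pattern-based check can look like, the sketch below tests each source line against a list of regular-expression rules and records the line number of every match. Everything in it (`PatternRule`, `Issue`, `runRules`, the `sketch/no-eval` rule and its pattern, the `"warning"` severity) is hypothetical; the built-in rules and the real rule engine may work differently.

```typescript
// Hypothetical shapes for this sketch; the real rule engine may differ.
interface PatternRule {
  id: string;
  pattern: RegExp;
  message: string;
  severity: "error" | "warning";
}

interface Issue {
  ruleId: string;
  line: number; // 1-based line number of the match
  message: string;
  severity: "error" | "warning";
}

// Test each source line against each rule and record where it matched.
function runRules(code: string, rules: PatternRule[]): Issue[] {
  const issues: Issue[] = [];
  code.split("\n").forEach((text, index) => {
    for (const rule of rules) {
      // Rebuild the pattern without g/y flags so lastIndex state never leaks between lines.
      const pattern = new RegExp(rule.pattern.source, rule.pattern.flags.replace(/[gy]/g, ""));
      if (pattern.test(text)) {
        issues.push({ ruleId: rule.id, line: index + 1, message: rule.message, severity: rule.severity });
      }
    }
  });
  return issues;
}

// An illustrative rule in the spirit of security/no-dangerous-operations (the pattern is an assumption).
const noEval: PatternRule = {
  id: "sketch/no-eval",
  pattern: /\beval\s*\(/,
  message: "eval() executes arbitrary code",
  severity: "error",
};

// One issue is reported, on line 2.
const found = runRules("x = 1\nresult = eval(user_input)\n", [noEval]);
```

Line-by-line regex matching is cheap and needs no parser, but it cannot see context: a match inside a string literal or comment is reported just like a real call.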

```typescript
import { verify, defineRule } from "proof-ai";

const noFetch = defineRule({
  id: "custom/no-fetch",
  name: "No fetch calls",
  pattern: /\bfetch\s*\(/,
  message: "Direct fetch() calls are not allowed — use the API client",
  severity: "error",
  suggestion: "Use apiClient.get() instead of fetch()",
});

const result = await verify({
  code: myCode,
  language: "typescript",
  rules: [noFetch],
});
```

List all rules from the CLI:

```bash
npx proof-ai rules list
```

When Docker is available, Proof runs code with hardened security defaults:
- No network access: containers run with `--network=none`
- Memory limit: 256MB max (`--memory=256m`)
- CPU limit: 0.5 CPUs (`--cpus=0.5`)
- Process limit: 64 PIDs (`--pids-limit=64`)
- Read-only filesystem: `--read-only` with tmpfs for `/tmp`
- No privilege escalation: `--security-opt=no-new-privileges`
- Auto-cleanup: containers are created with `--rm`

Network access is enabled only temporarily, and only when `install` packages are specified.
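
The flags listed above can be assembled into a single `docker run` argument list. The sketch below shows one way to do that. The function name `buildDockerArgs`, the `allowNetwork` parameter, the `--tmpfs /tmp` mount form, and the `python:3.12-slim` image are assumptions for illustration, not Proof's actual implementation.

```typescript
// Build `docker run` arguments from the hardening flags listed above.
// The image, the tmpfs mount form, and this function's name are assumptions of the sketch.
function buildDockerArgs(image: string, command: string[], allowNetwork = false): string[] {
  const args = [
    "run",
    "--rm",                             // auto-cleanup
    "--memory=256m",                    // memory limit
    "--cpus=0.5",                       // CPU limit
    "--pids-limit=64",                  // process limit
    "--read-only",                      // read-only root filesystem
    "--tmpfs", "/tmp",                  // writable scratch space
    "--security-opt=no-new-privileges", // no privilege escalation
  ];
  // Network stays off unless the caller explicitly needs it (for example, to install packages).
  if (!allowNetwork) args.push("--network=none");
  return [...args, image, ...command];
}

const args = buildDockerArgs("python:3.12-slim", ["python", "-c", 'print("hello")']);
```

Passing the arguments as an array (for example to `child_process.spawn("docker", args)`) avoids shell quoting problems with the code being executed.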

```yaml
name: Verify AI Code
on: [push, pull_request]
jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm install proof-ai
      - name: Verify generated code
        run: npx proof-ai check ./src/generated/*.py --json
```

```bash
# .husky/pre-commit
npx proof-ai check ./src/generated/ --sandbox none
```

```typescript
import { verify } from "proof-ai";

async function generateAndVerify(prompt: string) {
  const llmResponse = await callLLM(prompt);

  const result = await verify({
    text: llmResponse,
    sandbox: "docker",
    rules: "all",
  });

  if (!result.passed) {
    // Retry, flag for review, or use a different model
    console.log("Issues found:", result.issues);
    return null;
  }

  return llmResponse;
}
```

For serverless environments (Vercel, Cloudflare Workers) where Docker isn't available, Proof supports E2B as a cloud sandbox:

```bash
npm install @e2b/code-interpreter
export E2B_API_KEY=your_key
```

```typescript
const result = await verify({
  code: myCode,
  language: "python",
  sandbox: "e2b",
});
```

```bash
npx proof-ai doctor
```

```
Proof Doctor
─────────────────────────────────
✓ Docker is available
○ E2B not configured (optional)
✓ Node.js v20.10.0
✓ Ready to verify code!
```
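
The `sandbox` option accepts `true` (auto-detect), `false`, `"docker"`, or `"e2b"`. The sketch below shows one plausible way to resolve that option to a provider. The function name `pickSandbox` and the preference order (Docker first, then E2B when `E2B_API_KEY` is set, otherwise no sandbox) are assumptions; the source does not document the actual auto-detection logic.

```typescript
type SandboxChoice = "docker" | "e2b" | "none";

// Resolve the `sandbox` option to a provider.
// Assumed order for auto-detect: Docker if available, else E2B if a key is set, else none.
function pickSandbox(
  requested: boolean | "docker" | "e2b",
  dockerAvailable: boolean,
  env: Record<string, string | undefined>,
): SandboxChoice {
  if (requested === false) return "none";                               // explicitly disabled
  if (requested === "docker" || requested === "e2b") return requested;  // forced provider
  if (dockerAvailable) return "docker";
  if (env["E2B_API_KEY"]) return "e2b";
  return "none"; // rules and syntax checks only
}

// A serverless host without Docker but with an E2B key falls through to "e2b".
const choice = pickSandbox(true, false, { E2B_API_KEY: "your_key" });
```

Taking `dockerAvailable` and `env` as parameters keeps the function pure, so the decision can be tested without a Docker daemon or real environment variables.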

```
proof-ai/
├── src/
│   ├── index.ts          # Public API exports
│   ├── verify.ts         # Main verification pipeline
│   ├── extract.ts        # Code block extraction from markdown
│   ├── syntax.ts         # Offline syntax checking
│   ├── report.ts         # Terminal output formatting
│   ├── cli.ts            # CLI (proof verify, check, rules, doctor)
│   ├── sandbox/
│   │   ├── docker.ts     # Docker sandbox (default)
│   │   ├── e2b.ts        # E2B cloud sandbox (optional)
│   │   └── index.ts      # Auto-detection
│   └── rules/
│       ├── engine.ts     # Rule matching engine
│       ├── define.ts     # defineRule() helper
│       └── builtin/      # 10 built-in rules
│           ├── security.ts
│           ├── ai-mistakes.ts
│           └── code-quality.ts
├── test/                 # 72 tests
├── bin/proof.js          # CLI entry point
└── package.json          # 2 dependencies (chalk, commander)
```

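
The tree lists `extract.ts` as the place where code blocks are pulled out of markdown. As a minimal sketch of that idea, and not the project's actual code, the function below walks a markdown string line by line and collects the contents of each fenced block along with its language tag. The names `CodeBlock` and `extractCodeBlocks` are hypothetical, and a fence that is never closed is simply dropped.

```typescript
interface CodeBlock {
  language: string | null; // tag after the opening fence, if any
  code: string;
}

// Three backticks, built at runtime so this example can itself sit inside a fenced block.
const FENCE = "`".repeat(3);

// Collect the contents of every closed fenced block in a markdown string.
function extractCodeBlocks(markdown: string): CodeBlock[] {
  const blocks: CodeBlock[] = [];
  let current: { language: string | null; lines: string[] } | null = null;

  for (const line of markdown.split("\n")) {
    if (line.trim().startsWith(FENCE)) {
      if (current === null) {
        // Opening fence: remember the language tag, if present.
        const tag = line.trim().slice(FENCE.length).trim();
        current = { language: tag === "" ? null : tag, lines: [] };
      } else {
        // Closing fence: emit the collected block.
        blocks.push({ language: current.language, code: current.lines.join("\n") });
        current = null;
      }
    } else if (current !== null) {
      current.lines.push(line);
    }
  }
  return blocks;
}

const sample = ["Here is the script:", FENCE + "python", 'print("hi")', FENCE, "Done."].join("\n");
// blocks holds one entry: language "python", code 'print("hi")'
const blocks = extractCodeBlocks(sample);
```

This handles only backtick fences of exactly the common form; tilde fences, longer fences, and nested fences would need a fuller markdown parser.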
We welcome contributions! See CONTRIBUTING.md for guidelines.
MIT - see LICENSE
Built by Altorlab — we use Proof in production to verify every AI-generated code snippet before it reaches our users.