m9751/agent-operating-framework

Agent Operating Framework

License: MIT

An operating framework for AI coding agents, refined through ~12 months of private enterprise-sales workflow and ~6 weeks of public iteration.

Quick Start

Copy AGENT_FRAMEWORK.md to your project root as CLAUDE.md. Fill in Section 0 with your project specifics. Start a session.

cp AGENT_FRAMEWORK.md /path/to/your/project/CLAUDE.md

See guides/getting-started.md for the full adoption path.

Who This Is For

You've set up CLAUDE.md. You've built a few skills. You're using Projects Memory. But outputs are still inconsistent, the agent ignores rules under pressure, and you're manually reviewing everything.

This framework is the next step. It adds rules with documented enforcement contracts (some advisory by design), circuit breakers (stop after 3 failures), and an escalation model (advice → law → barriers) that makes your CLAUDE.md actually stick. See the rule-to-hook coverage matrix for what is system-enforced versus advisory in v1.5 — five of six rules ship with hooks; one (no-local-infrastructure) is a decision framework that is advisory by design.
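The three-failure circuit breaker can be pictured with a minimal sketch like the one below. It is an illustration only, not the framework's implementation: the `CircuitBreaker` and `run_with_breaker` names are hypothetical, and resetting the count after a success is an assumption the text above does not state.

```python
class CircuitBreaker:
    """Stops retries of one approach after a fixed number of consecutive failures."""

    def __init__(self, max_failures: int = 3) -> None:
        self.max_failures = max_failures
        self.failures = 0

    @property
    def is_open(self) -> bool:
        # "Open" means tripped: no further attempts are allowed.
        return self.failures >= self.max_failures

    def record(self, succeeded: bool) -> None:
        # Assumption: a success resets the count; a failure adds one.
        self.failures = 0 if succeeded else self.failures + 1


def run_with_breaker(attempt, breaker: CircuitBreaker):
    """Call attempt() until it succeeds or the breaker trips."""
    while not breaker.is_open:
        try:
            result = attempt()
        except Exception:
            breaker.record(False)
            continue
        breaker.record(True)
        return result
    raise RuntimeError(
        f"stopped after {breaker.failures} failures; research before retrying"
    )
```

With the default limit, an approach that keeps failing is attempted three times and then the loop raises instead of trying a fourth variation.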

If you're just getting started with Claude Code, read the beginner guides first. If you've hit the wall where your CLAUDE.md "stops working," start here.

| I am… | Start here |
| --- | --- |
| New to Claude Code | guides/getting-started.md |
| CLAUDE.md "stopped working" | guides/from-beginner-to-framework.md |
| Want copy-paste rules | examples/claude-code-rules/ |
| Want hooks | examples/hooks/ |
| Want the full framework file | AGENT_FRAMEWORK.md |
| Curious what failures produced this | INCIDENTS.md |

What This Is

A behavioral operating system for Claude Code that combines:

  • Project identity — role, output contracts, quality criteria, session lifecycle
  • Evidence-first culture — four gates: read before touching, first-time check, evidence card, no guessing
  • Circuit breakers — three-failure stop, scope discipline, delivery protocol
  • Quality gates — verification before done, post-delivery checklist, HTML token hygiene
  • Enforcement architecture — memory (advice) → rules (law) → hooks (barriers)
  • Self-improvement — capture lessons, escalate failures, consolidate when rules accumulate

Claude Code note: The hook implementations in examples/hooks/ use Claude Code's PreToolUse/PostToolUse lifecycle. Rule prose is portable to any agent platform.

Every rule exists because its absence caused a specific, documented failure. See INCIDENTS.md for the log.

Library Contents

The Framework

Guides

Copy-Paste Rules

Individual rule files for ~/.claude/rules/ or .claude/rules/. Each absorbs multiple earlier rules into a single file with sub-gates:

| Rule | What It Prevents |
| --- | --- |
| read-before-acting.md | Guessing instead of reading (5 gates + three-failure stop) |
| scope-discipline.md | Over-engineering, unapproved dependencies, building what already exists, remediating dormant code |
| session-lifecycle.md | Cold starts, plan-mode violations, sessions that end without auditing delivery |
| delivery-protocol.md | Scattered deliverables, skipped checklists, token-wasteful HTML iterations |
| no-local-infrastructure.md | Persistent agents on the user's laptop instead of cloud-hosted solutions |
| secure-configuration.md | Config file overwrites, secrets in chat, wrong credentials on wrong system |

Hook Examples

Shell scripts that enforce rules at the tool-call level — the third tier of the enforcement ladder. Copy to your hooks directory and configure in settings.json:

| Hook | Type | What It Enforces |
| --- | --- | --- |
| read-gate.sh | PreToolUse hard block | Blocks writes unless the target resource was read first |
| search-gate.sh | PreToolUse hard block | Blocks code creation unless a search was done first |
| secure-config-gate.sh | PreToolUse hard block | Blocks secret patterns in any tool call + Write to protected config paths |
| dormant-code-gate.sh | CI lint hard block | Rejects PRs that modify files whose every extracted symbol has zero callers elsewhere (scope-discipline Gate 5) |
| delivery-gate.sh | PreToolUse advisory | Reminds agent to log deliverables (fail-open) |
| focus-breadcrumb.sh | UserPromptSubmit | Writes a session breadcrumb when an explicit task is detected (companion to focus-confirmation-gate) |
| focus-confirmation-gate.sh | PreToolUse advisory | Warns when first Edit/Write/Bash fires with no focus breadcrumb (session-lifecycle Phase 1) |
| deprecated-field-gate.sh | PreToolUse hard block | Template for blocking writes that reference deprecated DB columns or API fields |
| empty-rule-body-gate.sh | CI meta-hook hard block | Pre-merge gate rejecting rule files < 200 bytes or missing ## Why (closes the empty-stub loophole) |

See §5.3 Rule-to-Hook Coverage for which rule each hook backs and the honest enforced-vs-advisory accounting.

See examples/hooks/README.md for setup instructions and the breadcrumb pattern.
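To make the hard-block idea concrete, here is a rough Python sketch of the decision a read gate has to make. It is not the repository's read-gate.sh. It assumes the hook receives a JSON payload carrying `tool_name` and `tool_input.file_path`, and that exit code 2 blocks the call while 0 allows it. The `decide` and `handle` names, and the in-memory `read_paths` set, are hypothetical.

```python
import json
import sys

WRITE_TOOLS = {"Write", "Edit"}  # assumed names of tools that modify files


def decide(payload: dict, read_paths: set) -> tuple:
    """Return (exit_code, message) for one tool call.

    Exit code 0 allows the call; 2 blocks it (assumed hook contract).
    """
    tool = payload.get("tool_name", "")
    path = payload.get("tool_input", {}).get("file_path", "")
    if tool == "Read" and path:
        read_paths.add(path)  # remember what has been read so far
        return 0, ""
    if tool in WRITE_TOOLS and path and path not in read_paths:
        return 2, f"Blocked: read {path} before writing to it."
    return 0, ""


def handle(raw: str, read_paths: set) -> int:
    """Parse a raw JSON payload, report any block reason on stderr, return the exit code."""
    code, message = decide(json.loads(raw), read_paths)
    if message:
        print(message, file=sys.stderr)
    return code
```

A script wired up as a hook could end with `sys.exit(handle(sys.stdin.read(), read_paths))`, where `read_paths` is loaded from and saved to a per-session file, since each hook invocation is a separate process. This sketch also blocks writes to files that do not exist yet, so a real gate would likely need an exemption for new files.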

Incident Log

  • INCIDENTS.md — 33 sanitized incidents linking real failures to the rules they produced. Month-precision dates.

The Key Insight

Most agent failures come from the same root cause: acting without reading. The agent guesses a column name instead of checking the schema. It deploys with assumed config instead of reading the setup guide. It tries a fourth variation of a broken approach instead of stopping to research.

Prompt instructions are not enforcement. They are guidance. The only durable approach is an escalation ladder:

Memory (advice) → Rules (law) → Hooks (barriers)

Prose tells the model what it should do. Gates determine what it is allowed to do.

This framework's core principle: one read is worth ten guesses.
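The ladder can be modeled as an ordered set of tiers that a lesson climbs when it keeps failing. The sketch below is only an illustration: the `Tier`, `escalate`, and `tier_after` names are hypothetical, and promoting one rung per recurrence is an assumption, since the text does not say what triggers an escalation.

```python
from enum import IntEnum


class Tier(IntEnum):
    MEMORY = 1  # advice: the agent may ignore it
    RULE = 2    # law: written down, but still prose
    HOOK = 3    # barrier: enforced at the tool call


def escalate(tier: Tier) -> Tier:
    """Move a lesson one rung up the ladder; HOOK is the top rung."""
    return Tier(min(tier + 1, Tier.HOOK))


def tier_after(recurrences: int) -> Tier:
    """Tier a lesson reaches after recurring `recurrences` times (assumed policy)."""
    tier = Tier.MEMORY
    for _ in range(recurrences):
        tier = escalate(tier)
    return tier
```

Under this assumed policy, a lesson that recurs once becomes a rule, and one that recurs again becomes a hook and stays there.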

Self-Applied Measurement

This framework grades itself. An eval harness in examples/evals/ walks session handoffs from the author's own workflow and scores each one on four deterministic metrics — rule adherence, plan-delivery gap, cost, and dispatch quality — then publishes the trend to a live dashboard:

Dashboard: aof-eval.vercel.app

The harness is the framework's own credibility test: if the rules work, the scores hold. If a regression slips into v1.6, the trend line moves before anyone writes a postmortem. Numbers come from 47 real sessions backfilled at v1.5 ship (mean composite 9.20/10). Re-run manually with python -m examples.evals.run_harness whenever a new batch of sessions lands.
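As a rough picture of how four per-session metrics could roll up into a composite out of 10, consider the sketch below. It is not the harness in examples/evals/. The metric key names, the 0 to 10 scale per metric, and the unweighted mean are all assumptions made for illustration, as the text does not describe the actual scoring formula.

```python
from statistics import mean

# Hypothetical keys for the four metrics named above.
METRICS = ("rule_adherence", "plan_delivery_gap", "cost", "dispatch_quality")


def composite(scores: dict) -> float:
    """Unweighted mean of the four metric scores for one session (assumed 0-10 each)."""
    missing = [m for m in METRICS if m not in scores]
    if missing:
        raise KeyError(f"missing metrics: {missing}")
    return mean([scores[m] for m in METRICS])


def mean_composite(sessions: list) -> float:
    """Average composite across sessions, rounded to two decimals."""
    return round(mean([composite(s) for s in sessions]), 2)
```

Because every step is plain arithmetic on recorded values, the same session handoffs always produce the same score, which is what makes a trend line across versions comparable.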

Author

Built and maintained by Michael Busacca — 13+ years enterprise SaaS, running AI-assisted workflows in a high-volume sales context.

Changes

See CHANGELOG.md for version history.
