Skip to content

Enforce repair contract at every checkpoint#341

Open
Jhacarreiro wants to merge 2 commits into
openclaw:mainfrom
Jhacarreiro:fix/repair-contract-checkpoints-v2
Open

Enforce repair contract at every checkpoint#341
Jhacarreiro wants to merge 2 commits into
openclaw:mainfrom
Jhacarreiro:fix/repair-contract-checkpoints-v2

Conversation

@Jhacarreiro

@Jhacarreiro Jhacarreiro commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Summary

Update the repair checkpoint contract after maintainer review on #340.

This revision moves the contract from manually injected unsupported fields into the canonical fix artifact contract:

fix_artifact.repair_contract = {
  must_touch: string[],
  match: "any" | "all",
  scope: "every_checkpoint"
}

The checkpoint gate now activates only when fix_artifact.repair_contract is present. It continues to bypass uncertain/incomplete likely_files so stale planning hints do not fail-close otherwise valid repairs.

What changed from the closed #340 approach

- added repair_contract to schema/repair/codex-result.schema.json
- added prompt/output guidance for repair_contract
- added runtime validation in execute-fix validation and review-results
- added deterministic producer support where the edit surface is concrete and bounded
- replaced inline executor parsing with src/repair/repair-checkpoint-contract.ts
- changed executor status parsing to git status --porcelain=v1 -z --untracked-files=all
- added functional tests for any/all, bypass behavior, shape validation, paths with spaces, and rename destinations

This directly addresses the two maintainer blockers from #340:

1. The contract is now part of the canonical schema/output contract instead of unsupported must_touch fields.
2. The parser now consumes porcelain v1 z output, avoiding quoted path parsing problems from plain porcelain text.

Scope

schema/repair/codex-result.schema.json
src/repair/repair-checkpoint-contract.ts
src/repair/execute-fix-artifact.ts
src/repair/execute-fix-validation.ts
src/repair/review-results.ts
src/repair/lib.ts
src/repair/deterministic-automerge-result.ts
src/repair/commit-finding-intake.ts
test/repair/repair-checkpoint-contract.test.ts
test/repair/execute-fix-artifact-source.test.ts

Semantics

repair_contract omitted:
  no checkpoint contract is enforced.

repair_contract.match = "any":
  each checkpoint may commit if at least one must_touch path changed.

repair_contract.match = "all":
  each checkpoint may commit only if every must_touch path changed.

repair_contract.scope = "every_checkpoint":
  the invariant is checked before every checkpoint commit path:
  - initial
  - review-fix
  - base-sync
  - finalize

Validation

corepack pnpm run format:check
corepack pnpm run build:repair
node --test \
  test/repair/repair-checkpoint-contract.test.ts \
  test/repair/execute-fix-artifact-source.test.ts \
  test/repair/deterministic-automerge-result.test.ts \
  test/repair/commit-finding-report.test.ts
corepack pnpm run lint:repair
corepack pnpm run test:repair

Observed result:

format:check: passed
build:repair: passed
focused tests: 19/19 passed
lint:repair: 0 warnings, 0 errors
test:repair: passed

Local repair-worker dry-run proof

I ran the real dist/repair/execute-fix-artifact.js executor locally in --dry-run mode with CODEX_BIN set to a deterministic fake Codex binary. This exercises the executor, subprocess boundary, git checkout, checkpoint helper and dry-run report path without pushing or opening PRs.

The fixture uses the canonical field now accepted by the schema:

{
  "repair_contract": {
    "must_touch": ["docs/repair-contract-proof.md"],
    "match": "any",
    "scope": "every_checkpoint"
  }
}

1. Allowed checkpoint when repair_contract.must_touch is changed

Setup:

fix_artifact.repair_contract.must_touch = ["docs/repair-contract-proof.md"]
fake worker edit = docs/repair-contract-proof.md

Observed result:

exit: 0
report_status: planned
action: open_fix_pr planned
checkpoint commit: 9cdf11e6b647 test: local repair contract e2e proof
git_status after run: clean

2. Rejected checkpoint when only an off-contract file is changed

Setup:

fix_artifact.repair_contract.must_touch = ["docs/repair-contract-proof.md"]
fake worker edit = docs/off-contract-proof.md

Observed result:

exit: 1
head remained at base commit: c88270b81889
working tree: ?? docs/off-contract-proof.md

Error:

Error: repair checkpoint contract rejected initial: no required repair_contract.must_touch file changed; match=any; must_touch=docs/repair-contract-proof.md; changed_files=docs/off-contract-proof.md

Relation to closed PRs

@clawsweeper

clawsweeper Bot commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs maintainer review before merge. Reviewed June 20, 2026, 9:13 PM ET / 01:13 UTC.

Summary
Adds optional fix_artifact.repair_contract schema, prompt, and validation support, then enforces that contract before each repair checkpoint commit using porcelain v1 -z status parsing with focused tests.

Reproducibility: not applicable. this is an implementation PR for a new repair artifact contract. The contributor supplied terminal dry-run proof for the executor allow/block paths rather than a bug reproduction.

Review metrics: 2 noteworthy metrics.

  • Diff Surface: 8 files changed, +313/-6. The PR touches schema, executor, validators, prompt text, and repair tests, so it changes the repair artifact contract rather than one isolated helper.
  • Checkpoint Coverage: 4 checkpoint paths gated. Initial, review-fix, base-sync, and finalize commits now share the new enforcement path.

Root-cause cluster
Relationship: canonical
Canonical: #341
Summary: This PR is the current canonical open branch for the repair checkpoint contract work; earlier repair-contract branches are closed unmerged, and the merged rebase-conflict PR is adjacent rather than a replacement.

Members:

Proposal only: this assessment does not dispatch repair, suppress jobs, mutate sibling items, close, or merge anything.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Get maintainer acceptance on the canonical repair_contract API and every-checkpoint semantics.
  • [P2] Wait for pnpm check to finish on the current head before merge.

Risk before merge

  • [P1] This changes repair executor commit behavior: a too-narrow emitted repair_contract.must_touch set can reject review-fix, base-sync, or finalize checkpoints that would otherwise carry useful repair changes.
  • [P1] At review time, pnpm check was still in progress on the current head, so merge should wait for the normal required check result.

Maintainer options:

  1. Accept Every-Checkpoint Enforcement
    Merge after CI if maintainers want any emitted repair_contract to be a hard invariant at initial, review-fix, base-sync, and finalize checkpoints.
  2. Narrow Activation First
    Before merge, restrict prompt or producer activation if maintainers want later checkpoint fixes to allow changes outside must_touch.
  3. Pause Contract Surface
    If the canonical artifact API is not ready, pause or close this branch and keep current repair checkpoint behavior unchanged.

Next step before merge

  • [P2] Human review is needed because this adds a canonical repair artifact field and executor checkpoint gate; there is no narrow automated repair blocker in the current diff.

Security
Cleared: The diff adds no dependencies, action refs, secret handling, or broader permissions; the security pass found no concrete supply-chain or security-boundary concern.

Review details

Best possible solution:

Land the canonical checkpoint contract only after maintainers accept the every-checkpoint artifact semantics and CI validates the current head; otherwise narrow when the contract is emitted or scoped.

Do we have a high-confidence way to reproduce the issue?

Not applicable; this is an implementation PR for a new repair artifact contract. The contributor supplied terminal dry-run proof for the executor allow/block paths rather than a bug reproduction.

Is this the best way to solve the issue?

Yes for the submitted direction, with maintainer approval. Moving the checkpoint invariant into fix_artifact.repair_contract is cleaner than unsupported ad hoc fields, but the permanent every-checkpoint semantics are a maintainer product/automation decision.

AGENTS.md: found and applied where relevant.

Codex review notes: model internal, reasoning high; reviewed against 1566ad6a0a5b.

Label changes

Label changes:

  • add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes terminal-style dry-run output from the real repair executor with a deterministic fake Codex binary showing both allowed and rejected checkpoint paths.
  • remove rating: 🧂 unranked krab: Current PR rating is rating: 🐚 platinum hermit, so this older rating label is no longer current.
  • remove status: ⏳ waiting on author: Current PR status label is status: 👀 ready for maintainer look.

Label justifications:

  • P2: This is a normal-priority repair-lane automation hardening change with limited blast radius but real workflow impact.
  • merge-risk: 🚨 automation: The executor gate can cause repair/automerge jobs to fail before checkpoint commits when an emitted contract does not match the actual changed files.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes terminal-style dry-run output from the real repair executor with a deterministic fake Codex binary showing both allowed and rejected checkpoint paths.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes terminal-style dry-run output from the real repair executor with a deterministic fake Codex binary showing both allowed and rejected checkpoint paths.
Evidence reviewed

What I checked:

Likely related people:

  • steipete: Git history shows the repair executor/schema files date to dc824915, and the same execute-fix-artifact.ts repair/rebase loop was recently changed by merged fix(repair): harden rebase conflict retries #332. (role: introduced behavior and recent area contributor; confidence: high; commits: dc824915bb6c, 39239e718f9d; files: src/repair/execute-fix-artifact.ts, src/repair/lib.ts, schema/repair/codex-result.schema.json)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal priority bug or improvement with limited blast radius. merge-risk: 🚨 automation 🚨 Merging this PR could break CI, automerge, proof capture, label sync, or automation. labels Jun 19, 2026
@Jhacarreiro Jhacarreiro force-pushed the fix/repair-contract-checkpoints-v2 branch from 05484c7 to da96bec Compare June 20, 2026 07:42
@Jhacarreiro

Copy link
Copy Markdown
Contributor Author

Updated this PR after the maintainer deep review on #340.

Changes now address the two blockers called out there:

- no unsupported top-level must_touch / must_touch_files fields;
- canonical fix_artifact.repair_contract added to schema, prompt/output contract, validators and deterministic producers;
- executor now enforces fix_artifact.repair_contract only when present;
- parser now uses git status --porcelain=v1 -z --untracked-files=all;
- functional tests cover quoted/space paths, rename destinations, any/all semantics and likely_files bypass;
- local dry-run e2e proof rerun with repair_contract, allow -> planned, block -> rejected before checkpoint.

Validation run locally:

corepack pnpm run format:check
corepack pnpm run build:repair
node --test test/repair/repair-checkpoint-contract.test.ts test/repair/execute-fix-artifact-source.test.ts test/repair/deterministic-automerge-result.test.ts test/repair/commit-finding-report.test.ts
corepack pnpm run lint:repair
corepack pnpm run test:repair

All passed.

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@clawsweeper clawsweeper Bot added proof: sufficient Contributor real behavior proof is sufficient. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. and removed status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 20, 2026
@Jhacarreiro

Copy link
Copy Markdown
Contributor Author

v3 update pushed

I updated this PR to address the maintainer blockers from #340 instead of reopening or duplicating that closed PR.

What changed:

- `must_touch` / `must_touch_files` are no longer standalone unsupported fields.
- The contract is now canonical: `fix_artifact.repair_contract` in `schema/repair/codex-result.schema.json`.
- The prompt/output contract documents when to emit it and when to omit it.
- Runtime validation now checks the contract shape in execute-fix validation and review-results.
- Deterministic producers attach the contract only when the expected edit surface is concrete and bounded.
- The executor now reads `git status --porcelain=v1 -z --untracked-files=all`.
- Contract parsing/enforcement lives in `src/repair/repair-checkpoint-contract.ts` with functional tests.

Validation run locally:

corepack pnpm run format:check
corepack pnpm run build:repair
node --test test/repair/repair-checkpoint-contract.test.ts test/repair/execute-fix-artifact-source.test.ts test/repair/deterministic-automerge-result.test.ts test/repair/commit-finding-report.test.ts
corepack pnpm run lint:repair
corepack pnpm run test:repair

Result:

format:check: passed
build:repair: passed
focused tests: 19/19 passed
lint:repair: 0 warnings, 0 errors
test:repair: passed

Local dry-run executor proof with canonical fix_artifact.repair_contract:

allow:
  exit 0
  report_status planned
  action open_fix_pr planned
  checkpoint commit 9cdf11e6b647
  git_status clean

block:
  exit 1
  head remained c88270b81889
  working tree ?? docs/off-contract-proof.md
  error: repair checkpoint contract rejected initial: no required repair_contract.must_touch file changed; match=any; must_touch=docs/repair-contract-proof.md; changed_files=docs/off-contract-proof.md

@Jhacarreiro

Copy link
Copy Markdown
Contributor Author

@clawsweeper re-review

Please review the current head da96beca9e35f42da4d7da482bd622e6ba4909e7 and updated body/comment. This revision addresses the #340 maintainer blockers by moving the checkpoint contract into canonical fix_artifact.repair_contract, schema, prompt/output contract, validators, deterministic producers, and porcelain v1 z parsing.

@Jhacarreiro

Jhacarreiro commented Jun 20, 2026

Copy link
Copy Markdown
Contributor Author

Small implementation-shape note for review: if maintainers prefer a narrower contract activation policy, I can make that adjustment in this PR without changing the overall design.

The intentionally small knobs are:

- keep `repair_contract` schema/prompt/validator/executor semantics as the canonical boundary;
- make deterministic producers more conservative, or remove producer emission entirely if maintainers prefer model/planner-authored contracts only;
- make `match` / `scope` defaults explicit in runtime while preserving the schema contract, if that better matches existing artifact conventions.

One extra architectural motivation for this shape: keeping repair_contract inside the canonical fix_artifact contract gives external repair workers/connectors a stable boundary without adding another execution path inside ClawSweeper.

That keeps the core provider-neutral: ClawSweeper owns schema, validation, git/checkpoint safety, and PR mechanics, while any external worker that can emit the canonical artifact can interoperate through the same contract. This should make the repair lane easier to extend safely without growing backend-specific logic in the core.

Those would be narrow follow-ups on the current head, not a new PR and not a redesign of the checkpoint gate.

@clawsweeper clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels Jun 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-risk: 🚨 automation 🚨 Merging this PR could break CI, automerge, proof capture, label sync, or automation. P2 Normal priority bug or improvement with limited blast radius. proof: sufficient Contributor real behavior proof is sufficient. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant