fix(proof): decouple supplied label from nudges #319

Merged
steipete merged 2 commits into openclaw:main from hannesrudolph:codex/decouple-pr-context-proof on Jun 19, 2026

Conversation

@hannesrudolph

Member

What Problem This Solves

OpenClaw's PR body gate is being narrowed to author-supplied problem context and validation evidence. ClawSweeper still owns the stronger real-behavior-proof decision, but its proof-nudge and dashboard code treated the old proof: supplied label as a proof state.

Why This Change Was Made

Retire proof: supplied from ClawSweeper's proof-nudge and dashboard policy. Proof nudges now depend on ClawSweeper's reviewed report and skip only proof: sufficient or proof: override, while replacement PR label copying also filters OpenClaw's new triage: needs-pr-context hygiene label.
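
A minimal sketch of the report-driven gate described above, for illustration only. The `ReviewedPullRequest` shape, the `reportNeedsProof` field, and the function name are hypothetical and are not taken from `src/clawsweeper.ts`; any other skip conditions the real code applies are omitted.

```typescript
// Hypothetical record shape; the real ClawSweeper types are not shown in this PR.
interface ReviewedPullRequest {
  labels: string[];
  // Assumed flag: ClawSweeper's reviewed report says real-behavior proof is still needed.
  reportNeedsProof: boolean;
}

// After this change, only these two labels suppress a proof nudge.
// "proof: supplied" is intentionally absent.
const PROOF_NUDGE_SKIP_LABELS = new Set<string>([
  "proof: sufficient",
  "proof: override",
]);

function shouldSendProofNudge(pr: ReviewedPullRequest): boolean {
  // The reviewed report, not a live label, decides whether a nudge is relevant.
  if (!pr.reportNeedsProof) return false;
  return !pr.labels.some((label) => PROOF_NUDGE_SKIP_LABELS.has(label));
}
```

Under these assumptions, a PR that carries only `proof: supplied` still receives a nudge when the report asks for proof, while `proof: sufficient` or `proof: override` suppresses it.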

User Impact

Maintainers keep one clear proof signal: proof: sufficient means ClawSweeper judged the exact-head proof acceptable. Contributor PR body context can improve review readability without suppressing proof nudges or changing ClawSweeper's sufficiency gate.

Evidence

@clawsweeper

clawsweeper Bot commented Jun 18, 2026

Contributor

Codex review: needs maintainer review before merge. Reviewed June 18, 2026, 4:26 PM ET / 20:26 UTC.

Summary
The PR removes proof: supplied from proof-nudge and dashboard proof-state handling, makes contributor proof nudges depend on the reviewed report rather than the old live needs-proof label, filters triage: needs-pr-context from replacement PR labels, and updates docs/tests.

Reproducibility: yes, for the current behavior being changed. Source inspection shows that current main skips nudges on proof: supplied and requires the live needs-proof label. This is not a runtime bug reproduction; it is a policy cleanup with a clear source path.

Review metrics: 2 noteworthy metrics.

  • Changed surface: 7 files changed, 26 additions, 56 deletions. The diff is small but spans implementation, dashboard policy, replacement-label filtering, tests, and docs.
  • Proof triage buckets: 1 proof dashboard view removed and 1 bucket renamed. Maintainers should notice the dashboard semantics change before merge.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🌊 off-meta tidepool
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
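
The "weaker of the two" rule can be sketched as follows. This is an illustration, not ClawSweeper's implementation: the rank order is taken from the crustacean legend in this comment, and treating 🌊 off-meta tidepool as "ignored when computing the overall rating" is an assumption inferred from this review, where proof is off-meta and the overall rating equals the patch-quality rating.

```typescript
// Ranks ordered from weakest to strongest, following the legend in this comment.
const RANKS = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;

type Rank = (typeof RANKS)[number];
type Rating = Rank | "off-meta tidepool";

function overallRating(proof: Rating, patchQuality: Rating): Rating {
  // Assumption: a rating that does not apply is skipped rather than treated as weakest.
  const applicable = [proof, patchQuality].filter(
    (rating): rating is Rank => rating !== "off-meta tidepool",
  );
  if (applicable.length === 0) return "off-meta tidepool";
  // Keep whichever applicable rating sits lower in the rank order.
  return applicable.reduce((weakest, rating) =>
    RANKS.indexOf(rating) < RANKS.indexOf(weakest) ? rating : weakest,
  );
}
```

With this sketch, `overallRating("off-meta tidepool", "platinum hermit")` yields `"platinum hermit"`, matching the ratings shown for this PR.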

Rank-up moves:

  • [P2] Consider a targeted dry-run against representative proof-blocked PR records before enabling or trusting scheduled proof nudges with the new policy.
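
One way to run such a dry-run is to evaluate both gates over the same records and list only the PRs whose decision changes. The sketch below is hypothetical: the record shape and function names are invented for illustration, and the assumption that current main also skips on proof: sufficient and proof: override is inferred from this review rather than read from the source.

```typescript
// Hypothetical record shape for a dry-run; field names are assumptions.
interface ProofBlockedRecord {
  number: number;
  labels: string[];
  reportNeedsProof: boolean;
}

const LIVE_NEEDS_PROOF = "triage: needs-real-behavior-proof";

function hasAny(labels: string[], wanted: string[]): boolean {
  return labels.some((label) => wanted.indexOf(label) !== -1);
}

// Gate on current main as described in this review: report plus live label,
// skipped when proof: supplied (or, assumed, sufficient/override) is present.
function oldNudgeGate(r: ProofBlockedRecord): boolean {
  return (
    r.reportNeedsProof &&
    hasAny(r.labels, [LIVE_NEEDS_PROOF]) &&
    !hasAny(r.labels, ["proof: supplied", "proof: sufficient", "proof: override"])
  );
}

// Gate proposed by this PR: report-driven, skipped only on sufficient/override.
function newNudgeGate(r: ProofBlockedRecord): boolean {
  return r.reportNeedsProof && !hasAny(r.labels, ["proof: sufficient", "proof: override"]);
}

interface DryRunDiff {
  number: number;
  before: boolean;
  after: boolean;
}

// Returns only the records whose nudge decision differs between the two gates.
function dryRunProofNudges(records: ProofBlockedRecord[]): DryRunDiff[] {
  return records
    .map((r) => ({ number: r.number, before: oldNudgeGate(r), after: newNudgeGate(r) }))
    .filter((diff) => diff.before !== diff.after);
}
```

Records that appear in the result are the ones maintainers would want to inspect before trusting scheduled nudges, for example PRs where the live needs-proof label was removed as a manual pause.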

Risk before merge

  • [P1] Merging changes contributor proof nudges from a live-label-and-report gate to a report-driven gate; if maintainers still use removal of triage: needs-real-behavior-proof as a pause signal, this would allow reminders unless another skip condition applies.
  • [P1] The proof dashboard will stop treating standalone proof: supplied as a proof-triage signal, so any old supplied-only labels will no longer appear in that dashboard bucket.

Maintainer options:

  1. Accept report-driven proof nudges (recommended)
    Merge once maintainers confirm that proof: supplied and the old live needs-proof label should no longer suppress report-backed proof reminders.
  2. Preserve a manual label pause
    If removing the live needs-proof label still needs to pause reminders, keep that label prerequisite while removing only proof: supplied from sufficiency-style states.

Next step before merge

  • [P2] Maintainer-authored draft PR changes proof automation policy, so the remaining action is human review of the intended label semantics rather than autonomous repair.

Security
Cleared: No concrete security or supply-chain concern was found; the patch changes proof-label policy, docs, and tests without new dependencies, permissions, secrets, or code download paths.

Review details

Best possible solution:

Land the focused cleanup only after maintainers confirm that ClawSweeper's reviewed report plus proof: sufficient or proof: override are the intended proof-nudge control signals.

Do we have a high-confidence way to reproduce the issue?

Yes for the current behavior being changed: source inspection shows current main skips nudges on proof: supplied and requires the live needs-proof label. This is not a runtime bug reproduction; it is a policy cleanup with a clear source path.

Is this the best way to solve the issue?

Yes, conditionally: the patch is the narrowest implementation if maintainers want the reviewed ClawSweeper report to own proof-nudge eligibility. If label removal should remain a manual pause, the safer path is to keep the live-label prerequisite and only retire proof: supplied.

AGENTS.md: found and applied where relevant.

Codex review notes: model internal, reasoning high; reviewed against a6ee02e323c1.

Label changes

Label changes:

  • add P2: This is a normal-priority proof automation policy fix with limited but real impact on maintainer PR review workflows.
  • add merge-risk: 🚨 automation: The diff changes when proof nudges and proof dashboard buckets fire, so a policy mismatch could produce incorrect proof automation behavior after merge.
  • add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🌊 off-meta tidepool and patch quality is 🐚 platinum hermit.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Not applicable: The real behavior proof gate does not apply because this is a maintainer-authored draft PR.

Label justifications:

  • P2: This is a normal-priority proof automation policy fix with limited but real impact on maintainer PR review workflows.
  • merge-risk: 🚨 automation: The diff changes when proof nudges and proof dashboard buckets fire, so a policy mismatch could produce incorrect proof automation behavior after merge.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🌊 off-meta tidepool and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Not applicable: The real behavior proof gate does not apply because this is a maintainer-authored draft PR.

Evidence reviewed

What I checked:

  • Repository policy read: AGENTS.md was read fully; its non-closeable maintainer-authored item rule and conservative automation-review guidance apply to this PR. (AGENTS.md:1, a6ee02e323c1)
  • Current proof-nudge behavior: Current main requires the live triage: needs-real-behavior-proof label and skips proof nudges when proof: supplied is present. (src/clawsweeper.ts:11665, a6ee02e323c1)
  • PR changes proof-nudge gates: The PR removes the live needs-proof label prerequisite and removes proof: supplied from contributor and bot proof-policy skip checks. (src/clawsweeper.ts:11659, 74ee66b8959c)
  • Current dashboard proof state: Current main includes proof: supplied in proof-dashboard label queries and exposes a separate supplied-awaiting-review state. (dashboard/worker.ts:133, a6ee02e323c1)
  • PR changes dashboard buckets: The PR removes proof: supplied from proof label discovery, removes the supplied-awaiting-review view, and renames the missing-proof bucket to needs-proof-review. (dashboard/worker.ts:133, 74ee66b8959c)
  • Replacement label behavior: Current main filters triage: needs-real-behavior-proof from replacement PR labels; the PR extends that filter to triage: needs-pr-context. (src/repair/replacement-labels.ts:48, a6ee02e323c1)
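
The replacement-label change in the last item can be illustrated with a short sketch. The constant and function names are hypothetical and do not come from `src/repair/replacement-labels.ts`; only the two filtered label names are taken from this review.

```typescript
// Labels that a replacement PR should not inherit from the PR it replaces.
// The first entry is already filtered on current main; the second is added by this PR.
const NON_INHERITED_TRIAGE_LABELS = new Set<string>([
  "triage: needs-real-behavior-proof",
  "triage: needs-pr-context",
]);

// Returns the labels to copy onto a replacement PR, preserving their original order.
function filterReplacementLabels(sourceLabels: readonly string[]): string[] {
  return sourceLabels.filter((label) => !NON_INHERITED_TRIAGE_LABELS.has(label));
}
```

For example, copying from a PR labeled `P2`, `triage: needs-pr-context`, and `proof: sufficient` would carry over only `P2` and `proof: sufficient` in this sketch.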

Likely related people:

  • brokemac79: GitHub PR history shows this person introduced proof-nudge reminders in "Add proof nudge reminders for PRs missing behavior proof" (#130) and the PR proof triage dashboard in "Add read-only PR proof triage dashboard" (#118). (role: original proof automation contributor; confidence: high; commits: 6920996b6ca1, f16768ab215d; files: src/clawsweeper.ts, docs/proof-nudges.md, dashboard/worker.ts)
  • steipete: GitHub commit history shows repeated recent proof-nudge hardening and dashboard work, and local blame at the release boundary attributes the current proof-nudge/dashboard lines to the v0.3.0 release commit. (role: recent proof-nudge and dashboard area contributor; confidence: high; commits: dc824915bb6c, 7c9edf1808b8, 2a4901a0cf9e; files: src/clawsweeper.ts, dashboard/worker.ts, docs/proof-nudges.md)
  • Dallin Romney: GitHub commit history for src/repair/replacement-labels.ts shows this person added the inherited replacement label filtering that this PR extends. (role: replacement label filter contributor; confidence: medium; commits: b1615df6647b; files: src/repair/replacement-labels.ts)

What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.
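
The command rules above can be summarized in a small classifier. This is a sketch under stated assumptions, not ClawSweeper's parser: the `Commenter` shape, the action names, and the regular expression are hypothetical, and a maintainer is assumed to be allowed everything a write-access user is.

```typescript
// Hypothetical commenter permissions; the real permission model is not shown here.
interface Commenter {
  isAuthor: boolean;
  hasWriteAccess: boolean;
  isMaintainer: boolean;
}

type CommandAction = "fresh-review" | "repair-or-merge" | "explain" | "stop" | "ignored";

// Matches the commands named in the workflow notes above.
const COMMAND_PATTERN =
  /@clawsweeper\s+(re-review|re-run|review|autofix|automerge|fix ci|address review|explain|stop)\b/i;

function classifyCommand(comment: string, who: Commenter): CommandAction {
  const match = COMMAND_PATTERN.exec(comment);
  if (!match) return "ignored";
  const command = (match[1] || "").toLowerCase();
  switch (command) {
    case "re-review":
    case "re-run":
      // Authors and users with write access may request a fresh review only.
      return who.isAuthor || who.hasWriteAccess || who.isMaintainer ? "fresh-review" : "ignored";
    case "review":
      return who.isMaintainer ? "fresh-review" : "ignored";
    case "autofix":
    case "automerge":
    case "fix ci":
    case "address review":
      // Repair and merge flows are maintainer-only and need an explicit command.
      return who.isMaintainer ? "repair-or-merge" : "ignored";
    case "explain":
      return who.isMaintainer ? "explain" : "ignored";
    case "stop":
      return who.isMaintainer ? "stop" : "ignored";
    default:
      return "ignored";
  }
}
```

The point of the split is that a fresh-review command never maps to a repair or merge action in this sketch, whoever posts it.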

@clawsweeper clawsweeper Bot added labels on Jun 18, 2026:

  • rating: 🐚 platinum hermit: Good normal PR readiness with ordinary maintainer review expected.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.
  • P2: Normal priority bug or improvement with limited blast radius.
  • merge-risk: 🚨 automation: 🚨 Merging this PR could break CI, automerge, proof capture, label sync, or automation.

@steipete steipete marked this pull request as ready for review June 19, 2026 11:38
@steipete steipete force-pushed the codex/decouple-pr-context-proof branch from 74ee66b to 825de0f Compare June 19, 2026 11:38
@steipete steipete merged commit bbdd097 into openclaw:main Jun 19, 2026
6 checks passed
@steipete

Contributor

Landed after maintainer policy approval and verification.

Proof:

  • Rebased onto current main; moved policy assertions into the focused test files introduced by "test: split prompt proof label policy coverage" (#327).
  • Added focused coverage for filtering triage: needs-pr-context from replacement PR labels.
  • Structured Codex autoreview against origin/main: clean, no accepted/actionable findings.
  • Local build, source/repair/script/dashboard lint, and 88 focused proof/dashboard/replacement-label tests: passed.
  • Fresh GitHub pnpm check, CodeQL, and Windows launcher checks: passed.

Landed commit: bbdd0971612ada081343d84c18304e2e87f6cae8
