Plan #607 (Beta) — Wave-pattern reliability + autonomy hardening (cc-workflow side) by bakeb7j0 · Pull Request #625 · Wave-Engineering/claudecode-workflow

bakeb7j0 · 2026-05-07T00:26:31Z

Plan #607 (Beta) — Wave-pattern reliability + autonomy hardening

13 cc-workflow issues across 5 phases, all merged into kahuna/607-beta. Companion sdlc-server kahuna→main MR for the 3 sdlc issues.

Companion

mcp-server-sdlc work for Plan #607 (Plan: Beta — Wave-pattern reliability + autonomy hardening, operational backlog campaign B): sdlc#415 (wave_finalize fallback), sdlc#414 (wave_wait_for_signal), sdlc#416 (pr_wait_ci short-circuit). These ship in a separate kahuna→main MR.

Closes #607

Step 3e prompt template gains an Exit shape section (placed LAST so it is the most recent context when the agent emits its final message). Section declares the canonical line shape verbatim, lists three concrete PASS/FAIL/BLOCKED examples, and enumerates forbidden phrases — including the exact narration "Sleep is still running. Let me wait for the notification." that broke the contract on Plan #581 wave-2 flight-1. Adds polling-loop discipline (do not narrate between sleep iterations) and references the canonical-line regex so the agent can self-check.

Regression coverage: TestPrimePostFlightCanonicalLineUnderLongCi in tests/test_nextwave_skill.py — 10 assertions over Exit shape position, verbatim shape, concrete examples, forbidden-phrase verbatim, polling discipline, Plan #581 reference, and regex citation.

Closes #606
Co-authored-by: Baker B <bakerb@waveeng.com>
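The self-check that the Exit shape section asks of the agent can be sketched roughly as follows. The PR text does not reproduce the canonical line shape or its regex, so `CANONICAL_LINE` below is a hypothetical stand-in (a status word followed by a one-line summary); only the forbidden phrase is quoted from the description above.

```python
import re
from typing import List

# Hypothetical stand-in for the canonical-line regex. The real pattern lives
# in the Step 3e prompt template and is not reproduced in this PR text.
CANONICAL_LINE = re.compile(r"^(PASS|FAIL|BLOCKED): \S.*$")

# Quoted verbatim from the PR description (Plan #581 wave-2 flight-1).
FORBIDDEN_PHRASES = (
    "Sleep is still running. Let me wait for the notification.",
)


def check_final_message(message: str) -> List[str]:
    """Return contract violations; an empty list means the message passes."""
    problems = []
    lines = [ln for ln in message.strip().splitlines() if ln.strip()]
    # The final non-empty line must be the canonical line.
    if not lines or not CANONICAL_LINE.match(lines[-1].strip()):
        problems.append("last line does not match the canonical shape")
    # No forbidden narration may appear anywhere in the message.
    for phrase in FORBIDDEN_PHRASES:
        if phrase in message:
            problems.append(f"forbidden phrase present: {phrase!r}")
    return problems
```

A check of this kind only looks at the final message; the polling-loop discipline (no narration between sleep iterations) would need a separate check over intermediate output.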

The /wavemachine outer loop sometimes stalls between waves: after `wave_complete` fires inside `/nextwave auto`, the agent occasionally emits non-canonical narrative text ("Wave N complete, starting wave N+1", etc.) instead of immediately invoking the next iteration's `wave_health_check`. This is "Bug B" from the Plan #581 campaign A debrief — distinct from the sub-agent-level Prime stall in that the OUTER LOOP itself stalls, not a delegated worker.

Root cause: skill prose. The loop body's step-4 OK-path enumerated side effects in narration-friendly form ("run X, then Y, then loop back"), inviting the agent to narrate between side-effect calls. The Stop hook with `decision:block` (already in `config/settings.template.json`) catches premature TERMINATION but not in-turn narration — it fires only when the agent attempts to end a turn.

Mitigation:

1. New "Wave-to-Wave Handoff" section in skills/wavemachine/SKILL.md binding the OK-path transition to a single tool-use boundary; the loop body's step-4 OK branch now defers to that section rather than enumerating side effects in prose.
2. Non-Negotiable rule added: inter-wave narration is forbidden; status-panel regen + discord-status-post + next-iteration wave_health_check ship as one tool-use block.
3. Doc-shape regression test `tests/regression/test_wavemachine_handoff_no_narrator.sh` asserts all load-bearing pieces (Handoff section, canonical wording, loop defers to it, Non-Negotiable, Stop hook + active flag in settings, no inter-wave announce/narrate instructions in loop body).

Validates against existing Stop hook semantics (`lesson_stop_hook_with_block.md`): the hook is the structural safety net for premature termination; this contract is the in-turn complement preventing the narration the hook cannot see.

Closes #600
Co-authored-by: Baker B <bakerb@waveeng.com>

Aggregates fragments from wave-1a flight-1 issues (#600, #606, #415). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Add a sandbox-cleanliness pre-flight at the top of /prepwaves: refuse if `git status --porcelain` is non-empty or HEAD is not the project's protected base branch. Refusal message lists every offending path plus the remediation menu (commit, stash, discard, checkout). --force-dirty override exists for legitimate edge cases and emits a noisy banner.

Rationale: Plan #581 sandbox cross-talk incident (2026-05-05). Another agent's uncommitted work in fix/377-wave-init-base-branch-persist (~394 lines) was sitting in the same checkout when /prepwaves ran and required hand-rolled patch-and-revert to recover.

Closes #603
Co-authored-by: Baker B <bakerb@waveeng.com>
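A minimal sketch of the pre-flight decision, written as a pure function over the output of `git status --porcelain` so it can be tested without a repository. The function name, message wording, and the choice to let `force_dirty` bypass both checks are assumptions; the PR only states that the override exists and prints a noisy banner.

```python
from typing import Tuple


def preflight(porcelain: str, head_branch: str, protected_branch: str,
              force_dirty: bool = False) -> Tuple[bool, str]:
    """Decide whether /prepwaves may proceed. Returns (ok, message).

    porcelain is the raw output of `git status --porcelain`.
    """
    # Porcelain v1 lines are "XY <path>": two status chars, a space, the path.
    dirty_paths = [line[3:] for line in porcelain.splitlines() if line.strip()]
    problems = []
    if dirty_paths:
        listing = "\n".join(f"  {path}" for path in dirty_paths)
        problems.append("uncommitted changes:\n" + listing)
    if head_branch != protected_branch:
        problems.append(
            f"HEAD is on {head_branch!r}, expected {protected_branch!r}")
    if not problems:
        return True, "sandbox clean"
    if force_dirty:
        # Assumption: the override bypasses both checks but stays noisy.
        return True, ("WARNING: --force-dirty set, proceeding despite:\n"
                      + "\n".join(problems))
    menu = "Remediation: commit, stash, discard, or checkout."
    return False, ("Refusing to run /prepwaves.\n"
                   + "\n".join(problems) + "\n" + menu)
```

In use, the caller would pass the captured `git status --porcelain` output and the current branch name, and print the message before exiting on a refusal.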

Add a self-commit step to /devspec approve Step 5: after the approval-metadata block is written by devspec_approve, stage the Dev Spec file (and any auxiliary finalize-track writes), refuse on the project's protected branch, then commit on the active feature branch with title 'docs(devspec): finalize Dev Spec for Plan #N — <slug>'. Push remains the operator's affirmative act.

Note in the /devspec finalize template that finalize is read-only and the commit lives in /devspec approve, since approve is the inflection point where on-disk writes occur and the doc transitions to finalized.

Closes #604
Co-authored-by: Baker B <bakerb@waveeng.com>

…utput (#612) Adds step 10 to the /prepwaves procedure: after persistence and confirmation, emit a paste-ready seed for a fresh /wavemachine session with a /clear recommendation. Includes a conditional downgrade to a hint when nerf_status reports <30% of soft dart used. Documents the rationale (Plan #581 debrief context-rot) in the skill body for future-rewrite preservation. Closes #602 Co-authored-by: Baker B <bakerb@waveeng.com>

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…613)

WAVE_AXIOMS.md: restructure 8 axioms into a consistent rule/why/how subsection layout, add Axiom 9 (user attention is the cost; autonomy is the protection). Axioms 1-8 numbering preserved; 9 is purely additive.

Wave-pattern skill bodies (/wavemachine, /nextwave, /prepwaves, /assesswaves): each now begins with a '## Axioms' H2 cross-reference block citing the axioms binding the skill and pointing at WAVE_AXIOMS.md. Inline justification prose that duplicated the axiom corpus has been replaced with cross-references — single source of truth, no more skill-body drift.

CLAUDE.md: bump 'eight axioms' to 'nine axioms' and add the Axiom 9 summary line so the load-bearing rules block matches the file it references.

Tests updated: test_wavemachine_skill.py and test_nextwave_structure.sh previously asserted direct cross-references to two memory files (principle_user_attention_is_the_cost.md and principle_cost_asymmetry_continue_vs_exit.md). The structural rework routes those memory files through the axiom corpus (Axiom 9 + the file's cross-reference table), so the tests now assert the WAVE_AXIOMS.md cross-reference and the top-of-file '## Axioms' block — these transitively cover both memory files via the corpus.

Closes #605
Co-authored-by: Baker B <bakerb@waveeng.com>

) Adds per-wave drift instrumentation and a system-reminder re-grounding mechanism to /wavemachine, so long campaigns (5+ waves, multi-hour wall-clock) no longer drift in agent behavior. "Bug C" from the Plan #581 campaign A debrief.

Mechanism (lightest of the three options the issue evaluated): at every wave_complete boundary inside the Wave-to-Wave Handoff tool-use block, the loop emits three drift-signal events (wave_message_length_main, wave_stop_hook_blocks, wave_concerns_posts) via a new scripts/wavemachine/drift-instrumentation.sh helper, plus a system-reminder payload referencing WAVE_AXIOMS.md and explicitly citing Axiom 9 (user attention as cost). Heavyweight options (mandatory /engage, /compact-on-N-waves) are documented as rejected alternatives held in reserve for empirical escalation.

The wiring is unconditional and mechanical (per Axiom 6 — agent does not add gates the user did not invoke). The system-reminder is out-of-band, so it does not violate the no-narrator-gap contract from cc-workflow#600.

Tests: 15 unit tests in tests/test_drift_instrumentation_skill.py covering script shape (executable, subcommands, self-test output, input validation), report subcommand aggregation, and SKILL.md wiring (section heading, three-event references, WAVE_AXIOMS + Axiom 9 citation, rejected-alternatives subsection, handoff block wiring, non-negotiable). Self-test produces compact JSON identical in shape to mcp-log output without touching the real fleet logfile.

Empirical baseline: a full A/B comparison (drift signals before/after mitigation on the same 6-wave plan) cannot run inside a single Flight context — Flights cannot run live /wavemachine campaigns. That is tracked as a follow-up empirical-comparison task; the first natural campaign of >=5 waves run after this lands provides the post-mitigation data, with Plan #581 campaign A as the pre-mitigation baseline.

Closes #601
Co-authored-by: Baker B <bakerb@waveeng.com>
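As a rough illustration of the three drift-signal events emitted at each wave_complete boundary, the sketch below builds one compact JSON line per signal. The event names come from the commit message; the payload fields (`wave`, `value`) and the function name are hypothetical, since the actual helper is a shell script whose output shape is not shown here.

```python
import json
from typing import List

# Event names are taken from the commit message; the payload fields
# ("wave", "value") are hypothetical.
DRIFT_EVENTS = (
    "wave_message_length_main",
    "wave_stop_hook_blocks",
    "wave_concerns_posts",
)


def drift_event_lines(wave: int, message_length: int,
                      stop_hook_blocks: int, concerns_posts: int) -> List[str]:
    """Return one compact JSON line per drift signal for a completed wave."""
    values = (message_length, stop_hook_blocks, concerns_posts)
    return [
        json.dumps({"event": name, "wave": wave, "value": value},
                   separators=(",", ":"))
        for name, value in zip(DRIFT_EVENTS, values)
    ]
```

Emitting all three unconditionally, with no branching on the values, matches the "unconditional and mechanical" wiring the commit describes.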

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Closes #579 Co-authored-by: Baker B <bakerb@waveeng.com>

…620) Closes #578 Co-authored-by: Baker B <bakerb@waveeng.com>

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds structured-event emission (call_start / call_complete / call_failed) to scripts/vox plus an EXIT trap for unknown_exit. Closes the observability gap where vox failures were invisible to the fleet log.

- Three event junctures: call_start (after arg parse), call_complete (success), call_failed (every failure path with stable reason enum).
- EXIT trap emits call_failed reason=unknown_exit if vox exits non-zero without a covered path having logged — guarantees no silent failure.
- Pure-bash JSON-line appender (same wire format as docs/mcp-logging-standard.md) avoids the ~55ms-per-call jq subprocess cost of shelling out to mcp-log; instrumented overhead measured ~1ms vs baseline.
- Stable reason enum: provider_missing, provider_failed, player_missing, player_failed, env_missing, network_failed, bad_args, unknown_exit.
- VOX_DISABLED=1 path skips event emission (issue spec explicitly permits) so the no-op mode stays a no-op.
- Behavior preserved: exit codes, stderr passthrough, audio output all unchanged. Provider stderr still streams to the user's terminal via tee while being captured for reason-classification.

Pairs with #550 (precheck-skill vox-failure logging — the complementary "vox didn't run at all" half).

Closes #551
Plan: #607
Co-authored-by: Baker B <bakerb@waveeng.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
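The event contract described above (three junctures, a closed reason enum, and an exit-time fallback) can be modelled in a few lines. This is a sketch only: the real implementation is a pure-bash appender inside scripts/vox, and the record fields used here (`tool`, `event`, `reason`) are illustrative rather than the wire format from docs/mcp-logging-standard.md.

```python
import json
from typing import List, Optional

# Reason enum as listed in the commit message.
REASONS = frozenset({
    "provider_missing", "provider_failed", "player_missing", "player_failed",
    "env_missing", "network_failed", "bad_args", "unknown_exit",
})
TERMINAL_EVENTS = frozenset({"call_complete", "call_failed"})


class VoxEventLog:
    """Collects JSON lines in memory; a real appender would write to a log file."""

    def __init__(self) -> None:
        self.lines: List[str] = []
        self._terminal_logged = False

    def emit(self, event: str, reason: Optional[str] = None) -> None:
        if event == "call_failed":
            if reason not in REASONS:
                raise ValueError(f"unknown reason: {reason!r}")
        elif event in ("call_start", "call_complete"):
            if reason is not None:
                raise ValueError("reason is only valid on call_failed")
        else:
            raise ValueError(f"unknown event: {event!r}")
        record = {"tool": "vox", "event": event}  # illustrative fields
        if reason is not None:
            record["reason"] = reason
        self.lines.append(json.dumps(record, separators=(",", ":")))
        if event in TERMINAL_EVENTS:
            self._terminal_logged = True

    def on_exit(self, exit_code: int) -> None:
        """Mirror of the EXIT trap: log unknown_exit only if nothing else did."""
        if exit_code != 0 and not self._terminal_logged:
            self.emit("call_failed", "unknown_exit")
```

The `_terminal_logged` flag is what keeps the fallback from double-reporting a failure that a covered path already classified.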

Rewrite the /dod skill so it resolves the Plan issue and calls plan_load_dod against the Plan-issue body (the canonical, frozen tracking artifact per the Plan/Phase/Epic taxonomy lock) instead of parsing docs/*-devspec.md directly.

Devspec fall-through now happens in two narrow cases: (1) when plan_load_dod returns a devspec_path AND the Plan body's checklist references the Deliverables Manifest / VRTM, and (2) legacy mode — when no Plan is resolvable AND a Dev Spec exists, the skill drops into the pre-taxonomy verification with a banner notice.

Plan-id resolution order: explicit /dod check <N> argument → current branch matches kahuna/(\d+)- → most recent PR/MR with Plan: #N → clean error. plan_not_found and plan_body_invalid surface as one-line actionable messages, never stack traces.

Closes #577
Plan: #607
Co-authored-by: Loomweaver <brbaker@analogic.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
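The Plan-id resolution order can be sketched as a small function. The branch pattern `kahuna/(\d+)-` and the `Plan: #N` trailer are taken from the commit message; the function name, the `PlanNotFound` exception, and the shape of the inputs (an optional argument string and a most-recent-first list of PR/MR bodies) are assumptions made for illustration.

```python
import re
from typing import Optional, Sequence

KAHUNA_BRANCH = re.compile(r"kahuna/(\d+)-")   # pattern from the commit message
PLAN_TRAILER = re.compile(r"Plan: #(\d+)")     # trailer from the commit message


class PlanNotFound(Exception):
    """Hypothetical error type carrying a one-line actionable message."""


def resolve_plan_id(arg: Optional[str], branch: str,
                    recent_pr_bodies: Sequence[str]) -> int:
    """Resolve the Plan id: explicit argument, then branch, then PR/MR trailer."""
    # 1. Explicit `/dod check <N>` argument wins.
    if arg is not None:
        digits = arg.lstrip("#")
        if not digits.isdigit():
            raise PlanNotFound(f"plan_not_found: {arg!r} is not a Plan number")
        return int(digits)
    # 2. Current branch of the form kahuna/<N>-<slug>.
    match = KAHUNA_BRANCH.match(branch)
    if match:
        return int(match.group(1))
    # 3. Most recent PR/MR carrying a `Plan: #N` trailer (list is newest first).
    for body in recent_pr_bodies:
        match = PLAN_TRAILER.search(body)
        if match:
            return int(match.group(1))
    # 4. Clean, one-line error instead of a stack trace.
    raise PlanNotFound(
        "plan_not_found: pass /dod check <N> or run from a kahuna/<N>- branch")
```

A caller would catch `PlanNotFound` and print its message as a single line, which is the behaviour the commit describes for plan_not_found.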

Retire the legacy non-KAHUNA execution path from all wave-pattern skills. Kahuna sandbox is the only execution shape: every Plan bootstraps a kahuna_branch at /wavemachine launch; every Flight PR targets that branch; the four-signal trust gate at Plan completion is the sole path to the project's protected branch.

Skill changes:

- skills/wavemachine/SKILL.md: strip 'KAHUNA mode' / 'legacy non-KAHUNA' framing; gate is unconditional; abstract 'kahuna→main' to 'kahuna→protected-branch'; add Migration Note.
- skills/nextwave/SKILL.md: kahuna_branch is required (refuse if missing, no fallback); Flight stub directive is unconditional; pr_create base is always kahuna_branch; cross-repo recipe / Flight stub use abstract phrasing for the protected branch.
- skills/_shared/recipes/cross-repo-wave-orchestration.md: same edits as the recipe inline in nextwave/SKILL.md.
- skills/devspec/SKILL.md: protected-branch resolution refuses on missing .claude-project.md instead of silently defaulting to main.
- skills/{prepwaves,assesswaves}/SKILL.md: no edits required (already abstract; no mode-dependent output).

Regression check:

- scripts/ci/check-no-classic-mode.sh: greps wave-pattern skill surface for retired prose patterns ('Classic mode', 'legacy non-KAHUNA', 'KAHUNA mode', 'fall back to main', '--base main', etc.).
- tests/regression/test_no_classic_mode.sh: wrapper invoked by scripts/ci/validate.sh's regression-tests pass.

Test updates:

- tests/test_nextwave_skill.py: TestAC4 retired (legacy non-KAHUNA path no longer exists); replaced with TestAC4_KahunaIsUnconditional asserting the new contract (refuse on missing kahuna_branch, no fallback wording, origin/<kahuna_branch> unconditional).

Deferred follow-up:

- Manual end-to-end on a release/<ver>-protected AnalogicDev project to confirm the integration target resolves correctly. Not feasible inside a Flight; tracked as a Plan #607 follow-up.

Closes #580
Plan: #607
Co-authored-by: Baker B <bakerb@waveeng.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
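The retired-prose regression check amounts to scanning the skill files for a fixed set of phrases. The sketch below shows that idea over an in-memory mapping of path to text. The five patterns are the ones named in the commit message (which ends its list with "etc.", so the real script likely checks more); the function name and the case-sensitive substring matching are assumptions, and the actual check is a shell script built on grep.

```python
from typing import Dict, List, Tuple

# Patterns named in the commit message; the real list is longer ("etc.").
RETIRED_PATTERNS = (
    "Classic mode",
    "legacy non-KAHUNA",
    "KAHUNA mode",
    "fall back to main",
    "--base main",
)


def find_retired_prose(files: Dict[str, str]) -> List[Tuple[str, int, str]]:
    """Return (path, line number, pattern) for every retired phrase found."""
    hits = []
    for path, text in sorted(files.items()):
        for lineno, line in enumerate(text.splitlines(), start=1):
            for pattern in RETIRED_PATTERNS:
                if pattern in line:  # assumption: case-sensitive substring match
                    hits.append((path, lineno, pattern))
    return hits
```

A CI wrapper would fail the build whenever the returned list is non-empty and print each hit so the offending line is easy to locate.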

Replace the implicit `vox ... || true` swallow shape with an instrumented pattern that captures rc + stderr and emits a `vox_invocation_failed` event to mcp.jsonl on non-zero exit. The `vox` ALWAYS-called rule and best-effort semantics are unchanged - vox failure does not block the precheck gate. The skill now documents the canonical bash pattern and a checklist status line that distinguishes vox success vs. failure visually.

Pairs with cc-workflow#551 (vox-script-side instrumentation):

- vox_invocation_failed (this PR) — vox didn't run / returned non-zero
- call_failed (cc#551) — vox ran but provider/player failed

Closes #550
Plan: #607
Co-authored-by: Baker B <bakerb@waveeng.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
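The instrumented best-effort pattern (capture the return code and stderr, log on failure, never block) can be sketched as below. The skill itself documents a bash pattern that is not reproduced in this PR text, so this Python version is only an analogue: the function name, the event fields other than the `vox_invocation_failed` name, and the use of return code 127 for a missing binary are assumptions.

```python
import subprocess
import sys
from typing import Callable, Dict, Sequence


def run_vox_best_effort(cmd: Sequence[str],
                        emit: Callable[[Dict], None]) -> int:
    """Run cmd; on non-zero exit pass a vox_invocation_failed event to emit.

    Never raises on command failure, so the caller's gate is not blocked.
    """
    try:
        proc = subprocess.run(list(cmd), capture_output=True, text=True)
        rc, stderr = proc.returncode, proc.stderr
    except OSError as exc:
        # The command could not be started at all (e.g. not on PATH).
        rc, stderr = 127, str(exc)
    if rc != 0:
        emit({
            "event": "vox_invocation_failed",  # event name from the PR text
            "rc": rc,                          # illustrative field names
            "stderr": stderr.strip(),
        })
    return rc


if __name__ == "__main__":
    # Demo with a stand-in command that fails; a real caller would invoke vox
    # and append the event to the fleet log instead of printing it.
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    run_vox_best_effort(failing, print)
```

Passing `emit` as a callable keeps the logging destination out of the wrapper, so the same function works whether events go to a file, a list in a test, or standard output.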

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

bakeb7j0 and others added 18 commits May 6, 2026 18:34

chore(changelog): aggregate wave-1a fragments

c389194

Aggregates fragments from wave-1a flight-1 issues (#600, #606, #415). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

chore(changelog): aggregate wave-2a fragments

115b040

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

chore(changelog): aggregate wave-3a fragments

4855b49

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

feat(wave-skill): /wave routing skill wraps wave_show (#619)

f609e8a

Closes #579 Co-authored-by: Baker B <bakerb@waveeng.com>

feat(wave-watcher): standalone daemon aggregating wave-pattern state (#…

1a5a2af

…620) Closes #578 Co-authored-by: Baker B <bakerb@waveeng.com>

chore(changelog): aggregate wave-4a fragments

390acea

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

chore(changelog): aggregate wave-5a fragments

589a785

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

bakeb7j0 added this pull request to the merge queue May 7, 2026

Merged via the queue into main with commit dc25594 May 7, 2026
1 check passed

bakeb7j0 merged 18 commits into main from kahuna/607-beta