From 2ab6df3c9b3e6f599e9735180eb00f9a31551d7b Mon Sep 17 00:00:00 2001 From: Baker B Date: Sun, 3 May 2026 20:00:50 -0400 Subject: [PATCH] fix(wavemachine): call installed binaries by PATH-resolvable names MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit PR #383 introduced a regression by replacing direct state.json edits with 'python3 -m wave_status ' — the Python module form, which only resolves inside cc-workflow's source tree. Same-shape relative-path bugs crept into ./scripts/generate-status-panel and scripts/mcp-log calls, making /wavemachine non-portable across projects since 2026-04-20. The installer was always correct: it builds wave-status (zipapp), generate-status-panel, and mcp-log into ~/.local/bin/ — all PATH-resolvable in any CC session. The skill bodies just stopped calling the right names. Fix: rewrite call sites in skills/wavemachine/SKILL.md and skills/nextwave/SKILL.md to use bare installed-binary names, add a pre-flight existence check that refuses to start /wavemachine if any of the three CLIs is missing, and add two regression tests under tests/regression/ that lock in the invariant going forward. scripts/wavebus/{wave-init,flight-finalize,wave-cleanup} references in nextwave have the same shape of bug but a different install layout (subdirectory not on PATH); deliberately out of scope for #569 and called out in the regression test header so the exclusion is explicit. 
Closes #569 Co-Authored-By: Claude Opus 4.7 (1M context) --- skills/nextwave/SKILL.md | 8 +- skills/wavemachine/SKILL.md | 57 ++++---- .../test_skill_uses_installed_binaries.sh | 137 ++++++++++++++++++ .../test_wavemachine_preflight_tools_check.sh | 82 +++++++++++ 4 files changed, 252 insertions(+), 32 deletions(-) create mode 100755 tests/regression/test_skill_uses_installed_binaries.sh create mode 100755 tests/regression/test_wavemachine_preflight_tools_check.sh diff --git a/skills/nextwave/SKILL.md b/skills/nextwave/SKILL.md index 432255e..01c1b40 100644 --- a/skills/nextwave/SKILL.md +++ b/skills/nextwave/SKILL.md @@ -28,7 +28,7 @@ CC sub-agents do not have the `Agent`/`Task` tool. Only the top-level session ca - CI trust (auto mode): `wave_ci_trust_level` - Bus primitives (bash): `scripts/wavebus/wave-init`, `scripts/wavebus/flight-finalize`, `scripts/wavebus/wave-cleanup` - Notifications: `mcp__disc-server__disc_send` — post wave status to `#wave-status` (`1487386934094462986`) -- Observability: `scripts/mcp-log` — emit lifecycle events to `~/.claude/logs/mcp.jsonl` for temporal correlation against sdlc-server `tool_call` events. See `docs/mcp-logging-standard.md`. +- Observability: `mcp-log` — emit lifecycle events to `~/.claude/logs/mcp.jsonl` for temporal correlation against sdlc-server `tool_call` events. See `docs/mcp-logging-standard.md`. ## Concepts @@ -171,7 +171,7 @@ Run this after `wave_complete()` lands and the bus has been cleaned (Step 5). This handles the multi-session case: `/prepwaves` may have run in a different session and its scrollback is gone. The recipe content lives in one place — both skills `cat` from the same file. Single-repo waves skip this step entirely. 2. Resolve **target repo slug** for the bus path. Same-repo waves: use the current repo's slug. Cross-repo waves (wave plan lives in this repo, stories live elsewhere): use the target repo's slug per `lesson_cross_repo_wave_orchestration.md`. -3. 
**Emit observability anchor.** Run `scripts/mcp-log wave_start wave= target= issues=` so this anchor *precedes* every per-issue `spec_validate_structure`, `wave_show`, and `wave_previous_merged` call below — that ordering is what makes post-mortem temporal correlation work. The `kahuna` field is added later (Step 1.5) once `wave_show` has been called; it is fine for the initial `wave_start` to omit it. +3. **Emit observability anchor.** Run `mcp-log wave_start wave= target= issues=` so this anchor *precedes* every per-issue `spec_validate_structure`, `wave_show`, and `wave_previous_merged` call below — that ordering is what makes post-mortem temporal correlation work. The `kahuna` field is added later (Step 1.5) once `wave_show` has been called; it is fine for the initial `wave_start` to omit it. 4. Verify main is clean in the target repo; `wave_previous_merged()` confirms prior wave landed; `spec_validate_structure(issue)` for each issue in the wave. 5. Call `scripts/wavebus/wave-init 1`. Flight count is `1` initially — Prime may re-invoke it with the real count (script is idempotent). Capture the printed wave root. 6. **Read `kahuna_branch` from wave state** via `wave_show()` (or by reading `.claude/status/state.json` in the target repo). If the field is present and non-empty, the wave is executing under KAHUNA — capture the value (e.g. `kahuna/-`) and pass it into the Prime(pre-wave) prompt as the `kahuna_branch` input. If absent or empty, the wave is a legacy non-KAHUNA execution — flights base off `main` as before. Pre-created worktree branches (Step 7, cross-repo path) and `pr_create` `base` (Step 3e) honor this same value. See Dev Spec §5.2.3 for the authoritative contract. @@ -211,7 +211,7 @@ Prompt template (fill in ``, ``, ``, the issue list, an Orchestrator parses that JSON. If `status=BLOCKED`, print the blocker from `plan.md`, post a BLOCKED notice to `#wave-status`, exit (leave bus in place for forensics). 
In **auto mode**, emit the canonical status line (see "Step 6 — Final status emission") before returning control. Else continue. -If `status=READY`, for each flight in the parsed `flights` array emit `scripts/mcp-log flight_plan wave= flight= issues=` so the planned partition is recorded in the fleet logfile. This is the temporal anchor for correlating sdlc-server tool_call events to a specific flight in post-mortem. **Important:** the array must be compact (no inner spaces) — pretty-printed arrays will be word-split by the shell into broken tokens. +If `status=READY`, for each flight in the parsed `flights` array emit `mcp-log flight_plan wave= flight= issues=` so the planned partition is recorded in the fleet logfile. This is the temporal anchor for correlating sdlc-server tool_call events to a specific flight in post-mortem. **Important:** the array must be compact (no inner spaces) — pretty-printed arrays will be word-split by the shell into broken tokens. ## Step 3 — Per-flight execution loop @@ -363,7 +363,7 @@ After every flight has merged: > ``` 2. Orchestrator reads the report, surfaces a human-readable summary, and — in interactive mode — prompts: "Wave N complete. Run `/nextwave` for Wave N+1, or pause here." -3. Emit `scripts/mcp-log wave_complete wave= status= merged= deferred=` so the wave terminal state is timestamped in the fleet logfile. +3. Emit `mcp-log wave_complete wave= status= merged= deferred=` so the wave terminal state is timestamped in the fleet logfile. 4. Post to `#wave-status` (`1487386934094462986`): `"✅ **Wave complete** — , merged, deferred. Agent: **** "`, then vox announcement (conversational: name, team, project, wave, counts). 5. In **auto mode**, emit the canonical status line (see "Step 6 — Final status emission") as the final assistant message. If Prime(post-wave) returned FAIL/BLOCKED, emit that status; otherwise emit OK. 
diff --git a/skills/wavemachine/SKILL.md b/skills/wavemachine/SKILL.md index be95a64..8a86388 100644 --- a/skills/wavemachine/SKILL.md +++ b/skills/wavemachine/SKILL.md @@ -36,12 +36,13 @@ description: Autopilot for wave-pattern execution. Runs a top-level loop that ca Before entering the loop: -1. **Plan exists.** Call `wave_show()`. If it returns no state / empty state, refuse: "No wave plan exists. Run `/prepwaves ` first." -2. **No other wave active.** Inspect `wave_show()`'s output — if `action` is `in-flight`, `planning`, or any active state, refuse: "Wave is already active (action: ). Let it finish or clear state before starting wavemachine." -3. **Base branch clean.** `git status --porcelain` returns nothing on the configured base branch. Any untracked/modified files → refuse and list them. -4. **Previous wave merged.** Call `wave_previous_merged()`. If the prior wave's work is not on main, refuse. -5. **At least one pending wave remains.** Call `wave_next_pending()`. If null, refuse: "No pending waves. Plan is complete — run `/dod` to verify." -6. **No concurrent wavemachine.** Read `.claude/status/state.json` — if `wavemachine_active` is already `true`, refuse: "Wavemachine is already running in this project. Wait for it to complete or abort first." +1. **Supporting CLIs on PATH.** **Run this check FIRST, before any MCP calls — if it fails, stop immediately and do not proceed to items #2–7.** Run `command -v wave-status generate-status-panel mcp-log` and verify all three resolve. If any is missing, refuse with a message that names every missing CLI individually: `"/wavemachine requires on PATH. Re-run claudecode-workflow's ./install to deploy supporting tooling."` Do NOT fall back to relative paths or Python-module invocation forms — they are not portable across projects, and silent fallback hides installer regressions (this check exists because of issue #569). +2. **Plan exists.** Call `wave_show()`. 
If it returns no state / empty state, refuse: "No wave plan exists. Run `/prepwaves ` first." +3. **No other wave active.** Inspect `wave_show()`'s output — if `action` is `in-flight`, `planning`, or any active state, refuse: "Wave is already active (action: ). Let it finish or clear state before starting wavemachine." +4. **Base branch clean.** `git status --porcelain` returns nothing on the configured base branch. Any untracked/modified files → refuse and list them. +5. **Previous wave merged.** Call `wave_previous_merged()`. If the prior wave's work is not on main, refuse. +6. **At least one pending wave remains.** Call `wave_next_pending()`. If null, refuse: "No pending waves. Plan is complete — run `/dod` to verify." +7. **No concurrent wavemachine.** Read `.claude/status/state.json` — if `wavemachine_active` is already `true`, refuse: "Wavemachine is already running in this project. Wait for it to complete or abort first." On any refusal: explain the failure, suggest the remediation, **do not enter the loop**. @@ -52,7 +53,7 @@ The `wavemachine_active` flag in `.claude/status/state.json` is the signal the s **On entry (after pre-flight passes, before the first iteration):** ``` -python3 -m wave_status wavemachine-start --launcher main +wave-status wavemachine-start --launcher main ``` Writes `wavemachine_active: true`, `wavemachine_started_at`, and `wavemachine_launcher=main` into `state.json`. The CLI guarantees atomic writes via `save_json`. Do **not** `Edit` `state.json` directly — always go through this CLI. @@ -60,7 +61,7 @@ Writes `wavemachine_active: true`, `wavemachine_started_at`, and `wavemachine_la **On EVERY exit path (happy completion, circuit breaker trip, per-wave BLOCKED/FAIL, user interrupt, tool denial, unexpected error):** ``` -python3 -m wave_status wavemachine-stop +wave-status wavemachine-stop ``` Clears `wavemachine_active` and the related metadata. Idempotent — safe to call even if already cleared. 
Treat this as a `finally` clause around the whole loop: no codepath out of `/wavemachine` may skip it. @@ -73,7 +74,7 @@ Once pre-flight passes: 2. **Regenerate and open the status panel — REQUIRED.** This step is **not optional** — the panel is the operator's primary visibility surface and MUST be open before the loop starts. Run the exact invocation: ```bash - ./scripts/generate-status-panel + generate-status-panel xdg-open .status-panel.html # Linux; substitute `open` on macOS ``` @@ -81,15 +82,15 @@ Once pre-flight passes: 3. **Detect CI trust** by calling `wave_ci_trust_level()` once. (The value is also cached by each `/nextwave auto` iteration; calling it here is informational — it shapes the start announcement.) 4. **Pre-wave kahuna bootstrap** (see "Pre-Wave Kahuna Bootstrap" below). Runs exactly once per Plan on first `/wavemachine` invocation. On resume invocations the wave state already carries `kahuna_branch` and this step is a no-op. 5. **Post to Discord.** `disc_send` to `#wave-status` (`1487386934094462986`): `"🌊 **Wavemachine started** — , waves pending. Agent: **** "`. Resolve identity from `/tmp/claude-agent-.json`. If `disc_send` fails, log and continue — Discord is informational, not a gate. -6. **Emit observability event.** `scripts/mcp-log wavemachine_start plan= waves= kahuna=` — timestamps autopilot start in the fleet logfile so post-mortem can correlate with sdlc-server tool_call events. +6. **Emit observability event.** `mcp-log wavemachine_start plan= waves= kahuna=` — timestamps autopilot start in the fleet logfile so post-mortem can correlate with sdlc-server tool_call events. ## Status Panel Lifecycle -`.status-panel.html` is a **point-in-time snapshot** of wave state, not a live dashboard. The generator (`scripts/generate-status-panel`) reads `.claude/status/{phases-waves,state,flights}.json`, renders a self-contained HTML file (inline CSS/JS, no external deps), and exits. 
There is **no** JavaScript polling, **no** WebSocket, **no** background refresher — the open browser tab will keep displaying the snapshot from whenever the file was last written until the operator hits Cmd-R / F5 (or the file is regenerated and the operator refreshes). +`.status-panel.html` is a **point-in-time snapshot** of wave state, not a live dashboard. The generator (`generate-status-panel`) reads `.claude/status/{phases-waves,state,flights}.json`, renders a self-contained HTML file (inline CSS/JS, no external deps), and exits. There is **no** JavaScript polling, **no** WebSocket, **no** background refresher — the open browser tab will keep displaying the snapshot from whenever the file was last written until the operator hits Cmd-R / F5 (or the file is regenerated and the operator refreshes). This is a deliberate design choice: a static file has zero runtime cost, no auth surface, and can be opened from anywhere — including after the wavemachine session has exited. The cost is that the panel is only as fresh as the last `generate-status-panel` invocation. -**Auto-regeneration policy.** To keep the panel close to live without introducing a polling loop, `/wavemachine` re-invokes `./scripts/generate-status-panel` at the two lifecycle events that change wave state most visibly: +**Auto-regeneration policy.** To keep the panel close to live without introducing a polling loop, `/wavemachine` re-invokes `generate-status-panel` at the two lifecycle events that change wave state most visibly: 1. **After every `wave_complete` event** — i.e. immediately after `/nextwave auto` returns OK and before the loop iterates. The wave's `done` action and any merged-MR metadata are now in `state.json`; regeneration reflects them. 2. **After every `wave_flight_done` event** — i.e. as flights complete within an in-flight wave. 
This is owned by `/nextwave auto`'s flight-completion handler (the loop body here does not see individual flight completions), so the regeneration call lives there. `/wavemachine` is responsible only for the post-`wave_complete` regeneration in its own loop body; the flight-grain regeneration is `/nextwave`'s contract. @@ -126,7 +127,7 @@ loop: 1. health = wave_health_check() if health.status != "HEALTHY": announce abort (see "On Circuit-Breaker Trip") - python3 -m wave_status wavemachine-stop + wave-status wavemachine-stop exit loop 2. next = wave_next_pending() @@ -149,7 +150,7 @@ loop: status JSON: {"status": "OK" | "BLOCKED" | "FAIL", "wave_id": "", ...} - - "OK" → run `./scripts/generate-status-panel` (fire-and-forget; + - "OK" → run `generate-status-panel` (fire-and-forget; auto-regen on wave_complete per "Status Panel Lifecycle"), then loop back to step 1 - "BLOCKED" → stop; announce abort with the blocker detail @@ -202,7 +203,7 @@ The loop exits cleanly when any of the following happens: - **Delete the kahuna branch** from the platform (per R-03). On GitHub: `gh api -X DELETE repos///git/refs/heads/` (or equivalent). On GitLab: `glab api -X DELETE projects/:id/repository/branches/`. - **Emit `#wave-status` notification** (R-19): `"✅ **Kahuna gate passed** — , Plan # auto-merged to main. flights, commits. Agent: **** "`. - **Vox announcement** (conversational, brief): name, team, project, "kahuna gate passed, Plan merged to main". - - Then fall through to the standard "On Clean Completion" announcement and `wave_status wavemachine-stop`. + - Then fall through to the standard "On Clean Completion" announcement and `wave-status wavemachine-stop`. 7. **Any-red path** (one or more signals fail): - Transition wave state `action` → `gate_blocked`, recording each failing signal's name + detail payload (so the dashboard's signal-failure detail block can render — §5.2.5). 
@@ -210,7 +211,7 @@ The loop exits cleanly when any of the following happens: - **Emit `#wave-status` notification** per Procedure C, §4.4.4: Plan name, each failing signal's name + short detail, kahuna branch name, the open kahuna→main MR URL. - **Vox announcement**: "Kahuna gate blocked for Plan . signals red. Ready for your review." - Call `wave_waiting("kahuna gate blocked: ")` so the plan is explicitly marked paused. - - `wave_status wavemachine-stop` and exit the loop. + - `wave-status wavemachine-stop` and exit the loop. ### PROBE_UNAVAILABLE handling @@ -271,9 +272,9 @@ Users can stop `/wavemachine` three ways: In every interrupt case: -- Run `python3 -m wave_status wavemachine-stop` immediately to clear the flag (the statusline must match reality). +- Run `wave-status wavemachine-stop` immediately to clear the flag (the statusline must match reality). - **Leave the in-flight wave's bus tree in place** (`/tmp/wavemachine//wave-/`). Do NOT call `wave-cleanup` on an interrupted wave — the partial state is forensic evidence for the human. -- Regenerate `.status-panel.html` synchronously before announcing so the attachment captures the interrupted state: `./scripts/generate-status-panel`. +- Regenerate `.status-panel.html` synchronously before announcing so the attachment captures the interrupted state: `generate-status-panel`. - Announce the interrupt to `#wave-status` (`1487386934094462986`) with the panel attached: `disc_send(channel_id="1487386934094462986", message="⏸ **Wavemachine interrupted** — , wave mid-flight, bus preserved at . Agent: **** ", attach_path=".status-panel.html")`. - Report to the user: which wave was interrupted, the bus root path, what was merged successfully before the interrupt, how to resume (re-run `/wavemachine` after reviewing the bus). 
@@ -291,7 +292,7 @@ Preserve the v1 announcement surface — one Discord post per lifecycle event, o The non-terminal events (`wavemachine-started`, intermediate `wave_complete` text posts) do NOT attach the HTML — they fire too frequently and the panel is more useful as the closing-frame snapshot. If `disc_send` fails to upload the attachment (network, permissions, Discord transient), the text portion still posts and the failure is logged — Discord is informational, never a gate. -Before each terminal-event `disc_send` that includes `attach_path`, make sure the file is fresh: re-invoke `./scripts/generate-status-panel` synchronously so the attachment captures the actual moment of completion/abort, not a stale snapshot from before the terminal transition. +Before each terminal-event `disc_send` that includes `attach_path`, make sure the file is fresh: re-invoke `generate-status-panel` synchronously so the attachment captures the actual moment of completion/abort, not a stale snapshot from before the terminal transition. **On loop start** (see "Launch Sequence" above): @@ -301,31 +302,31 @@ Before each terminal-event `disc_send` that includes `attach_path`, make sure th - This announcement runs AFTER the trust-score gate's all-green path (see "Trust-Score Gate and Auto-Merge"). In KAHUNA mode, the gate has already auto-merged kahuna→main and posted its own `✅ **Kahuna gate passed**` notification — this announcement closes out the wavemachine session. - In legacy non-KAHUNA mode (no `kahuna_branch` in wave state), the gate is skipped and this announcement runs directly when `wave_next_pending()` returns null. -- Regenerate `.status-panel.html` synchronously before posting so the attachment is current: `./scripts/generate-status-panel`. +- Regenerate `.status-panel.html` synchronously before posting so the attachment is current: `generate-status-panel`. 
- Discord `#wave-status`: `disc_send(channel_id="1487386934094462986", message="✅ **Wavemachine complete** — , all waves merged. Run /dod to verify. Agent: **** ", attach_path=".status-panel.html")` -- `scripts/mcp-log wavemachine_complete plan= status=OK waves_merged=` +- `mcp-log wavemachine_complete plan= status=OK waves_merged=` - Vox (conversational, brief): name, team, project, "wavemachine complete, all waves merged". **On gate-blocked completion** (KAHUNA mode, one or more trust signals failed): -- Regenerate `.status-panel.html` synchronously before posting so the attachment captures the gate-blocked state: `./scripts/generate-status-panel`. +- Regenerate `.status-panel.html` synchronously before posting so the attachment captures the gate-blocked state: `generate-status-panel`. - Per Procedure C / "Trust-Score Gate and Auto-Merge" any-red path: `disc_send(channel_id="1487386934094462986", message="🛑 **Kahuna gate blocked** — , Plan #: . MR open for review. Agent: **** ", attach_path=".status-panel.html")` - Vox alert: "Kahuna gate blocked for Plan . signals red. Ready for your review." -- `scripts/mcp-log --level warn wavemachine_complete plan= status=BLOCKED reason="kahuna gate blocked: "` +- `mcp-log --level warn wavemachine_complete plan= status=BLOCKED reason="kahuna gate blocked: "` - `wave_waiting("kahuna gate blocked: ")` so the plan is explicitly marked paused. **On circuit-breaker trip** (`wave_health_check` non-HEALTHY): -- Regenerate `.status-panel.html` synchronously before posting: `./scripts/generate-status-panel`. +- Regenerate `.status-panel.html` synchronously before posting: `generate-status-panel`. - Discord `#wave-status`: `disc_send(channel_id="1487386934094462986", message="🛑 **Wavemachine aborted (circuit breaker)** — : . 
Agent: **** ", attach_path=".status-panel.html")` -- `scripts/mcp-log --level error wavemachine_complete plan= status=ABORTED reason="circuit breaker: "` +- `mcp-log --level error wavemachine_complete plan= status=ABORTED reason="circuit breaker: "` - Call `wave_waiting("wavemachine aborted (circuit breaker): ")` so the plan is explicitly marked paused. **On per-wave BLOCKED or FAIL** (from `/nextwave auto` return): -- Regenerate `.status-panel.html` synchronously before posting: `./scripts/generate-status-panel`. +- Regenerate `.status-panel.html` synchronously before posting: `generate-status-panel`. - Discord `#wave-status`: `disc_send(channel_id="1487386934094462986", message="🛑 **Wavemachine aborted** — , wave : . Agent: **** ", attach_path=".status-panel.html")` -- `scripts/mcp-log --level error wavemachine_complete plan= status=ABORTED wave= reason=""` +- `mcp-log --level error wavemachine_complete plan= status=ABORTED wave= reason=""` - Call `wave_waiting("wavemachine aborted: ")`. **On user interrupt** (see "Interrupt Handling" above). @@ -396,7 +397,7 @@ See memory files `principle_user_attention_is_the_cost.md` and `principle_cost_a ## Non-Negotiables - **One plan at a time.** Pre-flight refuses to start if another wavemachine is active or another wave is in-flight. -- **`wavemachine_active` flag must always reflect reality.** Set on entry via `wave_status wavemachine-start`; unset on EVERY exit path via `wave_status wavemachine-stop`. No `Edit` to `state.json`. The statusline 🌊 indicator is not allowed to lie. +- **`wavemachine_active` flag must always reflect reality.** Set on entry via `wave-status wavemachine-start`; unset on EVERY exit path via `wave-status wavemachine-stop`. No `Edit` to `state.json`. The statusline 🌊 indicator is not allowed to lie. - **NEVER run the loop in a background sub-agent.** No background Agent invocation, ever — not with the `run_in_background` parameter, not shelled out, not via any other escape hatch. 
The loop is top-level, period. (The gate's `feature-dev:code-reviewer` Agent runs *synchronously* at the top level — not in the background.) - **NEVER spawn Flights or Prime directly.** `/nextwave auto` owns the Orchestrator/Prime/Flight protocol for each wave — `/wavemachine` only delegates wave work to it. - **Circuit breaker before every iteration.** `wave_health_check` is called at the TOP of each loop iteration, not just the first. diff --git a/tests/regression/test_skill_uses_installed_binaries.sh b/tests/regression/test_skill_uses_installed_binaries.sh new file mode 100755 index 0000000..993bf5a --- /dev/null +++ b/tests/regression/test_skill_uses_installed_binaries.sh @@ -0,0 +1,137 @@ +#!/usr/bin/env bash +# test_skill_uses_installed_binaries.sh — regression test for issue #569. +# +# /wavemachine and /nextwave SKILL.md must invoke their supporting tooling +# via PATH-resolvable names of the installed binaries: +# +# wave-status (zipapp at ~/.local/bin/wave-status) +# generate-status-panel (script at ~/.local/bin/generate-status-panel) +# mcp-log (script at ~/.local/bin/mcp-log) +# +# Forbidden patterns — these only resolve from inside the cc-workflow source +# tree and break the skill on every other project (regression introduced by +# PR #383, fixed by #569): +# +# python3 -m wave_status (Python module form — needs src/ on PYTHONPATH) +# ./scripts/generate-status-panel (relative path — needs cwd=cc-workflow root) +# scripts/generate-status-panel (same, no leading ./) +# scripts/mcp-log (relative path) +# wave_status (orphan: dropped python3 -m prefix) +# +# This test scans both skill bodies and fails if any forbidden pattern is +# present. It is the locked-in invariant that prevents the regression from +# coming back via a future "convenience" edit. 
+# +# Out of scope (DELIBERATELY NOT covered here): +# scripts/wavebus/{wave-init,flight-finalize,wave-cleanup} +# These have the same shape of regression (relative paths into cc-workflow's +# scripts/ tree) and appear in skills/nextwave/SKILL.md, but the wavebus +# install layout is different — those scripts ship under +# `~/.local/bin/wavebus/` (a subdirectory NOT directly on PATH), so a +# simple bare-name rename is insufficient. A separate issue should track +# either flattening the install layout or rewriting the call sites to use +# absolute paths. When that work happens, this test should grow a fifth +# pattern. Until then, do NOT add wavebus checks here — passing this test +# does NOT mean nextwave is fully portable, only that the three CLIs in +# scope (#569) are clean. + +set -uo pipefail + +REPO_DIR="$(cd "$(dirname "$0")/../.." && pwd)" +WAVEMACHINE="$REPO_DIR/skills/wavemachine/SKILL.md" +NEXTWAVE="$REPO_DIR/skills/nextwave/SKILL.md" + +FAILS=0 +fail() { + echo " [FAIL] $*" + FAILS=$((FAILS + 1)) +} +pass() { echo " [PASS] $*"; } + +echo "test_skill_uses_installed_binaries (#569)" +echo "──────────────────────────────────────────" + +for f in "$WAVEMACHINE" "$NEXTWAVE"; do + if [[ ! 
-f "$f" ]]; then + fail "skill body not found: $f" + continue + fi +done +[[ "$FAILS" -gt 0 ]] && exit 1 + +# Pattern 1: python3 -m wave_status — both files +for f in "$WAVEMACHINE" "$NEXTWAVE"; do + if grep -q "python3 -m wave_status" "$f"; then + fail "$(basename "$f"): contains forbidden 'python3 -m wave_status' (use 'wave-status' instead)" + grep -n "python3 -m wave_status" "$f" | sed 's/^/ /' + else + pass "$(basename "$f"): no 'python3 -m wave_status'" + fi +done + +# Pattern 2: ./scripts/generate-status-panel and scripts/generate-status-panel +for f in "$WAVEMACHINE" "$NEXTWAVE"; do + # Match scripts/generate-status-panel with or without leading ./ + if grep -qE "(\./)?scripts/generate-status-panel" "$f"; then + fail "$(basename "$f"): contains forbidden 'scripts/generate-status-panel' path (use 'generate-status-panel' instead)" + grep -nE "(\./)?scripts/generate-status-panel" "$f" | sed 's/^/ /' + else + pass "$(basename "$f"): no relative-path generate-status-panel" + fi +done + +# Pattern 3: scripts/mcp-log +for f in "$WAVEMACHINE" "$NEXTWAVE"; do + if grep -q "scripts/mcp-log" "$f"; then + fail "$(basename "$f"): contains forbidden 'scripts/mcp-log' path (use 'mcp-log' instead)" + grep -n "scripts/mcp-log" "$f" | sed 's/^/ /' + else + pass "$(basename "$f"): no 'scripts/mcp-log'" + fi +done + +# Pattern 4: orphan wave_status (underscore, used as a command — no python3 -m +# prefix, no surrounding identifier characters). Match the substring inside +# backticks where the skill bodies present commands, since plain prose like +# "the wave_status CLI" would false-positive otherwise. +# +# We look for two call-site forms: +# (a) backtick-inline: `wave_status ` (most common) +# (b) fenced-code-block: a line starting with `wave_status ` +# inside a ``` ... 
``` block (escape route #1 noticed in code review) +for f in "$WAVEMACHINE" "$NEXTWAVE"; do + # (a) `wave_status ` inside backticks + if grep -qE '`wave_status [a-z]' "$f"; then + fail "$(basename "$f"): contains backtick-bounded 'wave_status ' (use 'wave-status' with hyphen)" + grep -nE '`wave_status [a-z]' "$f" | sed 's/^/ /' + # (b) line starting with `wave_status ` (fenced code block content) + elif grep -qE '^wave_status [a-z]' "$f"; then + fail "$(basename "$f"): contains fenced-block 'wave_status ' (use 'wave-status' with hyphen)" + grep -nE '^wave_status [a-z]' "$f" | sed 's/^/ /' + else + pass "$(basename "$f"): no orphan wave_status command form" + fi +done + +# Positive checks — confirm the installed-binary names ARE present where +# expected. (Defensive: catches a "mass deletion" regression.) +if ! grep -qE '\bwave-status\b' "$WAVEMACHINE"; then + fail "wavemachine: no 'wave-status' references — did the skill lose its CLI calls entirely?" +fi +if ! grep -qE '\bgenerate-status-panel\b' "$WAVEMACHINE"; then + fail "wavemachine: no 'generate-status-panel' references" +fi +if ! grep -qE '\bmcp-log\b' "$WAVEMACHINE"; then + fail "wavemachine: no 'mcp-log' references" +fi +if ! grep -qE '\bmcp-log\b' "$NEXTWAVE"; then + fail "nextwave: no 'mcp-log' references" +fi + +echo "" +if [[ "$FAILS" -gt 0 ]]; then + echo " $FAILS failure(s)" + exit 1 +fi +echo " all checks passed" +exit 0 diff --git a/tests/regression/test_wavemachine_preflight_tools_check.sh b/tests/regression/test_wavemachine_preflight_tools_check.sh new file mode 100755 index 0000000..daa7195 --- /dev/null +++ b/tests/regression/test_wavemachine_preflight_tools_check.sh @@ -0,0 +1,82 @@ +#!/usr/bin/env bash +# test_wavemachine_preflight_tools_check.sh — regression test for issue #569. +# +# /wavemachine SKILL.md must instruct the agent to run a pre-flight existence +# check on its supporting CLIs (wave-status, generate-status-panel, mcp-log) +# and refuse to start if any is missing. 
This is the encoded form of BJ's
+# principle: "any script or binary that an MCP needs should be installable
+# somewhere any project can find and execute it."
+#
+# This test is intentionally a doc-shape check, not a runtime test — the
+# pre-flight is behavioural instruction the agent follows, not Bash code we
+# can directly execute. The check verifies the skill body still tells the
+# agent to do the right thing. If a future edit silently drops the check,
+# this test fails before merge.
+
+set -uo pipefail
+
+REPO_DIR="$(cd "$(dirname "$0")/../.." && pwd)"
+SKILL="$REPO_DIR/skills/wavemachine/SKILL.md"
+
+FAILS=0
+fail() {
+  echo "  [FAIL] $*"
+  FAILS=$((FAILS + 1))
+}
+pass() { echo "  [PASS] $*"; }
+
+echo "test_wavemachine_preflight_tools_check (#569)"
+echo "──────────────────────────────────────────"
+
+if [[ ! -f "$SKILL" ]]; then
+  fail "skill body not found: $SKILL"
+  exit 1
+fi
+
+# 1. The pre-flight section must contain a `command -v` probe for ALL THREE
+#    supporting CLIs in a single invocation (single command-v line — so a
+#    partial probe missing one tool wouldn't slip past).
+if grep -qE "command -v[^\"]*\bwave-status\b[^\"]*\bgenerate-status-panel\b[^\"]*\bmcp-log\b" "$SKILL"; then
+  pass "pre-flight has 'command -v wave-status generate-status-panel mcp-log' probe"
+else
+  fail "pre-flight missing combined 'command -v wave-status generate-status-panel mcp-log' probe"
+fi
+
+# 2. The pre-flight must explicitly mention refusing/aborting on missing CLIs.
+#    Tolerant phrasing: any of "refuse", "abort", "do not enter the loop", or
+#    similar anywhere in the pre-flight section. A flag variable is used here
+#    because an awk range ending at /^## / would also match the start heading
+#    and collapse the range to the heading line alone.
+if awk '
+  /^## / { in_section = ($0 ~ /^## Pre-Flight Checks/) }
+  in_section { print }
+' "$SKILL" | grep -qiE "refuse|abort|do not enter the loop"; then
+  pass "pre-flight section uses refuse/abort wording"
+else
+  fail "pre-flight section missing refuse/abort wording — agent may not know to halt"
+fi
+
+# 3.
The refusal message template must reference the cc-workflow installer, +# so the agent's error output points the operator at the fix. +if grep -qE "claudecode-workflow.*install|\./install" "$SKILL"; then + pass "refusal references the installer remediation path" +else + fail "no installer-remediation pointer — refusal won't tell operator how to fix" +fi + +# 4. Defensive: the skill must NOT advise PYTHONPATH hacks or 'pip install -e' +# as an alternative to using the installed binaries. The whole point of +# the fix is that the binaries on PATH are the contract. +if grep -qiE "PYTHONPATH=src|pip install -e.*wave_status" "$SKILL"; then + fail "skill suggests PYTHONPATH/pip-install-e workarounds — defeats the portable-CLI contract" +else + pass "skill does not suggest PYTHONPATH/pip-install-e workarounds" +fi + +echo "" +if [[ "$FAILS" -gt 0 ]]; then + echo " $FAILS failure(s)" + exit 1 +fi +echo " all checks passed" +exit 0
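For readers wiring the same guard into their own automation, the pre-flight probe that item #1 of the wavemachine checklist describes can be sketched as a small shell function. This is illustrative only: `preflight_tools_check` is a hypothetical name, not shipped cc-workflow code; in the skill itself the probe is behavioural instruction the agent follows, not a script.

```shell
# Hypothetical sketch (not shipped code) of the SKILL.md pre-flight probe.
# Collects every missing CLI so the refusal names them all individually,
# as item #1 of the pre-flight checklist requires.
preflight_tools_check() {
  local tool missing=()
  for tool in "$@"; do
    # command -v is the POSIX way to ask "does this name resolve on PATH?"
    command -v "$tool" >/dev/null 2>&1 || missing+=("$tool")
  done
  if ((${#missing[@]} > 0)); then
    echo "/wavemachine requires ${missing[*]} on PATH." \
         "Re-run claudecode-workflow's ./install to deploy supporting tooling." >&2
    return 1
  fi
  return 0
}
```

A caller would run `preflight_tools_check wave-status generate-status-panel mcp-log || exit 1` before any MCP calls, mirroring the check-first ordering the skill mandates; there is deliberately no fallback to relative paths or module invocation.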