Plan: Wave-pattern test harness foundation #641

@bakeb7j0

Goal

Ship a runnable nightly + on-demand test harness for the wave-pattern pipeline that exercises Tier 1 (unit) and Tier 4 (e2e campaign) coverage end-to-end across GitHub and GitLab fixtures, closing the campaign-as-test-of-last-resort gap surfaced by Plan #607 (Beta).

Scope

In scope

  • New repo Wave-Engineering/ccwork-testtarget with Python + pytest runner, fine-grained PAT auth, teardown helpers.
  • Tier 0 static checks: required CLIs on PATH, MCP tools/list snapshots, skill frontmatter parse, MEMORY.md integrity, WAVE_AXIOMS.md presence and structure.
  • Tier 1 unit wiring: existing bun test suite for each of four MCP servers (mcp-server-sdlc, mcp-server-discord, mcp-server-nerf, mcp-server-wtf); bus-script unit tests in tmpdir for wave-init, flight-finalize, changelog-aggregate, wave-cleanup; status-panel snapshot test.
  • Fixture-repo lifecycle: GitHub fixtures private under Wave-Engineering/; GitLab fixture under gitlab.com/testtarget/. Per-run-id-prefixed create + prefix-scoped teardown.
  • Tier 4 v0 end-to-end tests: 4.1 (single-flight, single-issue, GitHub); 4.8 (single-flight, single-issue, GitLab — parametrized 4.1 + platform assertions); 4.6 (cross-repo, GitHub → GitHub).
  • Telemetry replay tooling — given a failed Tier 4 run, reconstruct timeline from ~/.claude/logs/mcp.jsonl + bus state into a single forensic doc.
  • Runner modes: --keep-state (preserve bus dirs, worktrees, partial logs on failure) and --bisect <failure-tag> (re-run only the suspect step).
  • Nightly cron + Discord report posting to a dedicated #harness-test channel.
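The per-run-id-prefixed create and prefix-scoped teardown named in the fixture-repo lifecycle item could look roughly like the sketch below. It is an illustration only, not the harness's actual helper: the `harness-<run_id>-<label>` naming scheme and the names `new_run_id`, `fixture_name`, and `select_for_teardown` are assumptions made for this example. The idea it shows is that teardown matches on the full run prefix, so one run never deletes another run's fixtures or any unrelated repo.

```python
import uuid

HARNESS_PREFIX = "harness"  # assumed naming convention, not taken from the plan


def new_run_id() -> str:
    """Return a short random identifier for one harness run."""
    return uuid.uuid4().hex[:8]


def fixture_name(run_id: str, label: str) -> str:
    """Build a fixture repo name that carries the run-id prefix."""
    return f"{HARNESS_PREFIX}-{run_id}-{label}"


def select_for_teardown(repo_names: list[str], run_id: str) -> list[str]:
    """Return only the repos created by this run; all others are left alone."""
    # The trailing dash keeps run id "ab12" from matching "ab12cd34".
    prefix = f"{HARNESS_PREFIX}-{run_id}-"
    return [name for name in repo_names if name.startswith(prefix)]


existing = [
    "harness-ab12cd34-single-flight",
    "harness-ffff0000-single-flight",
    "docs",
]
print(select_for_teardown(existing, "ab12cd34"))
# → ['harness-ab12cd34-single-flight']
```

Keeping the selection step separate from the deletion step makes the scoping rule testable without touching GitHub or GitLab.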

Out of scope

Plan-level Definition of Done

Phases

Phase 1 — Tier 1 runtime

DoD:

  • Repo Wave-Engineering/ccwork-testtarget created (private), pytest skeleton in place, fine-grained PAT scoped to harness-prefixed names only
  • Tier 0 static checks run and pass: CLIs on PATH, MCP tools/list snapshot match, skill frontmatter parse, MEMORY.md integrity, WAVE_AXIOMS.md presence
  • Tier 1 wiring runs all four MCP server bun test suites and collects results
  • Bus scripts have unit tests in tmpdir (per lesson_destructive_test_homedir.md)
  • Status-panel snapshot test wired (regression for cc#631-class bugs)
  • Nightly cron + Discord posting to #harness-test works
  • Repo's own CI green on a freshly-cloned setup

Stories:

  • Story 1.1: Foundation — repo bootstrap + Python package skeleton (Wave-Engineering/ccwork-testtarget#1)
  • Story 1.2: MCPClient subprocess wrapper + JSON-RPC framing (Wave-Engineering/ccwork-testtarget#2)
  • Story 1.3: NotificationSubsystem + @-mention guardrail (Wave-Engineering/ccwork-testtarget#3)
  • Story 1.4: Structured event emission to mcp.jsonl (Wave-Engineering/ccwork-testtarget#4)
  • Story 1.5: Tier 0 static checks (Wave-Engineering/ccwork-testtarget#5)
  • Story 1.6: Tier 1 — MCP server bun test wiring (Wave-Engineering/ccwork-testtarget#6)
  • Story 1.7: Tier 1 — bus-script unit tests in tmpdir (Wave-Engineering/ccwork-testtarget#7)
  • Story 1.8: Tier 1 — status-panel snapshot test (Wave-Engineering/ccwork-testtarget#8)
  • Story 1.9: Tier 6 observability checks (Wave-Engineering/ccwork-testtarget#9)
  • Story 1.10: Tier-execution runner + --tier CLI (Wave-Engineering/ccwork-testtarget#10)
  • Story 1.11: GitHub Actions CI for harness helper-code unit tests (Wave-Engineering/ccwork-testtarget#11)
  • Story 1.12: Nightly cron deployment + state directories (Wave-Engineering/ccwork-testtarget#12)
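For Story 1.2, the JSON-RPC framing might be sketched as below. This assumes the MCP servers speak JSON-RPC 2.0 over stdio with one message per newline-terminated line; the plan does not state the transport, so treat that as an assumption. The function names `encode_request` and `decode_line` are hypothetical, and the sketch omits the subprocess management and the `initialize` handshake that a real client would perform before calling `tools/list`.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # monotonically increasing request ids


def encode_request(method: str, params: Optional[dict] = None) -> bytes:
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    # json.dumps escapes newlines inside strings, so the frame stays on one line.
    return json.dumps(message, separators=(",", ":")).encode("utf-8") + b"\n"


def decode_line(line: bytes) -> dict:
    """Parse one line read from the server's stdout into a message dict."""
    message = json.loads(line.decode("utf-8"))
    if not isinstance(message, dict) or message.get("jsonrpc") != "2.0":
        raise ValueError(f"not a JSON-RPC 2.0 message: {message!r}")
    return message


frame = encode_request("tools/list")
print(decode_line(frame)["method"])
# → tools/list
```

The Tier 0 `tools/list` snapshot check could then compare the decoded response against a stored snapshot file.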

Phase 2 — Tier 4 v0

DoD:

  • Fixture-repo lifecycle works for both GitHub and GitLab: per-run-id-prefixed create + prefix-scoped teardown
  • Test 4.1 (GitHub single-flight) passes end-to-end: kahuna branch created via wave_init, kahuna→main MR opened by wave_finalize, gate runs all 4 trust signals in a single tool-use block, pr_merge lands the kahuna→main MR, observability events captured in ~/.claude/logs/mcp.jsonl, status panel reflects terminal state, teardown clean
  • Test 4.8 (GitLab single-flight) passes: parametrized 4.1 + GitLab-specific assertions — glab adapter parity for pr_create / pr_merge / pr_status / pr_diff / pr_files / pr_wait_ci / pr_merge_wait; merge-train-warning emitted to Discord before pr_merge; approval rule scoped via protected_branch_ids permits the auto-merge; skip_train: true interpreted per platform (silent passthrough on GitLab; bypass-queue on GitHub)
  • Test 4.6 (cross-repo GitHub → GitHub) passes: pre-created worktrees in target repo, gh -R scoping on every command, no isolation: "worktree" flag misuse, kahuna branches in BOTH repos, both kahuna→main MRs land, wave-status state lives in master plan repo, worktree teardown unlocks before force-removal
  • Telemetry replay tooling produces a single forensic doc on a deliberately-broken run
  • --keep-state flag preserves bus dirs, worktrees, partial logs on failure
  • --bisect <failure-tag> mode re-runs only the suspect step
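The telemetry replay requirement, reconstructing one timeline from `~/.claude/logs/mcp.jsonl` for a failed run, could be approached along these lines. The event schema is not defined in this plan, so the field names `ts`, `event`, and `run_id` are hypothetical, as are the function names. The sketch also assumes timestamps are ISO-8601 strings in a single timezone and format, which sort correctly as plain strings. Merging in bus state is left out.

```python
import json
from pathlib import Path


def load_jsonl(path: Path) -> list[dict]:
    """Read one JSON object per non-blank line; skip lines that fail to parse."""
    events = []
    for raw in path.read_text(encoding="utf-8").splitlines():
        if not raw.strip():
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            # A crashed run may leave a truncated final line.
            continue
        if isinstance(record, dict):
            events.append(record)
    return events


def build_timeline(events: list[dict], run_id: str) -> str:
    """Render the events of one run as a chronologically ordered text timeline."""
    mine = [e for e in events if e.get("run_id") == run_id]
    mine.sort(key=lambda e: e.get("ts", ""))
    lines = [f"Forensic timeline for run {run_id}"]
    for e in mine:
        lines.append(f"{e.get('ts', '?')}  {e.get('event', '?')}")
    return "\n".join(lines)
```

A typical call would be `build_timeline(load_jsonl(Path.home() / ".claude/logs/mcp.jsonl"), run_id)`, with the result written into the forensic doc.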

Stories:

  • Story 2.1: FixtureLifecycle helper (GitHub per-run-id, GitLab long-lived) (Wave-Engineering/ccwork-testtarget#13)
  • Story 2.2: ForensicGenerator + KeepState helpers (Wave-Engineering/ccwork-testtarget#14)
  • Story 2.3: CostTracker + pricing.yml + budget overage detection (Wave-Engineering/ccwork-testtarget#15)
  • Story 2.4: Tier 4 driver — claude CLI subprocess invocation (Wave-Engineering/ccwork-testtarget#16)
  • Story 2.5: Tier 4 Test 4.1 — GitHub single-flight (Wave-Engineering/ccwork-testtarget#17)
  • Story 2.6: Tier 4 Test 4.8 — GitLab single-flight (Wave-Engineering/ccwork-testtarget#18)
  • Story 2.7: Tier 4 Test 4.6 — cross-repo (GitHub → GitHub) (Wave-Engineering/ccwork-testtarget#19)
  • Story 2.8: --bisect and --keep-state runtime mechanics (Wave-Engineering/ccwork-testtarget#20)
  • Story 2.9: Self-validation contract — red Tier 4 test for one Plan #607 bug (Wave-Engineering/ccwork-testtarget#21)
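The runner flags named in Stories 1.10 and 2.8 could be parsed as in the sketch below. Only the flag names `--tier`, `--keep-state`, and `--bisect <failure-tag>` come from the plan. The program name `harness`, the choice to make `--tier` an integer that may be repeated, and the help texts are assumptions for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the command-line parser for a hypothetical harness runner."""
    parser = argparse.ArgumentParser(prog="harness")
    parser.add_argument(
        "--tier",
        type=int,
        action="append",
        help="tier number to run; repeat the flag to run several tiers",
    )
    parser.add_argument(
        "--keep-state",
        action="store_true",
        help="preserve bus dirs, worktrees, and partial logs on failure",
    )
    parser.add_argument(
        "--bisect",
        metavar="FAILURE_TAG",
        default=None,
        help="re-run only the step associated with this failure tag",
    )
    return parser


args = build_parser().parse_args(["--tier", "4", "--keep-state", "--bisect", "pr_merge"])
print(args.tier, args.keep_state, args.bisect)
# → [4] True pr_merge
```

Here `pr_merge` stands in for a failure tag; the actual tag vocabulary is not specified in this plan.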

References

Labels

  • epic::626: Story belongs to Epic #626 (PM-layer thematic grouping)
  • priority::medium: Should do soon
  • type::plan: Plan tracking issue, top-level pipeline container
  • urgency::normal: No special time pressure
