Add structured diagnostics and validation clarity by bma-d · Pull Request #21 · bmad-code-org/bmad-automator

bma-d · 2026-05-22T03:00:50Z

Summary

Draft PR for the observability and validation work tracked under issue #5.

This branch makes automator failures easier to diagnose across parser, state, agent-plan, and tmux/session boundaries while preserving successful CLI behavior and legacy output fields. The implementation follows the observability-validation plan: small typed/domain seams, additive diagnostics, focused tests, and no broad OO rewrite.

What Changed

Diagnostics contract

Added a shared diagnostics contract with stable serialized issue fields:
- type
- field
- expected
- actual
- message
- recovery
- code
- severity
- source
Added diagnostic serialization, legacy-message helpers, exception wrapping, and redaction behavior.
Kept command output compatibility by adding structuredIssues only beside legacy fields where applicable.

State validation and transitions

Added structured state validation for frontmatter/runtime-policy problems.
Preserved existing validate-state fields: ok, structure, and legacy issues strings.
Added structuredIssues and issueCount to validate-state.
Added allowed status transition checks for state-update.
Preserved recovery behavior: invalid legacy statuses can now be repaired to a valid status, while invalid attempted statuses are rejected.

Parser and verifier contracts

Moved parse schema validation into a focused contract helper.
Added field-path diagnostics for missing keys, wrong nested types, enum mismatches, empty strings, and invalid path/null fields.
Preserved successful parse payloads exactly.
Preserved legacy parse failure reason values and added structuredIssues on failures.
Added structured diagnostics to verifier contract failures where command outputs already carry errors.

Agent plan and complexity validation

Added complexity payload and agents-plan payload validation.
agents-build now fails early with structured diagnostics for malformed complexity/config payloads.
agents-resolve validates only the requested story/task path for direct resolution, preserving compatibility with partial user-authored agents files.
Retro agent resolution now routes through shared core agent-config helpers to avoid behavior drift.
Existing fallback normalization and retro overrides are preserved.

Session/runtime diagnostics

Added diagnostic-aware session-state loading in a focused core/session_state.py module.
Preserved legacy load_session_state() behavior returning {} for missing/unreadable/invalid/non-object state.
monitor-session --json now reports malformed persisted session-state diagnostics when relevant.
Fixed the Claude/auto legacy path so malformed state diagnostics are captured before legacy cleanup can delete the corrupt file.
CSV status commands remain unchanged.

CLI/tmux contract fixes

Added CLI contract coverage for parse-story, parse-story-range, agent config presets, tmux naming, project scoping, and wrapper behavior.
Fixed parse-story-range invalid totals to return stable JSON rather than raising before the error handler.
Fixed parse-story error reporting for missing rules versus invalid rules JSON.
Hardened agent-config preset file errors.
Added hash-scoped tmux session names to avoid same-basename worktree collisions.
Fixed tmux-wrapper name --cycle N to use the cycle value, not the flag token.
Preserved the legacy positional cycle argument form: tmux-wrapper name review 5 5.3 2.
Made kill-all project-scoped by default, with --all-projects as explicit opt-in.

Docs and plan hygiene

Updated operator docs for structured diagnostics, monitor-session JSON behavior, tmux session naming, and recovery hints.
Fixed crash-recovery hash snippets to use runtime helper commands for project-slug and project-hash.
Split older observability handoff entries into an archive file so the active handoff log stays under the repo’s soft file-size guidance while retaining clean-context continuity.

Compatibility Notes

validate-state keeps legacy issues: list[str] and adds structuredIssues plus issueCount.
Parse success output remains unchanged.
Parse failure output keeps legacy status: "error" and reason fields.
monitor-session --json preserves existing fields and adds structuredIssues only when relevant.
CSV status commands preserve exact output shape.
Direct agents-resolve remains compatible with partial plans containing the requested story/task.
Legacy tmux positional cycle naming remains supported.

Review Findings Addressed

This branch includes fixes from a staged subagent review loop:

Fixed lost malformed session-state diagnostics on the Claude/auto monitor path.
Fixed invalid legacy statuses being impossible to repair via state-update.
Fixed agents-resolve rejecting otherwise resolvable partial direct agents files.
Fixed non-portable crash-recovery project-hash docs.
Reduced duplicated retro-agent config resolution by routing through core helpers.
Extracted session-state diagnostics from the oversized tmux runtime module.
Split the oversized observability handoff log.
Fixed dropped legacy positional cycle support for tmux-wrapper name.
Removed duplicate monitor-session test coverage.
Ensured required new files are tracked.

Latest review baseline: P0/P1 clean.

Verification

Commands run locally:

git diff --check
python3 -m compileall -q skills/bmad-story-automator/src/story_automator
PYTHONPATH=skills/bmad-story-automator/src python3 -m unittest tests.test_state_validation tests.test_agent_plan tests.test_diagnostics_e2e tests.test_tmux_runtime tests.test_retro_agent tests.test_state_policy_metadata tests.test_cli_contracts
PYTHONPATH=skills/bmad-story-automator/src python3 -m unittest discover -s tests
npm run verify

Results:

Focused tests passed.
Full Python suite passed: 263 tests OK.
npm run verify passed.
Smoke test passed with known optional bmad-qa-generate-e2e-tests warnings only.

Known Gaps / Risks

No live external LLM/tmux integration E2E was added; coverage is local command/fixture based.
Some pre-existing modules remain above the soft file-size target, especially core/tmux_runtime.py and core/runtime_policy.py.
Smoke test still warns when optional QA generation skills are not installed; this is existing non-blocking behavior.

Refs #5

Summary by CodeRabbit

New Features
- Add structured, field-specific diagnostics alongside legacy error fields; opt-in JSONL diagnostic events
- Validate and block invalid orchestration status transitions before write
- Monitor-session JSON mode surfaces persisted session-state diagnostics; session naming gains project scoping
- Agent-plan/complexity validators with precise field-path errors
Bug Fixes
- Better detection/reporting of malformed persisted session state, parse failures, and agent-plan issues
- Redaction of sensitive values in legacy error outputs
Documentation
- Expanded observability/validation plans, CLI reference, troubleshooting, and runbooks
Tests
- New and expanded unit and E2E tests for diagnostics, state validation, agent plans, CLI contracts, and monitoring

coderabbitai · 2026-05-22T03:00:57Z

Caution

Review failed

Failed to post review comments

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Walkthrough

Adds structured diagnostics primitives, parse-contract/state/agent-plan validation with field-path diagnostics, session-state diagnostic loading, CLI/tmux wiring that preserves legacy fields while adding structuredIssues, monitoring emit helper, agent-config/frontmatter parsing, and extensive unit and E2E tests and documentation.

Changes

Observability and Validation Infrastructure

Layer / File(s)	Summary
Diagnostics core `skills/bmad-story-automator/src/story_automator/core/diagnostics.py`, `tests/test_diagnostics.py`	Defines `DiagnosticIssue`/`DiagnosticEvent`, serialization, redaction, exception-to-issue helpers, and newline-delimited JSON event emission via `STORY_AUTOMATOR_DIAGNOSTICS_FILE`.
Parse contract validation `skills/bmad-story-automator/src/story_automator/core/parse_contracts.py`, `skills/bmad-story-automator/src/story_automator/core/review_verify.py`	Adds parse-contract loader/validator, `validate_payload`, `ParseContractError`, parse failure payload helpers, and wires standardized verifier exception payloads into verification flows.
State validation & transitions `skills/bmad-story-automator/src/story_automator/core/state_validation.py`, `skills/bmad-story-automator/src/story_automator/commands/orchestrator_state.py`, `tests/test_state_validation.py`	Centralized state field validation, `VALID_STATUSES`/`ALLOWED_STATUS_TRANSITIONS`, status-transition checks, `--set` parsing helpers, `status_transition_error_payload` with redaction, and structured/legacy-compatible response shaping.
Session-state diagnostics `skills/bmad-story-automator/src/story_automator/core/session_state.py`, `skills/bmad-story-automator/src/story_automator/core/tmux_runtime.py`, `tests/test_tmux_runtime.py`	Typed session-state loader with diagnostic categories (missing/unreadable/invalid/schema mismatch); `load_session_state()` preserves legacy `{}` behavior; helper to serialize session-state issue for `monitor-session --json`.
Agent plan validation & resolver `skills/bmad-story-automator/src/story_automator/core/agent_plan.py`, `tests/test_agent_plan.py`	Validates complexity and agents-plan payloads, loaders for embedded JSON blocks, build/resolve APIs, resolution-time validation for (story,task), and standardized `agent_plan_error` shapes.
AgentConfig frontmatter parser `skills/bmad-story-automator/src/story_automator/core/agent_config_frontmatter.py`	Parses `agentConfig:` frontmatter (top-level only) supporting inline maps and two-space indentation blocks; validates formatting and returns normalized dict.
Agent configuration core `skills/bmad-story-automator/src/story_automator/core/agent_config.py`	Centralized agent-config parsing/validation, deterministic frontmatter rendering, `AGENT_TASKS`/`AGENT_COMPLEXITY_LEVELS`, and delegation wrappers to agent_plan (including exposure of AgentPlanInputError).
Monitoring emit helper `skills/bmad-story-automator/src/story_automator/core/monitoring.py`	Adds `emit_monitor_result()` to emit `session.lifecycle.result` events and print JSON or CSV, optionally including `structuredIssues`.
CLI input & state command wiring `skills/bmad-story-automator/src/story_automator/cli.py`, `skills/bmad-story-automator/src/story_automator/commands/state.py`	Adds file existence/JSON decode checks for parse-story, delegates `cmd_validate_state` to core validators and returns `state_validation_payload`, and uses `render_agent_config_frontmatter` in `cmd_build_state_doc`.
Agent config command helper `skills/bmad-story-automator/src/story_automator/commands/agent_config_cmd.py`	Adds `_load_presets_or_report()` to standardize preset file read/parse error responses (JSON decode, OSError, UnicodeDecodeError).
Orchestrator CLI wiring & agents/retro refactor `skills/bmad-story-automator/src/story_automator/commands/orchestrator.py`, `skills/bmad-story-automator/src/story_automator/commands/orchestrator_epic_agents.py`	Wires `state-update` to new `state_update_action`, standardizes verifier exception payloads, modernizes `agents-build`/`agents-resolve`/`retro_agent` to use core agent_plan/agent_config APIs, and emits policy load/decision events.
Orchestrator parse refactor `skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py`	Uses shared parse_contracts and diagnostics helpers, standardizes structured failures for parse/sub-agent outputs, and emits parse-stage diagnostic events.
Validate-story creation updates `skills/bmad-story-automator/src/story_automator/commands/validate_story_creation.py`	Emits `structuredIssues` for check failures, redacts override reasons, and annotates field identifiers in error diagnostics.
Tmux wrapper + monitor-session updates `skills/bmad-story-automator/src/story_automator/commands/tmux.py`, `skills/bmad-story-automator/src/story_automator/core/tmux_runtime.py`, `skills/bmad-story-automator/data/tmux-commands.md`	Adds `project_hash` to session filtering/naming, documents `kill-all --all-projects`, refactors `cmd_monitor_session` to use `emit_monitor_result()` and precompute session-state issues for JSON output, and simplifies argument parsing.
Monitor verification helpers `skills/bmad-story-automator/src/story_automator/commands/tmux_monitor.py`	Adds `parse_monitor_*` option helpers and `verify_monitor_completion` to resolve and run success verifiers with structured failure metadata.
Comprehensive tests and E2E `tests/*.py`	Adds/extends unit and E2E tests covering diagnostics serialization/redaction, parse contract validation, state validation/transitions, session-state diagnostics, agent-plan validation and CLI flows, retro-agent frontmatter parsing, tmux runtime/session tests, and CLI contract tests.
Docs and operational updates `docs/`, `skills/bmad-story-automator/data/`	Adds observability-validation phase plans, CLI reference updates documenting `structuredIssues`/`issueCount`, troubleshooting for malformed session state, crash-recovery patterns, handoff logs, implementation notes, and step/workflow docs.
Build/smoke test and tests pkg `scripts/smoke-test.sh`, `tests/__init__.py`	Smoke test packs npm tarball into TMP_DIR and sets PACK_TARBALL accordingly; `tests/__init__.py` initialized.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related issues:
- Increase automator observability and validation clarity #5 — This PR implements the observability/validation work described in that issue (diagnostics, parse/state/agent-plan validators, events).
Possibly related PRs:
- bmad-code-org/bmad-automator#2 — Overlaps tmux/runtime/session-state logic and naming changes.
- bmad-code-org/bmad-automator#16 — Overlaps epic-agent orchestration and per-task model threading.

"I'm a rabbit who audits logs at night,
I tuck each issue so the output's right,
With hashes, redaction, and structured light,
Diagnostics hop forth—soft and bright. 🐰"

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch bma-d/e2e-tests

coderabbitai

Actionable comments posted: 9

🧹 Nitpick comments (1)

skills/bmad-story-automator/data/tmux-commands.md (1)

39-70: ⚡ Quick win

Prefer helper-based project-slug/project-hash commands over repeated inline hash scripts.

Line 39-Line 70 duplicates slug/hash derivation logic in multiple snippets. Using helper subcommands in docs reduces drift when naming rules evolve.

Suggested doc patch

-project_slug=$(basename "$PWD" | tr '[:upper:]' '[:lower:]' | tr -cd '[:alnum:]' | cut -c1-8)
-project_hash=$(python3 - <<'PY'
-import hashlib, pathlib
-print(hashlib.md5(str(pathlib.Path.cwd().resolve()).encode(), usedforsecurity=False).hexdigest()[:8])
-PY
-)
+project_slug=$("$scripts" tmux-wrapper project-slug)
+project_hash=$("$scripts" tmux-wrapper project-hash)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/bmad-story-automator/data/tmux-commands.md` around lines 39 - 70,
Replace the repeated inline derivation blocks that set project_slug,
project_hash, and session_name with calls to small helper commands or shell
functions (e.g., a get_project_slug and get_project_hash helper or single
sa-project-id command) and update every snippet that currently contains the
Python/hash and basename pipelines (the blocks that assign project_slug and
project_hash and the session_name usage) to invoke those helpers instead; ensure
the tmux examples (the tmux list-sessions grep and tmux list-sessions -F ... |
xargs kill-session lines) read helpers' output rather than re-duplicating the
scripts so future naming rules change in one place.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@skills/bmad-story-automator/data/crash-recovery.md`:
- Around line 24-28: Add the missing diagnostic key
session_state.unexpected_schema_version to the malformed-state list described
for monitor-session --json so it is treated the same as
session_state.invalid_json, session_state.invalid_type, and
session_state.unreadable (i.e., classify it as a runtime-state diagnostic and
include it in the guidance to verify workflow truth from story files and
sprint-status.yaml before retrying).

In
`@skills/bmad-story-automator/src/story_automator/commands/orchestrator_epic_agents.py`:
- Around line 122-126: The catch-all except maps every build_agents_file failure
to an "invalid_agent_config" pointing at "config-json", which misattributes
failures from state-file, complexity-file, or output; change the except handler
to inspect the exception or context returned by build_agents_file (or re-raise a
custom exception carrying the failing input) and call agent_plan_error with the
correct field (e.g., "state-file", "complexity-file", or "output") and include
the actual failing input path in issues_from_exception so diagnostics preserve
the real failing file; update the logic around build_agents_file,
agent_plan_error, and issues_from_exception to propagate and log the true source
of the error rather than always using "config-json".
- Around line 316-317: The default for "defaultFallback" is currently the string
"false", which can be serialized as a literal value and break resolve_agent();
change the default to a boolean False by using config.get("defaultFallback",
False) and, if needed, normalize incoming values (e.g., convert "false"/"true"
strings to booleans) so resolve_agent() always receives a real boolean or a
proper agent identifier rather than the string "false".

In
`@skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py`:
- Around line 41-46: The current except block lumps
FileNotFoundError/json.JSONDecodeError/ValueError/PolicyError into a
parse-contract error with field "parse.schemaPath"; separate those so parse
failures still call print_json(parse_failure_payload("parse_contract_invalid",
issues_from_exception(..., source="parse-contract", field="parse.schemaPath")))
for parsing-related exceptions (FileNotFoundError, json.JSONDecodeError,
ValueError) but handle PolicyError and state/runtime errors coming from
load_runtime_policy and step_contract with their own payloads (e.g.,
source="runtime-policy" or "step-contract" and an appropriate field like
"runtime.policy" or "state.file"), i.e., add distinct except clauses for
PolicyError and any state-related FileNotFoundError raised by
load_runtime_policy/step_contract and call
print_json(parse_failure_payload(...)) with the correct source/field for those
exceptions instead of reusing "parse.schemaPath".

In `@skills/bmad-story-automator/src/story_automator/core/agent_plan.py`:
- Around line 30-33: The code currently uses "complexity =
story.get('complexity') or {}" which coerces falsy but present values (e.g., "",
0, False) into missing; change to "complexity = story.get('complexity')" and
adjust the subsequent check to only validate when complexity is not None (e.g.,
"if complexity is not None and not isinstance(complexity, dict):
issues.append(_issue(...))"), so explicit invalid falsy values trigger the type
diagnostic via the existing _issue call instead of being treated as absent.

In `@skills/bmad-story-automator/src/story_automator/core/diagnostics.py`:
- Around line 80-87: The DiagnosticIssue created in issues_from_exception stores
raw str(exc) into message which bypasses redaction; change issues_from_exception
so that the same redaction/transformation used for DiagnosticIssue.actual is
applied to the message (e.g., call the shared redactor/serializer or helper used
elsewhere when building DiagnosticIssue.actual) so both actual and message
contain redacted output; update the DiagnosticIssue construction in
issues_from_exception to pass the redacted version for message (and ensure
message falls back to the exception class name if redaction yields empty).

In `@skills/bmad-story-automator/src/story_automator/core/session_state.py`:
- Around line 25-28: The except blocks that currently catch (OSError,
json.JSONDecodeError) around the read_text(target) + json.loads calls must also
catch UnicodeDecodeError so text-decoding failures fall back to {} instead of
bubbling up; update the exception tuples (the try that assigns raw =
json.loads(read_text(target)) and the similar try/except later) to include
UnicodeDecodeError (or add a separate except UnicodeDecodeError) so read_text,
json.loads and the target handling in session_state.py handle decoding errors
the same as unreadable files.

In `@skills/bmad-story-automator/src/story_automator/core/state_validation.py`:
- Around line 115-119: The function has_runtime_command_config treats
whitespace-only strings as valid; update it so that string values are accepted
only if ai_command is a non-empty string after stripping whitespace, and for
list values accept only if the list contains at least one non-empty (after
strip) string element; otherwise fall back to calling
_has_agent_config_block(frontmatter). Modify has_runtime_command_config to
perform isinstance checks for str and list and use .strip() on string elements
to reject whitespace-only entries.

In `@skills/bmad-story-automator/steps-c/step-02b-preflight-finalize.md`:
- Around line 77-82: The two separate date -u calls can yield inconsistent
timestamps; capture a single timestamp into a variable (e.g., TS_NOW) and reuse
it for the orchestrator-helper state-update --set lastUpdated and the echo
append to "{outputFile}" so both entries use the exact same ISO UTC time; update
the commands around the state-update invocation (state-update) and the echo line
that appends to "{outputFile}" to reference that variable instead of calling
date twice.

---

Nitpick comments:
In `@skills/bmad-story-automator/data/tmux-commands.md`:
- Around line 39-70: Replace the repeated inline derivation blocks that set
project_slug, project_hash, and session_name with calls to small helper commands
or shell functions (e.g., a get_project_slug and get_project_hash helper or
single sa-project-id command) and update every snippet that currently contains
the Python/hash and basename pipelines (the blocks that assign project_slug and
project_hash and the session_name usage) to invoke those helpers instead; ensure
the tmux examples (the tmux list-sessions grep and tmux list-sessions -F ... |
xargs kill-session lines) read helpers' output rather than re-duplicating the
scripts so future naming rules change in one place.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: aac36e51-5224-48f0-bef9-b237f2f801ff

📥 Commits

Reviewing files that changed from the base of the PR and between 33601b9 and 0f3e547.

📒 Files selected for processing (46)

docs/agents-and-monitoring.md
docs/cli-reference.md
docs/how-it-works.md
docs/plans/observability-validation/00-baseline-and-plan-reconciliation.md
docs/plans/observability-validation/01-diagnostics-contract.md
docs/plans/observability-validation/02-state-validation-and-transitions.md
docs/plans/observability-validation/03-parser-and-contract-boundaries.md
docs/plans/observability-validation/04-agent-complexity-and-story-boundaries.md
docs/plans/observability-validation/05-session-runtime-diagnostics.md
docs/plans/observability-validation/06-e2e-docs-and-release-readiness.md
docs/plans/observability-validation/README.md
docs/plans/observability-validation/TODO.md
docs/plans/observability-validation/handoff-log-archive-phase-00-04.md
docs/plans/observability-validation/handoff-log.md
docs/plans/observability-validation/implementation-notes.md
docs/state-and-resume.md
docs/troubleshooting.md
scripts/smoke-test.sh
skills/bmad-story-automator/data/crash-recovery.md
skills/bmad-story-automator/data/monitoring-pattern.md
skills/bmad-story-automator/data/tmux-commands.md
skills/bmad-story-automator/src/story_automator/cli.py
skills/bmad-story-automator/src/story_automator/commands/agent_config_cmd.py
skills/bmad-story-automator/src/story_automator/commands/orchestrator.py
skills/bmad-story-automator/src/story_automator/commands/orchestrator_epic_agents.py
skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py
skills/bmad-story-automator/src/story_automator/commands/state.py
skills/bmad-story-automator/src/story_automator/commands/tmux.py
skills/bmad-story-automator/src/story_automator/core/agent_plan.py
skills/bmad-story-automator/src/story_automator/core/diagnostics.py
skills/bmad-story-automator/src/story_automator/core/parse_contracts.py
skills/bmad-story-automator/src/story_automator/core/review_verify.py
skills/bmad-story-automator/src/story_automator/core/session_state.py
skills/bmad-story-automator/src/story_automator/core/state_validation.py
skills/bmad-story-automator/src/story_automator/core/tmux_runtime.py
skills/bmad-story-automator/steps-c/step-02b-preflight-finalize.md
skills/bmad-story-automator/steps-v/step-v-01-check.md
tests/__init__.py
tests/test_agent_plan.py
tests/test_cli_contracts.py
tests/test_diagnostics.py
tests/test_diagnostics_e2e.py
tests/test_orchestrator_parse.py
tests/test_state_validation.py
tests/test_success_verifiers.py
tests/test_tmux_runtime.py

bma-d · 2026-05-22T10:00:20Z

@coderabbitai review please

coderabbitai · 2026-05-22T10:00:35Z

@bma-d Sure! I'll start a full review of PR #21 right away.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

🧹 Nitpick comments (2)

skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py (1)
38-59: 💤 Low value

Message-based error routing may be fragile.

The routing at lines 40-41 depends on specific substrings in exception messages ("parse schema" or "policy data file missing"). If the exception message wording changes upstream, the error will be misclassified as runtime_policy_invalid instead of parse_contract_invalid.

Consider using distinct exception types or error codes rather than message inspection if the upstream module can be modified. However, if this is a deliberate pragmatic choice and the upstream messages are stable, this is acceptable.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py`
around lines 38 - 59, The current PolicyError handling in orchestrator_parse.py
relies on fragile substring checks of str(exc) to decide between
parse_contract_invalid and runtime_policy_invalid; replace this brittle logic by
using a concrete discriminator on the exception (e.g. an explicit exception
subclass or a stable error code/attribute) instead: either have the upstream
raise/attach a ParseSchemaError or an error.code like
"parse_schema"/"policy_data_missing", or add a local classify_policy_error(exc)
helper that inspects exc.code or isinstance(exc, ParseSchemaError) and returns
the correct category; then route to print_json(parse_failure_payload(...)) using
that discriminator (keeping the same issues_from_exception(...) usage) so
decision logic no longer depends on message text.
skills/bmad-story-automator/src/story_automator/core/agent_plan.py (1)
229-230: 💤 Low value

Unused call to normalize_fallback_value.

The call on line 230 has no side effect and its return value is discarded. normalize_fallback_value performs string normalization but doesn't validate or raise exceptions, so this call serves no purpose.
Suggested fix
     if not (fallback is False or isinstance(fallback, str)):
         issues.append(_issue("invalid_type", f"{task_field}.fallback", "false or string", fallback, f"{task} fallback must be false or a string"))
-    elif isinstance(fallback, str):
-        normalize_fallback_value(fallback)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/bmad-story-automator/src/story_automator/core/agent_plan.py` around
lines 229 - 230, The call to normalize_fallback_value(fallback) is unused
because its return value is discarded; either remove the call or use its result.
Update the branch handling (the elif that checks isinstance(fallback, str)) to
assign the normalized string back (e.g., fallback =
normalize_fallback_value(fallback)) or otherwise pass the normalized value into
the downstream code that consumes fallback so the normalization has an effect;
ensure you reference normalize_fallback_value and the local variable fallback in
your change.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In
`@skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py`:
- Around line 38-59: The current PolicyError handling in orchestrator_parse.py
relies on fragile substring checks of str(exc) to decide between
parse_contract_invalid and runtime_policy_invalid; replace this brittle logic by
using a concrete discriminator on the exception (e.g. an explicit exception
subclass or a stable error code/attribute) instead: either have the upstream
raise/attach a ParseSchemaError or an error.code like
"parse_schema"/"policy_data_missing", or add a local classify_policy_error(exc)
helper that inspects exc.code or isinstance(exc, ParseSchemaError) and returns
the correct category; then route to print_json(parse_failure_payload(...)) using
that discriminator (keeping the same issues_from_exception(...) usage) so
decision logic no longer depends on message text.

In `@skills/bmad-story-automator/src/story_automator/core/agent_plan.py`:
- Around line 229-230: The call to normalize_fallback_value(fallback) is unused
because its return value is discarded; either remove the call or use its result.
Update the branch handling (the elif that checks isinstance(fallback, str)) to
assign the normalized string back (e.g., fallback =
normalize_fallback_value(fallback)) or otherwise pass the normalized value into
the downstream code that consumes fallback so the normalization has an effect;
ensure you reference normalize_fallback_value and the local variable fallback in
your change.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: da7decd1-ae17-479a-8468-dc05dcfb3ad8

📥 Commits

Reviewing files that changed from the base of the PR and between 0f3e547 and 385c22a.

📒 Files selected for processing (20)

skills/bmad-story-automator/data/crash-recovery.md
skills/bmad-story-automator/data/tmux-commands.md
skills/bmad-story-automator/src/story_automator/commands/orchestrator_epic_agents.py
skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py
skills/bmad-story-automator/src/story_automator/commands/state.py
skills/bmad-story-automator/src/story_automator/commands/tmux.py
skills/bmad-story-automator/src/story_automator/core/agent_config.py
skills/bmad-story-automator/src/story_automator/core/agent_config_frontmatter.py
skills/bmad-story-automator/src/story_automator/core/agent_plan.py
skills/bmad-story-automator/src/story_automator/core/diagnostics.py
skills/bmad-story-automator/src/story_automator/core/monitoring.py
skills/bmad-story-automator/src/story_automator/core/session_state.py
skills/bmad-story-automator/src/story_automator/core/state_validation.py
skills/bmad-story-automator/steps-c/step-02b-preflight-finalize.md
tests/test_agent_plan.py
tests/test_diagnostics.py
tests/test_orchestrator_parse.py
tests/test_retro_agent.py
tests/test_state_validation.py
tests/test_tmux_runtime.py

✅ Files skipped from review due to trivial changes (1)

skills/bmad-story-automator/data/crash-recovery.md

bma-d · 2026-05-22T10:54:46Z

@augmentcode review please

augmentcode · 2026-05-22T11:00:01Z

🤖 Augment PR Summary

Summary: This PR adds structured, field-addressable diagnostics across the story automator's parser, state, agent-plan, and session boundaries while preserving backward-compatible CLI output. Tracked under issue #5.

Changes:

Added a shared diagnostics contract (DiagnosticIssue/DiagnosticEvent) with serialization, redaction, and opt-in JSONL event emission via STORY_AUTOMATOR_DIAGNOSTICS_FILE
Added structured state validation with field-specific diagnostics and allowed status transition guards for state-update
Added parse contract schema leaf validation before sub-agent execution, with field-path diagnostics for malformed parse payloads
Added complexity and agents-plan payload validators; agents-build/agents-resolve now fail early with structuredIssues
Added diagnostic-aware session-state loading; monitor-session --json surfaces structuredIssues for malformed state
Extracted orchestrator_state.py, tmux_monitor.py, session_state.py, agent_config_frontmatter.py, agent_plan.py, parse_contracts.py, orchestration_events.py, and monitoring.py as focused modules
Restricted frontmatter replacements to the frontmatter section only; restored kill-all default to all sessions; fixed --cycle flag parsing
Added ~1600 lines of new tests across 7 test modules covering diagnostics, state validation, agent plans, CLI contracts, and E2E scenarios

Technical Notes: All changes are additive—legacy fields (issues, reason, error) are preserved alongside new structuredIssues. Successful parse and CSV outputs remain unchanged. 263→310+ tests pass locally with npm run verify.

_{🤖 Was this summary useful? React with 👍 or 👎}

augmentcode

Review completed. 3 suggestions posted.

Comment augment review to trigger a new review at any time.

bma-d · 2026-05-22T13:59:55Z

@augmentcode review please
@coderabbitai review please

coderabbitai · 2026-05-22T14:00:10Z

@bma-d Sure! I'll kick off a full review of PR #21 right away.

✅ Actions performed

Full review triggered.

augmentcode · 2026-05-22T14:00:13Z

This pull request is abnormally large and would use a significant amount of tokens to review. If you still wish to review it, comment "augment review" and we will review it.

bma-d · 2026-05-23T00:16:54Z

augment review

augmentcode

Review completed. 5 suggestions posted.

Comment augment review to trigger a new review at any time.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

skills/bmad-story-automator/src/story_automator/core/diagnostics.py (1)

65-73: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Redact event-level message before serialization.

Line 69 writes event.message unredacted. If any caller passes dynamic error text, this can leak sensitive/path data to diagnostics logs.

🔧 Proposed fix

 def serialize_event(event: DiagnosticEvent) -> dict[str, Any]:
     return {
         "name": event.name,
         "source": event.source,
-        "message": event.message,
+        "message": redact_actual(event.message),
         "severity": event.severity,
         "issues": serialize_issues(event.issues),
         "context": redact_actual(event.context),
     }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/bmad-story-automator/src/story_automator/core/diagnostics.py` around
lines 65 - 73, serialize_event currently includes event.message unredacted which
can leak sensitive data; update serialize_event (in the serialize_event
function) to pass the message through the existing redact_actual function (e.g.,
use redact_actual(event.message)) before putting it in the returned dict,
ensuring the "message" field is redacted like "context" and keeping other fields
(name, source, severity, issues) unchanged.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_diagnostics.py`:
- Around line 4-5: Add an explicit import for the unittest.mock module so the
test file is self-contained; specifically, at the top of
tests/test_diagnostics.py add "import unittest.mock" (or otherwise ensure
unittest.mock is imported) because later code calls
unittest.mock.patch.dict(...) — update the imports block that currently contains
"import tempfile" and "import unittest" to include the explicit unittest.mock
import.

---

Outside diff comments:
In `@skills/bmad-story-automator/src/story_automator/core/diagnostics.py`:
- Around line 65-73: serialize_event currently includes event.message unredacted
which can leak sensitive data; update serialize_event (in the serialize_event
function) to pass the message through the existing redact_actual function (e.g.,
use redact_actual(event.message)) before putting it in the returned dict,
ensuring the "message" field is redacted like "context" and keeping other fields
(name, source, severity, issues) unchanged.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 65372596-1476-4c8f-9f61-ae663e010077

📥 Commits

Reviewing files that changed from the base of the PR and between 385c22a and bf2740a.

📒 Files selected for processing (31)

docs/cli-reference.md
docs/plans/observability-validation/07-review-remediation.md
docs/plans/observability-validation/08-diagnostic-redaction-completion.md
docs/plans/observability-validation/README.md
docs/plans/observability-validation/TODO.md
docs/plans/observability-validation/TODO/phase-08.md
docs/plans/observability-validation/gate-map.md
docs/plans/observability-validation/handoff-log.md
docs/plans/observability-validation/implementation-notes.md
skills/bmad-story-automator/src/story_automator/cli.py
skills/bmad-story-automator/src/story_automator/commands/orchestrator.py
skills/bmad-story-automator/src/story_automator/commands/orchestrator_epic_agents.py
skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py
skills/bmad-story-automator/src/story_automator/commands/state.py
skills/bmad-story-automator/src/story_automator/commands/tmux.py
skills/bmad-story-automator/src/story_automator/commands/validate_story_creation.py
skills/bmad-story-automator/src/story_automator/core/agent_plan.py
skills/bmad-story-automator/src/story_automator/core/diagnostics.py
skills/bmad-story-automator/src/story_automator/core/monitoring.py
skills/bmad-story-automator/src/story_automator/core/orchestration_events.py
skills/bmad-story-automator/src/story_automator/core/parse_contracts.py
skills/bmad-story-automator/src/story_automator/core/runtime_policy.py
skills/bmad-story-automator/src/story_automator/core/state_validation.py
tests/test_agent_plan.py
tests/test_cli_contracts.py
tests/test_diagnostics.py
tests/test_diagnostics_e2e.py
tests/test_orchestrator_parse.py
tests/test_state_policy_metadata.py
tests/test_state_validation.py
tests/test_success_verifiers.py

✅ Files skipped from review due to trivial changes (6)

docs/plans/observability-validation/07-review-remediation.md
docs/plans/observability-validation/08-diagnostic-redaction-completion.md
docs/plans/observability-validation/gate-map.md
docs/plans/observability-validation/README.md
docs/plans/observability-validation/TODO.md
docs/plans/observability-validation/implementation-notes.md

bma-d · 2026-05-25T09:23:51Z

augment review

bma-d · 2026-05-25T09:24:03Z

@coderabbitai review please

coderabbitai · 2026-05-25T09:24:14Z

@bma-d Sure! I'll kick off a full review of PR #21 right away.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@skills/bmad-story-automator/src/story_automator/commands/agent_config_cmd.py`:
- Around line 96-104: The wrapper _load_presets_or_report currently catches
json.JSONDecodeError and OSError but misses UnicodeDecodeError (a separate
exception) so UTF-8 decode failures aren’t returned as the structured
presets_file_error; update _load_presets_or_report to catch UnicodeDecodeError
(e.g., add an except UnicodeDecodeError as exc:) and call print_json({"ok":
False, "error": "presets_file_error", "reason": str(exc)}) and return None (or
alternatively combine it with the OSError handling) so decode errors from
load_presets_file are reported consistently.

In `@skills/bmad-story-automator/src/story_automator/commands/orchestrator.py`:
- Around line 305-308: The current re.subn call (using
re.subn(rf"(?m)^{re.escape(key)}:.*$", ... ) on variable text) updates the
entire markdown, causing body lines to be overwritten; restrict replacements to
the YAML frontmatter only by extracting the frontmatter block (delimited by
leading and trailing '---' or '---\\n'), run the re.subn and updated.append(key)
against that frontmatter string (using the existing key/value logic and
re.subn), then reassemble text = frontmatter + rest_of_document; update
references to text, key, value, updated and the re.subn call in orchestrator.py
so only the frontmatter is modified.

In
`@skills/bmad-story-automator/src/story_automator/core/agent_config_frontmatter.py`:
- Around line 10-12: The code currently uses stripped = raw_line.strip() then
stripped.startswith("agentConfig:"), which treats indented keys like " 
agentConfig:" as top-level; change the detection to be left-whitespace sensitive
by checking raw_line.startswith("agentConfig:") (i.e., do not remove leading
whitespace) before calling _extract_agent_config_block(frontmatter.splitlines(),
index), so only truly top-level unindented "agentConfig:" lines trigger
extraction; update the raw_line handling accordingly while keeping the call to
_extract_agent_config_block with frontmatter and index.
- Around line 38-45: The loop over enumerate(lines) in
agent_config_frontmatter.py allows lines like "  \tkey: value" because it only
checks for tabs at line start earlier; update the loop that calls
_strip_inline_yaml_comment and computes indent to also reject any tabs in the
indentation portion (or anywhere in the line after stripping inline comments).
Concretely, after computing indent = len(line) - len(line.lstrip(" ")), assert
that line[:indent] contains only spaces (no '\t') and/or if '\t' in line: raise
the same ValueError (or a clearer one) so lines with tabs after spaces are
rejected consistently by this function.

In `@skills/bmad-story-automator/src/story_automator/core/diagnostics.py`:
- Around line 21-25: The current regexes ABSOLUTE_PATH_WITH_EXT_RE and
ABSOLUTE_PATH_RE only match Unix-style paths and must be extended to also detect
Windows absolute paths (e.g., "C:\Users\..." or "C:/Users/...") to avoid leaking
local Windows paths; update the definitions for ABSOLUTE_PATH_WITH_EXT_RE and
ABSOLUTE_PATH_RE to include alternative branches that match a drive letter plus
colon and backslash/forward slash patterns and ensure backslashes are correctly
escaped in the raw string, then apply the same updated regexes where those
symbols are redefined/used later (the other occurrence referenced in the
comment) so both the with-extension and generic absolute-path redaction cover
Windows paths.

In `@skills/bmad-story-automator/src/story_automator/core/state_validation.py`:
- Around line 145-148: The parsed "--set" handler currently returns the raw key
which allows whitespace-padded keys to bypass validation; update the block
around "key, value = raw.split('=', 1)" to normalize the key by calling key =
key.strip() (and optionally strip value as well) before returning so the
function (and downstream validation in state transition checks) receives a
trimmed key; keep the existing error branch that calls
state_update_argument_error_payload(raw).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d9bbb554-a626-414a-8257-d556cc99fa2e

📥 Commits

Reviewing files that changed from the base of the PR and between bf2740a and cb3f5dd.

📒 Files selected for processing (58)

docs/agents-and-monitoring.md
docs/cli-reference.md
docs/how-it-works.md
docs/plans/observability-validation/00-baseline-and-plan-reconciliation.md
docs/plans/observability-validation/01-diagnostics-contract.md
docs/plans/observability-validation/02-state-validation-and-transitions.md
docs/plans/observability-validation/03-parser-and-contract-boundaries.md
docs/plans/observability-validation/04-agent-complexity-and-story-boundaries.md
docs/plans/observability-validation/05-session-runtime-diagnostics.md
docs/plans/observability-validation/06-e2e-docs-and-release-readiness.md
docs/plans/observability-validation/07-review-remediation.md
docs/plans/observability-validation/08-diagnostic-redaction-completion.md
docs/plans/observability-validation/README.md
docs/plans/observability-validation/TODO.md
docs/plans/observability-validation/TODO/phase-08.md
docs/plans/observability-validation/gate-map.md
docs/plans/observability-validation/handoff-log-archive-phase-00-04.md
docs/plans/observability-validation/handoff-log.md
docs/plans/observability-validation/implementation-notes.md
docs/state-and-resume.md
docs/troubleshooting.md
scripts/smoke-test.sh
skills/bmad-story-automator/data/crash-recovery.md
skills/bmad-story-automator/data/monitoring-pattern.md
skills/bmad-story-automator/data/tmux-commands.md
skills/bmad-story-automator/src/story_automator/cli.py
skills/bmad-story-automator/src/story_automator/commands/agent_config_cmd.py
skills/bmad-story-automator/src/story_automator/commands/orchestrator.py
skills/bmad-story-automator/src/story_automator/commands/orchestrator_epic_agents.py
skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py
skills/bmad-story-automator/src/story_automator/commands/state.py
skills/bmad-story-automator/src/story_automator/commands/tmux.py
skills/bmad-story-automator/src/story_automator/commands/validate_story_creation.py
skills/bmad-story-automator/src/story_automator/core/agent_config.py
skills/bmad-story-automator/src/story_automator/core/agent_config_frontmatter.py
skills/bmad-story-automator/src/story_automator/core/agent_plan.py
skills/bmad-story-automator/src/story_automator/core/diagnostics.py
skills/bmad-story-automator/src/story_automator/core/monitoring.py
skills/bmad-story-automator/src/story_automator/core/orchestration_events.py
skills/bmad-story-automator/src/story_automator/core/parse_contracts.py
skills/bmad-story-automator/src/story_automator/core/review_verify.py
skills/bmad-story-automator/src/story_automator/core/runtime_policy.py
skills/bmad-story-automator/src/story_automator/core/session_state.py
skills/bmad-story-automator/src/story_automator/core/state_validation.py
skills/bmad-story-automator/src/story_automator/core/tmux_runtime.py
skills/bmad-story-automator/steps-c/step-02b-preflight-finalize.md
skills/bmad-story-automator/steps-v/step-v-01-check.md
tests/__init__.py
tests/test_agent_plan.py
tests/test_cli_contracts.py
tests/test_diagnostics.py
tests/test_diagnostics_e2e.py
tests/test_orchestrator_parse.py
tests/test_retro_agent.py
tests/test_state_policy_metadata.py
tests/test_state_validation.py
tests/test_success_verifiers.py
tests/test_tmux_runtime.py

✅ Files skipped from review due to trivial changes (18)

docs/plans/observability-validation/gate-map.md
docs/state-and-resume.md
skills/bmad-story-automator/data/monitoring-pattern.md
docs/plans/observability-validation/03-parser-and-contract-boundaries.md
docs/agents-and-monitoring.md
docs/plans/observability-validation/08-diagnostic-redaction-completion.md
docs/plans/observability-validation/00-baseline-and-plan-reconciliation.md
docs/plans/observability-validation/07-review-remediation.md
docs/cli-reference.md
docs/plans/observability-validation/README.md
docs/troubleshooting.md
tests/init.py
docs/how-it-works.md
docs/plans/observability-validation/06-e2e-docs-and-release-readiness.md
skills/bmad-story-automator/data/crash-recovery.md
docs/plans/observability-validation/04-agent-complexity-and-story-boundaries.md
docs/plans/observability-validation/02-state-validation-and-transitions.md
docs/plans/observability-validation/TODO.md

bma-d · 2026-05-25T10:59:38Z

@coderabbitai review

coderabbitai · 2026-05-25T10:59:43Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

bma-d · 2026-05-25T10:59:49Z

@augmentcode review

augmentcode

Review completed. 5 suggestions posted.

Comment augment review to trigger a new review at any time.

bma-d · 2026-05-26T11:01:35Z

@augmentcode review
@coderabbitai review

coderabbitai · 2026-05-26T11:01:48Z

@bma-d Sure, I'll kick off a review of PR #21 right away!

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

augmentcode

Review completed. 5 suggestions posted.

Comment augment review to trigger a new review at any time.

bma-d · 2026-05-26T12:57:38Z

@augmentcode review
@coderabbitai review

coderabbitai · 2026-05-26T12:58:18Z

@bma-d Sure! I'll kick off a review of PR #21 right away.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

augmentcode

Review completed. 6 suggestions posted.

Comment augment review to trigger a new review at any time.

augmentcode · 2026-05-26T13:18:21Z

+        print_json(updates)
+        return 1
+
+    pending_status = str(fields.get("status") or "")


parse_simple_frontmatter reads from the full document (not just the frontmatter section), so fields.get("status") could pick up a status: line from the body if the body happens to start with one. Consider parsing from the split frontmatter instead of the full text to stay consistent with the replacement scope restricted by _split_frontmatter.

Severity: medium

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-05-26T13:18:21Z

+    complexity, issue = _story_complexity(story, field)
+    if issue:
+        raise AgentPlanInputError("complexity-file", ValueError(legacy_issue_message(issue)))
+    return str(complexity.get("level") or "medium").strip().lower() or "medium"


complexity.get("level") or "medium" coerces explicit falsy level values like "" or 0 to "medium" silently, bypassing the COMPLEXITY_LEVELS check. The preceding validate_complexity_payload does the same coercion on line 43 (or "medium"), so an empty-string level passes validation and defaults to medium without any diagnostic. Consider rejecting empty-string levels explicitly in the validator.

Severity: medium

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-05-26T13:18:21Z

+def verifier_exception_payload(reason: str, exc: Exception, *, source: str, field: str = "", **extra: object) -> dict[str, object]:
+    issues = issues_from_exception(exc, source=source, field=field)
+    redacted_extra = redact_actual(extra)
+    return {"verified": False, "reason": reason, "error": redact_actual(str(exc)), **redacted_extra, "structuredIssues": serialize_issues(issues)}


The **redacted_extra spread can silently overwrite the preceding "reason" or "error" keys if the caller passes reason= or error= as extra kwargs, since dict unpacking resolves left-to-right. Consider placing **redacted_extra before the explicit keys, or validating that extra doesn't contain reserved field names.

Severity: medium

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-05-26T13:18:22Z

    start = time.time()
    last_done = 0
    last_total = 0
+    session_state_issue = monitor_session_state_issue(session, project_root) if json_output else None


monitor_session_state_issue(session, project_root) reads and parses the session state file eagerly before the poll loop begins. If the file doesn't exist yet (runner hasn't created it), this always returns None, and on the not_found path at line 443 it re-checks. However, if the state file was valid at poll start and becomes corrupted mid-loop, the pre-cached None is used and the re-check at line 443 catches it. This is fine, but the pre-poll I/O is wasted for sessions that complete/crash/stuck without reaching not_found.

Severity: low

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-05-26T13:18:22Z

+    raw_value = _strip_inline_yaml_comment(raw_value)
+    if raw_value:
+        parsed = _parse_scalar(raw_value)
+        return parsed if isinstance(parsed, dict) else {"agentConfig": parsed}


When agentConfig: has a non-dict inline value (e.g. agentConfig: bad), this returns {"agentConfig": "bad"} which wraps the value under an agentConfig key. Downstream parse_agent_config_json will then see a nested agentConfig key and raise "unexpected nested agentConfig key", which is a confusing error message for what is actually a scalar value on the header line.

Severity: low

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

augmentcode · 2026-05-26T13:18:22Z

+    key, value = raw.split("=", 1)
+    if not key.strip():
+        return state_update_argument_error_payload(raw)
+    return key.strip(), value.strip()


The value is .strip()-ped here, but this means --set status= IN_PROGRESS writes status: IN_PROGRESS to frontmatter (stripped). While the key stripping is clearly necessary, stripping the value changes the semantics — a value like --set customInstructions= keep leading spaces would lose its intentional leading/trailing whitespace. Consider stripping only for status transitions and preserving the raw value for non-status updates.

Severity: low

_{🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.}

coderabbitai Bot reviewed May 22, 2026

View reviewed changes

bma-d marked this pull request as ready for review May 22, 2026 10:54

augmentcode Bot reviewed May 22, 2026

View reviewed changes

Comment thread skills/bmad-story-automator/src/story_automator/core/parse_contracts.py Outdated

Comment thread skills/bmad-story-automator/src/story_automator/commands/state.py Outdated

Comment thread skills/bmad-story-automator/src/story_automator/cli.py

augmentcode Bot reviewed May 23, 2026

View reviewed changes

coderabbitai Bot reviewed May 23, 2026

View reviewed changes

Comment thread tests/test_diagnostics.py

bma-d added 15 commits May 25, 2026 06:02

docs: record observability baseline

5e4e6ba

feat: add diagnostics contract

86401b2

feat: add state validation diagnostics

d0da5cc

feat: add parser contract diagnostics

e1b6175

feat: validate agent plan payloads

9c4ba1d

feat: add session state diagnostics

ce8d963

test: add observability diagnostics e2e coverage

588bd07

fix: resolve observability review findings

231cec0

fix: address observability review findings

63ab560

fix: normalize agent config rendering

eeb6da5

fix: validate complexity override config

db7b72d

fix: validate nested complexity overrides

d293fe9

fix: validate frontmatter complexity overrides

8f357b3

fix: handle complexity frontmatter edge cases

4ead146

fix: reject empty complexity override fields

2863ac3

bma-d added 5 commits May 25, 2026 06:11

fix: address bot diagnostics review items

781c71f

fix: complete diagnostics redaction follow-ups

3b1b586

fix: close observability review gaps

eec32a8

fix: address review validation gaps

5e26efd

fix: harden diagnostic command error paths

cb3f5dd

bma-d force-pushed the bma-d/e2e-tests branch from fddf414 to cb3f5dd Compare May 25, 2026 09:21

coderabbitai Bot reviewed May 25, 2026

View reviewed changes

bma-d added 3 commits May 25, 2026 07:26

fix: address coderabbit diagnostics findings

083cc11

refactor: split orchestrator state update

6f5cf03

fix: harden agent complexity build path

87decbd

augmentcode Bot reviewed May 25, 2026

View reviewed changes

bma-d added 3 commits May 25, 2026 08:39

fix: address augment validation findings

6d63159

fix: preserve tmux session compatibility

e344097

fix: preserve legacy project session listing

b1f0206

bma-d requested a review from alexeyv May 26, 2026 08:36

fix: address pr review redaction and tmux filtering

79fc518

augmentcode Bot reviewed May 26, 2026

View reviewed changes

bma-d added 2 commits May 26, 2026 09:31

fix: address augment diagnostics findings

cbe5858

fix: address review loop edge cases

c1a3cd8

augmentcode Bot reviewed May 26, 2026

View reviewed changes

Conversation

bma-d commented May 22, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What Changed

Diagnostics contract

State validation and transitions

Parser and verifier contracts

Agent plan and complexity validation

Session/runtime diagnostics

CLI/tmux contract fixes

Docs and plan hygiene

Compatibility Notes

Review Findings Addressed

Verification

Known Gaps / Risks

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Reviews paused

Walkthrough

Changes

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

bma-d commented May 22, 2026

Uh oh!

coderabbitai Bot commented May 22, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

bma-d commented May 22, 2026

Uh oh!

augmentcode Bot commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

augmentcode Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

bma-d commented May 22, 2026

Uh oh!

coderabbitai Bot commented May 22, 2026

Uh oh!

augmentcode Bot commented May 22, 2026

Uh oh!

bma-d commented May 23, 2026

Uh oh!

augmentcode Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

bma-d commented May 25, 2026

Uh oh!

bma-d commented May 25, 2026

Uh oh!

coderabbitai Bot commented May 25, 2026

bma-d commented May 22, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 22, 2026 •

edited

Loading

augmentcode Bot commented May 22, 2026 •

edited

Loading

augmentcode Bot May 26, 2026 •

edited

Loading

augmentcode Bot May 26, 2026 •

edited

Loading

augmentcode Bot May 26, 2026 •

edited

Loading