Skip to content

Add structured diagnostics and validation clarity#21

Open
bma-d wants to merge 44 commits into
mainfrom
bma-d/e2e-tests
Open

Add structured diagnostics and validation clarity#21
bma-d wants to merge 44 commits into
mainfrom
bma-d/e2e-tests

Conversation

@bma-d
Copy link
Copy Markdown
Collaborator

@bma-d bma-d commented May 22, 2026

Summary

Draft PR for the observability and validation work tracked under issue #5.

This branch makes automator failures easier to diagnose across parser, state, agent-plan, and tmux/session boundaries while preserving successful CLI behavior and legacy output fields. The implementation follows the observability-validation plan: small typed/domain seams, additive diagnostics, focused tests, and no broad OO rewrite.

What Changed

Diagnostics contract

  • Added a shared diagnostics contract with stable serialized issue fields:
    • type
    • field
    • expected
    • actual
    • message
    • recovery
    • code
    • severity
    • source
  • Added diagnostic serialization, legacy-message helpers, exception wrapping, and redaction behavior.
  • Kept command output compatibility by adding structuredIssues only beside legacy fields where applicable.

State validation and transitions

  • Added structured state validation for frontmatter/runtime-policy problems.
  • Preserved existing validate-state fields: ok, structure, and legacy issues strings.
  • Added structuredIssues and issueCount to validate-state.
  • Added allowed status transition checks for state-update.
  • Preserved recovery behavior: invalid legacy statuses can now be repaired to a valid status, while invalid attempted statuses are rejected.

Parser and verifier contracts

  • Moved parse schema validation into a focused contract helper.
  • Added field-path diagnostics for missing keys, wrong nested types, enum mismatches, empty strings, and invalid path/null fields.
  • Preserved successful parse payloads exactly.
  • Preserved legacy parse failure reason values and added structuredIssues on failures.
  • Added structured diagnostics to verifier contract failures where command outputs already carry errors.

Agent plan and complexity validation

  • Added complexity payload and agents-plan payload validation.
  • agents-build now fails early with structured diagnostics for malformed complexity/config payloads.
  • agents-resolve validates only the requested story/task path for direct resolution, preserving compatibility with partial user-authored agents files.
  • Retro agent resolution now routes through shared core agent-config helpers to avoid behavior drift.
  • Existing fallback normalization and retro overrides are preserved.

Session/runtime diagnostics

  • Added diagnostic-aware session-state loading in a focused core/session_state.py module.
  • Preserved legacy load_session_state() behavior returning {} for missing/unreadable/invalid/non-object state.
  • monitor-session --json now reports malformed persisted session-state diagnostics when relevant.
  • Fixed the Claude/auto legacy path so malformed state diagnostics are captured before legacy cleanup can delete the corrupt file.
  • CSV status commands remain unchanged.

CLI/tmux contract fixes

  • Added CLI contract coverage for parse-story, parse-story-range, agent config presets, tmux naming, project scoping, and wrapper behavior.
  • Fixed parse-story-range invalid totals to return stable JSON rather than raising before the error handler.
  • Fixed parse-story error reporting for missing rules versus invalid rules JSON.
  • Hardened agent-config preset file errors.
  • Added hash-scoped tmux session names to avoid same-basename worktree collisions.
  • Fixed tmux-wrapper name --cycle N to use the cycle value, not the flag token.
  • Preserved the legacy positional cycle argument form: tmux-wrapper name review 5 5.3 2.
  • Made kill-all project-scoped by default, with --all-projects as explicit opt-in.

Docs and plan hygiene

  • Updated operator docs for structured diagnostics, monitor-session JSON behavior, tmux session naming, and recovery hints.
  • Fixed crash-recovery hash snippets to use runtime helper commands for project-slug and project-hash.
  • Split older observability handoff entries into an archive file so the active handoff log stays under the repo’s soft file-size guidance while retaining clean-context continuity.

Compatibility Notes

  • validate-state keeps legacy issues: list[str] and adds structuredIssues plus issueCount.
  • Parse success output remains unchanged.
  • Parse failure output keeps legacy status: "error" and reason fields.
  • monitor-session --json preserves existing fields and adds structuredIssues only when relevant.
  • CSV status commands preserve exact output shape.
  • Direct agents-resolve remains compatible with partial plans containing the requested story/task.
  • Legacy tmux positional cycle naming remains supported.

Review Findings Addressed

This branch includes fixes from a staged subagent review loop:

  • Fixed lost malformed session-state diagnostics on the Claude/auto monitor path.
  • Fixed invalid legacy statuses being impossible to repair via state-update.
  • Fixed agents-resolve rejecting otherwise resolvable partial direct agents files.
  • Fixed non-portable crash-recovery project-hash docs.
  • Reduced duplicated retro-agent config resolution by routing through core helpers.
  • Extracted session-state diagnostics from the oversized tmux runtime module.
  • Split the oversized observability handoff log.
  • Fixed dropped legacy positional cycle support for tmux-wrapper name.
  • Removed duplicate monitor-session test coverage.
  • Ensured required new files are tracked.

Latest review baseline: P0/P1 clean.

Verification

Commands run locally:

git diff --check
python3 -m compileall -q skills/bmad-story-automator/src/story_automator
PYTHONPATH=skills/bmad-story-automator/src python3 -m unittest tests.test_state_validation tests.test_agent_plan tests.test_diagnostics_e2e tests.test_tmux_runtime tests.test_retro_agent tests.test_state_policy_metadata tests.test_cli_contracts
PYTHONPATH=skills/bmad-story-automator/src python3 -m unittest discover -s tests
npm run verify

Results:

  • Focused tests passed.
  • Full Python suite passed: 263 tests OK.
  • npm run verify passed.
  • Smoke test passed with known optional bmad-qa-generate-e2e-tests warnings only.

Known Gaps / Risks

  • No live external LLM/tmux integration E2E was added; coverage is local command/fixture based.
  • Some pre-existing modules remain above the soft file-size target, especially core/tmux_runtime.py and core/runtime_policy.py.
  • Smoke test still warns when optional QA generation skills are not installed; this is existing non-blocking behavior.

Refs #5

Summary by CodeRabbit

  • New Features

    • Add structured, field-specific diagnostics alongside legacy error fields; opt-in JSONL diagnostic events
    • Validate and block invalid orchestration status transitions before write
    • Monitor-session JSON mode surfaces persisted session-state diagnostics; session naming gains project scoping
    • Agent-plan/complexity validators with precise field-path errors
  • Bug Fixes

    • Better detection/reporting of malformed persisted session state, parse failures, and agent-plan issues
    • Redaction of sensitive values in legacy error outputs
  • Documentation

    • Expanded observability/validation plans, CLI reference, troubleshooting, and runbooks
  • Tests

    • New and expanded unit and E2E tests for diagnostics, state validation, agent plans, CLI contracts, and monitoring

Review Change Stack

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 22, 2026

Caution

Review failed

Failed to post review comments

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds structured diagnostics primitives, parse-contract/state/agent-plan validation with field-path diagnostics, session-state diagnostic loading, CLI/tmux wiring that preserves legacy fields while adding structuredIssues, monitoring emit helper, agent-config/frontmatter parsing, and extensive unit and E2E tests and documentation.

Changes

Observability and Validation Infrastructure

Layer / File(s) Summary
Diagnostics core
skills/bmad-story-automator/src/story_automator/core/diagnostics.py, tests/test_diagnostics.py
Defines DiagnosticIssue/DiagnosticEvent, serialization, redaction, exception-to-issue helpers, and newline-delimited JSON event emission via STORY_AUTOMATOR_DIAGNOSTICS_FILE.
Parse contract validation
skills/bmad-story-automator/src/story_automator/core/parse_contracts.py, skills/bmad-story-automator/src/story_automator/core/review_verify.py
Adds parse-contract loader/validator, validate_payload, ParseContractError, parse failure payload helpers, and wires standardized verifier exception payloads into verification flows.
State validation & transitions
skills/bmad-story-automator/src/story_automator/core/state_validation.py, skills/bmad-story-automator/src/story_automator/commands/orchestrator_state.py, tests/test_state_validation.py
Centralized state field validation, VALID_STATUSES/ALLOWED_STATUS_TRANSITIONS, status-transition checks, --set parsing helpers, status_transition_error_payload with redaction, and structured/legacy-compatible response shaping.
Session-state diagnostics
skills/bmad-story-automator/src/story_automator/core/session_state.py, skills/bmad-story-automator/src/story_automator/core/tmux_runtime.py, tests/test_tmux_runtime.py
Typed session-state loader with diagnostic categories (missing/unreadable/invalid/schema mismatch); load_session_state() preserves legacy {} behavior; helper to serialize session-state issue for monitor-session --json.
Agent plan validation & resolver
skills/bmad-story-automator/src/story_automator/core/agent_plan.py, tests/test_agent_plan.py
Validates complexity and agents-plan payloads, loaders for embedded JSON blocks, build/resolve APIs, resolution-time validation for (story,task), and standardized agent_plan_error shapes.
AgentConfig frontmatter parser
skills/bmad-story-automator/src/story_automator/core/agent_config_frontmatter.py
Parses agentConfig: frontmatter (top-level only) supporting inline maps and two-space indentation blocks; validates formatting and returns normalized dict.
Agent configuration core
skills/bmad-story-automator/src/story_automator/core/agent_config.py
Centralized agent-config parsing/validation, deterministic frontmatter rendering, AGENT_TASKS/AGENT_COMPLEXITY_LEVELS, and delegation wrappers to agent_plan (including exposure of AgentPlanInputError).
Monitoring emit helper
skills/bmad-story-automator/src/story_automator/core/monitoring.py
Adds emit_monitor_result() to emit session.lifecycle.result events and print JSON or CSV, optionally including structuredIssues.
CLI input & state command wiring
skills/bmad-story-automator/src/story_automator/cli.py, skills/bmad-story-automator/src/story_automator/commands/state.py
Adds file existence/JSON decode checks for parse-story, delegates cmd_validate_state to core validators and returns state_validation_payload, and uses render_agent_config_frontmatter in cmd_build_state_doc.
Agent config command helper
skills/bmad-story-automator/src/story_automator/commands/agent_config_cmd.py
Adds _load_presets_or_report() to standardize preset file read/parse error responses (JSON decode, OSError, UnicodeDecodeError).
Orchestrator CLI wiring & agents/retro refactor
skills/bmad-story-automator/src/story_automator/commands/orchestrator.py, skills/bmad-story-automator/src/story_automator/commands/orchestrator_epic_agents.py
Wires state-update to new state_update_action, standardizes verifier exception payloads, modernizes agents-build/agents-resolve/retro_agent to use core agent_plan/agent_config APIs, and emits policy load/decision events.
Orchestrator parse refactor
skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py
Uses shared parse_contracts and diagnostics helpers, standardizes structured failures for parse/sub-agent outputs, and emits parse-stage diagnostic events.
Validate-story creation updates
skills/bmad-story-automator/src/story_automator/commands/validate_story_creation.py
Emits structuredIssues for check failures, redacts override reasons, and annotates field identifiers in error diagnostics.
Tmux wrapper + monitor-session updates
skills/bmad-story-automator/src/story_automator/commands/tmux.py, skills/bmad-story-automator/src/story_automator/core/tmux_runtime.py, skills/bmad-story-automator/data/tmux-commands.md
Adds project_hash to session filtering/naming, documents kill-all --all-projects, refactors cmd_monitor_session to use emit_monitor_result() and precompute session-state issues for JSON output, and simplifies argument parsing.
Monitor verification helpers
skills/bmad-story-automator/src/story_automator/commands/tmux_monitor.py
Adds parse_monitor_* option helpers and verify_monitor_completion to resolve and run success verifiers with structured failure metadata.
Comprehensive tests and E2E
tests/*.py
Adds/extends unit and E2E tests covering diagnostics serialization/redaction, parse contract validation, state validation/transitions, session-state diagnostics, agent-plan validation and CLI flows, retro-agent frontmatter parsing, tmux runtime/session tests, and CLI contract tests.
Docs and operational updates
docs/*, skills/bmad-story-automator/data/*
Adds observability-validation phase plans, CLI reference updates documenting structuredIssues/issueCount, troubleshooting for malformed session state, crash-recovery patterns, handoff logs, implementation notes, and step/workflow docs.
Build/smoke test and tests pkg
scripts/smoke-test.sh, tests/__init__.py
Smoke test packs npm tarball into TMP_DIR and sets PACK_TARBALL accordingly; tests/__init__.py initialized.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

"I'm a rabbit who audits logs at night,
I tuck each issue so the output's right,
With hashes, redaction, and structured light,
Diagnostics hop forth—soft and bright. 🐰"

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch bma-d/e2e-tests

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (1)
skills/bmad-story-automator/data/tmux-commands.md (1)

39-70: ⚡ Quick win

Prefer helper-based project-slug/project-hash commands over repeated inline hash scripts.

Line 39-Line 70 duplicates slug/hash derivation logic in multiple snippets. Using helper subcommands in docs reduces drift when naming rules evolve.

Suggested doc patch
-project_slug=$(basename "$PWD" | tr '[:upper:]' '[:lower:]' | tr -cd '[:alnum:]' | cut -c1-8)
-project_hash=$(python3 - <<'PY'
-import hashlib, pathlib
-print(hashlib.md5(str(pathlib.Path.cwd().resolve()).encode(), usedforsecurity=False).hexdigest()[:8])
-PY
-)
+project_slug=$("$scripts" tmux-wrapper project-slug)
+project_hash=$("$scripts" tmux-wrapper project-hash)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/bmad-story-automator/data/tmux-commands.md` around lines 39 - 70,
Replace the repeated inline derivation blocks that set project_slug,
project_hash, and session_name with calls to small helper commands or shell
functions (e.g., a get_project_slug and get_project_hash helper or single
sa-project-id command) and update every snippet that currently contains the
Python/hash and basename pipelines (the blocks that assign project_slug and
project_hash and the session_name usage) to invoke those helpers instead; ensure
the tmux examples (the tmux list-sessions grep and tmux list-sessions -F ... |
xargs kill-session lines) read helpers' output rather than re-duplicating the
scripts so future naming rules change in one place.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@skills/bmad-story-automator/data/crash-recovery.md`:
- Around line 24-28: Add the missing diagnostic key
session_state.unexpected_schema_version to the malformed-state list described
for monitor-session --json so it is treated the same as
session_state.invalid_json, session_state.invalid_type, and
session_state.unreadable (i.e., classify it as a runtime-state diagnostic and
include it in the guidance to verify workflow truth from story files and
sprint-status.yaml before retrying).

In
`@skills/bmad-story-automator/src/story_automator/commands/orchestrator_epic_agents.py`:
- Around line 122-126: The catch-all except maps every build_agents_file failure
to an "invalid_agent_config" pointing at "config-json", which misattributes
failures from state-file, complexity-file, or output; change the except handler
to inspect the exception or context returned by build_agents_file (or re-raise a
custom exception carrying the failing input) and call agent_plan_error with the
correct field (e.g., "state-file", "complexity-file", or "output") and include
the actual failing input path in issues_from_exception so diagnostics preserve
the real failing file; update the logic around build_agents_file,
agent_plan_error, and issues_from_exception to propagate and log the true source
of the error rather than always using "config-json".
- Around line 316-317: The default for "defaultFallback" is currently the string
"false", which can be serialized as a literal value and break resolve_agent();
change the default to a boolean False by using config.get("defaultFallback",
False) and, if needed, normalize incoming values (e.g., convert "false"/"true"
strings to booleans) so resolve_agent() always receives a real boolean or a
proper agent identifier rather than the string "false".

In
`@skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py`:
- Around line 41-46: The current except block lumps
FileNotFoundError/json.JSONDecodeError/ValueError/PolicyError into a
parse-contract error with field "parse.schemaPath"; separate those so parse
failures still call print_json(parse_failure_payload("parse_contract_invalid",
issues_from_exception(..., source="parse-contract", field="parse.schemaPath")))
for parsing-related exceptions (FileNotFoundError, json.JSONDecodeError,
ValueError) but handle PolicyError and state/runtime errors coming from
load_runtime_policy and step_contract with their own payloads (e.g.,
source="runtime-policy" or "step-contract" and an appropriate field like
"runtime.policy" or "state.file"), i.e., add distinct except clauses for
PolicyError and any state-related FileNotFoundError raised by
load_runtime_policy/step_contract and call
print_json(parse_failure_payload(...)) with the correct source/field for those
exceptions instead of reusing "parse.schemaPath".

In `@skills/bmad-story-automator/src/story_automator/core/agent_plan.py`:
- Around line 30-33: The code currently uses "complexity =
story.get('complexity') or {}" which coerces falsy but present values (e.g., "",
0, False) into missing; change to "complexity = story.get('complexity')" and
adjust the subsequent check to only validate when complexity is not None (e.g.,
"if complexity is not None and not isinstance(complexity, dict):
issues.append(_issue(...))"), so explicit invalid falsy values trigger the type
diagnostic via the existing _issue call instead of being treated as absent.

In `@skills/bmad-story-automator/src/story_automator/core/diagnostics.py`:
- Around line 80-87: The DiagnosticIssue created in issues_from_exception stores
raw str(exc) into message which bypasses redaction; change issues_from_exception
so that the same redaction/transformation used for DiagnosticIssue.actual is
applied to the message (e.g., call the shared redactor/serializer or helper used
elsewhere when building DiagnosticIssue.actual) so both actual and message
contain redacted output; update the DiagnosticIssue construction in
issues_from_exception to pass the redacted version for message (and ensure
message falls back to the exception class name if redaction yields empty).

In `@skills/bmad-story-automator/src/story_automator/core/session_state.py`:
- Around line 25-28: The except blocks that currently catch (OSError,
json.JSONDecodeError) around the read_text(target) + json.loads calls must also
catch UnicodeDecodeError so text-decoding failures fall back to {} instead of
bubbling up; update the exception tuples (the try that assigns raw =
json.loads(read_text(target)) and the similar try/except later) to include
UnicodeDecodeError (or add a separate except UnicodeDecodeError) so read_text,
json.loads and the target handling in session_state.py handle decoding errors
the same as unreadable files.

In `@skills/bmad-story-automator/src/story_automator/core/state_validation.py`:
- Around line 115-119: The function has_runtime_command_config treats
whitespace-only strings as valid; update it so that string values are accepted
only if ai_command is a non-empty string after stripping whitespace, and for
list values accept only if the list contains at least one non-empty (after
strip) string element; otherwise fall back to calling
_has_agent_config_block(frontmatter). Modify has_runtime_command_config to
perform isinstance checks for str and list and use .strip() on string elements
to reject whitespace-only entries.

In `@skills/bmad-story-automator/steps-c/step-02b-preflight-finalize.md`:
- Around line 77-82: The two separate date -u calls can yield inconsistent
timestamps; capture a single timestamp into a variable (e.g., TS_NOW) and reuse
it for the orchestrator-helper state-update --set lastUpdated and the echo
append to "{outputFile}" so both entries use the exact same ISO UTC time; update
the commands around the state-update invocation (state-update) and the echo line
that appends to "{outputFile}" to reference that variable instead of calling
date twice.

---

Nitpick comments:
In `@skills/bmad-story-automator/data/tmux-commands.md`:
- Around line 39-70: Replace the repeated inline derivation blocks that set
project_slug, project_hash, and session_name with calls to small helper commands
or shell functions (e.g., a get_project_slug and get_project_hash helper or
single sa-project-id command) and update every snippet that currently contains
the Python/hash and basename pipelines (the blocks that assign project_slug and
project_hash and the session_name usage) to invoke those helpers instead; ensure
the tmux examples (the tmux list-sessions grep and tmux list-sessions -F ... |
xargs kill-session lines) read helpers' output rather than re-duplicating the
scripts so future naming rules change in one place.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: aac36e51-5224-48f0-bef9-b237f2f801ff

📥 Commits

Reviewing files that changed from the base of the PR and between 33601b9 and 0f3e547.

📒 Files selected for processing (46)
  • docs/agents-and-monitoring.md
  • docs/cli-reference.md
  • docs/how-it-works.md
  • docs/plans/observability-validation/00-baseline-and-plan-reconciliation.md
  • docs/plans/observability-validation/01-diagnostics-contract.md
  • docs/plans/observability-validation/02-state-validation-and-transitions.md
  • docs/plans/observability-validation/03-parser-and-contract-boundaries.md
  • docs/plans/observability-validation/04-agent-complexity-and-story-boundaries.md
  • docs/plans/observability-validation/05-session-runtime-diagnostics.md
  • docs/plans/observability-validation/06-e2e-docs-and-release-readiness.md
  • docs/plans/observability-validation/README.md
  • docs/plans/observability-validation/TODO.md
  • docs/plans/observability-validation/handoff-log-archive-phase-00-04.md
  • docs/plans/observability-validation/handoff-log.md
  • docs/plans/observability-validation/implementation-notes.md
  • docs/state-and-resume.md
  • docs/troubleshooting.md
  • scripts/smoke-test.sh
  • skills/bmad-story-automator/data/crash-recovery.md
  • skills/bmad-story-automator/data/monitoring-pattern.md
  • skills/bmad-story-automator/data/tmux-commands.md
  • skills/bmad-story-automator/src/story_automator/cli.py
  • skills/bmad-story-automator/src/story_automator/commands/agent_config_cmd.py
  • skills/bmad-story-automator/src/story_automator/commands/orchestrator.py
  • skills/bmad-story-automator/src/story_automator/commands/orchestrator_epic_agents.py
  • skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py
  • skills/bmad-story-automator/src/story_automator/commands/state.py
  • skills/bmad-story-automator/src/story_automator/commands/tmux.py
  • skills/bmad-story-automator/src/story_automator/core/agent_plan.py
  • skills/bmad-story-automator/src/story_automator/core/diagnostics.py
  • skills/bmad-story-automator/src/story_automator/core/parse_contracts.py
  • skills/bmad-story-automator/src/story_automator/core/review_verify.py
  • skills/bmad-story-automator/src/story_automator/core/session_state.py
  • skills/bmad-story-automator/src/story_automator/core/state_validation.py
  • skills/bmad-story-automator/src/story_automator/core/tmux_runtime.py
  • skills/bmad-story-automator/steps-c/step-02b-preflight-finalize.md
  • skills/bmad-story-automator/steps-v/step-v-01-check.md
  • tests/__init__.py
  • tests/test_agent_plan.py
  • tests/test_cli_contracts.py
  • tests/test_diagnostics.py
  • tests/test_diagnostics_e2e.py
  • tests/test_orchestrator_parse.py
  • tests/test_state_validation.py
  • tests/test_success_verifiers.py
  • tests/test_tmux_runtime.py

Comment thread skills/bmad-story-automator/data/crash-recovery.md
Comment thread skills/bmad-story-automator/src/story_automator/core/agent_plan.py Outdated
Comment thread skills/bmad-story-automator/src/story_automator/core/diagnostics.py
Comment thread skills/bmad-story-automator/src/story_automator/core/state_validation.py Outdated
Comment thread skills/bmad-story-automator/steps-c/step-02b-preflight-finalize.md
@bma-d
Copy link
Copy Markdown
Collaborator Author

bma-d commented May 22, 2026

@coderabbitai review please

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 22, 2026

@bma-d Sure! I'll start a full review of PR #21 right away.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py (1)

38-59: 💤 Low value

Message-based error routing may be fragile.

The routing at lines 40-41 depends on specific substrings in exception messages ("parse schema" or "policy data file missing"). If the exception message wording changes upstream, the error will be misclassified as runtime_policy_invalid instead of parse_contract_invalid.

Consider using distinct exception types or error codes rather than message inspection if the upstream module can be modified. However, if this is a deliberate pragmatic choice and the upstream messages are stable, this is acceptable.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py`
around lines 38 - 59, The current PolicyError handling in orchestrator_parse.py
relies on fragile substring checks of str(exc) to decide between
parse_contract_invalid and runtime_policy_invalid; replace this brittle logic by
using a concrete discriminator on the exception (e.g. an explicit exception
subclass or a stable error code/attribute) instead: either have the upstream
raise/attach a ParseSchemaError or an error.code like
"parse_schema"/"policy_data_missing", or add a local classify_policy_error(exc)
helper that inspects exc.code or isinstance(exc, ParseSchemaError) and returns
the correct category; then route to print_json(parse_failure_payload(...)) using
that discriminator (keeping the same issues_from_exception(...) usage) so
decision logic no longer depends on message text.
skills/bmad-story-automator/src/story_automator/core/agent_plan.py (1)

229-230: 💤 Low value

Unused call to normalize_fallback_value.

The call on line 230 has no side effect and its return value is discarded. normalize_fallback_value performs string normalization but doesn't validate or raise exceptions, so this call serves no purpose.

Suggested fix
     if not (fallback is False or isinstance(fallback, str)):
         issues.append(_issue("invalid_type", f"{task_field}.fallback", "false or string", fallback, f"{task} fallback must be false or a string"))
-    elif isinstance(fallback, str):
-        normalize_fallback_value(fallback)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/bmad-story-automator/src/story_automator/core/agent_plan.py` around
lines 229 - 230, The call to normalize_fallback_value(fallback) is unused
because its return value is discarded; either remove the call or use its result.
Update the branch handling (the elif that checks isinstance(fallback, str)) to
assign the normalized string back (e.g., fallback =
normalize_fallback_value(fallback)) or otherwise pass the normalized value into
the downstream code that consumes fallback so the normalization has an effect;
ensure you reference normalize_fallback_value and the local variable fallback in
your change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In
`@skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py`:
- Around line 38-59: The current PolicyError handling in orchestrator_parse.py
relies on fragile substring checks of str(exc) to decide between
parse_contract_invalid and runtime_policy_invalid; replace this brittle logic by
using a concrete discriminator on the exception (e.g. an explicit exception
subclass or a stable error code/attribute) instead: either have the upstream
raise/attach a ParseSchemaError or an error.code like
"parse_schema"/"policy_data_missing", or add a local classify_policy_error(exc)
helper that inspects exc.code or isinstance(exc, ParseSchemaError) and returns
the correct category; then route to print_json(parse_failure_payload(...)) using
that discriminator (keeping the same issues_from_exception(...) usage) so
decision logic no longer depends on message text.

In `@skills/bmad-story-automator/src/story_automator/core/agent_plan.py`:
- Around line 229-230: The call to normalize_fallback_value(fallback) is unused
because its return value is discarded; either remove the call or use its result.
Update the branch handling (the elif that checks isinstance(fallback, str)) to
assign the normalized string back (e.g., fallback =
normalize_fallback_value(fallback)) or otherwise pass the normalized value into
the downstream code that consumes fallback so the normalization has an effect;
ensure you reference normalize_fallback_value and the local variable fallback in
your change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: da7decd1-ae17-479a-8468-dc05dcfb3ad8

📥 Commits

Reviewing files that changed from the base of the PR and between 0f3e547 and 385c22a.

📒 Files selected for processing (20)
  • skills/bmad-story-automator/data/crash-recovery.md
  • skills/bmad-story-automator/data/tmux-commands.md
  • skills/bmad-story-automator/src/story_automator/commands/orchestrator_epic_agents.py
  • skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py
  • skills/bmad-story-automator/src/story_automator/commands/state.py
  • skills/bmad-story-automator/src/story_automator/commands/tmux.py
  • skills/bmad-story-automator/src/story_automator/core/agent_config.py
  • skills/bmad-story-automator/src/story_automator/core/agent_config_frontmatter.py
  • skills/bmad-story-automator/src/story_automator/core/agent_plan.py
  • skills/bmad-story-automator/src/story_automator/core/diagnostics.py
  • skills/bmad-story-automator/src/story_automator/core/monitoring.py
  • skills/bmad-story-automator/src/story_automator/core/session_state.py
  • skills/bmad-story-automator/src/story_automator/core/state_validation.py
  • skills/bmad-story-automator/steps-c/step-02b-preflight-finalize.md
  • tests/test_agent_plan.py
  • tests/test_diagnostics.py
  • tests/test_orchestrator_parse.py
  • tests/test_retro_agent.py
  • tests/test_state_validation.py
  • tests/test_tmux_runtime.py
✅ Files skipped from review due to trivial changes (1)
  • skills/bmad-story-automator/data/crash-recovery.md

@bma-d bma-d marked this pull request as ready for review May 22, 2026 10:54
@bma-d
Copy link
Copy Markdown
Collaborator Author

bma-d commented May 22, 2026

@augmentcode review please

@augmentcode
Copy link
Copy Markdown

augmentcode Bot commented May 22, 2026

🤖 Augment PR Summary

Summary: This PR adds structured, field-addressable diagnostics across the story automator's parser, state, agent-plan, and session boundaries while preserving backward-compatible CLI output. Tracked under issue #5.

Changes:

  • Added a shared diagnostics contract (DiagnosticIssue/DiagnosticEvent) with serialization, redaction, and opt-in JSONL event emission via STORY_AUTOMATOR_DIAGNOSTICS_FILE
  • Added structured state validation with field-specific diagnostics and allowed status transition guards for state-update
  • Added parse contract schema leaf validation before sub-agent execution, with field-path diagnostics for malformed parse payloads
  • Added complexity and agents-plan payload validators; agents-build/agents-resolve now fail early with structuredIssues
  • Added diagnostic-aware session-state loading; monitor-session --json surfaces structuredIssues for malformed state
  • Extracted orchestrator_state.py, tmux_monitor.py, session_state.py, agent_config_frontmatter.py, agent_plan.py, parse_contracts.py, orchestration_events.py, and monitoring.py as focused modules
  • Restricted frontmatter replacements to the frontmatter section only; restored kill-all default to all sessions; fixed --cycle flag parsing
  • Added ~1600 lines of new tests across 7 test modules covering diagnostics, state validation, agent plans, CLI contracts, and E2E scenarios

Technical Notes: All changes are additive—legacy fields (issues, reason, error) are preserved alongside new structuredIssues. Successful parse and CSV outputs remain unchanged. 263→310+ tests pass locally with npm run verify.

🤖 Was this summary useful? React with 👍 or 👎

Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 3 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread skills/bmad-story-automator/src/story_automator/core/parse_contracts.py Outdated
Comment thread skills/bmad-story-automator/src/story_automator/commands/state.py Outdated
Comment thread skills/bmad-story-automator/src/story_automator/cli.py
@bma-d
Copy link
Copy Markdown
Collaborator Author

bma-d commented May 22, 2026

@augmentcode review please
@coderabbitai review please

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 22, 2026

@bma-d Sure! I'll kick off a full review of PR #21 right away.

✅ Actions performed

Full review triggered.

@augmentcode
Copy link
Copy Markdown

augmentcode Bot commented May 22, 2026

This pull request is abnormally large and would use a significant amount of tokens to review. If you still wish to review it, comment "augment review" and we will review it.

@bma-d
Copy link
Copy Markdown
Collaborator Author

bma-d commented May 23, 2026

augment review

Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 5 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread skills/bmad-story-automator/src/story_automator/core/parse_contracts.py Outdated
Comment thread skills/bmad-story-automator/src/story_automator/core/diagnostics.py Outdated
Comment thread skills/bmad-story-automator/src/story_automator/core/agent_plan.py Outdated
Comment thread skills/bmad-story-automator/src/story_automator/core/state_validation.py Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
skills/bmad-story-automator/src/story_automator/core/diagnostics.py (1)

65-73: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Redact event-level message before serialization.

Line 69 writes event.message unredacted. If any caller passes dynamic error text, this can leak sensitive/path data to diagnostics logs.

🔧 Proposed fix
 def serialize_event(event: DiagnosticEvent) -> dict[str, Any]:
     return {
         "name": event.name,
         "source": event.source,
-        "message": event.message,
+        "message": redact_actual(event.message),
         "severity": event.severity,
         "issues": serialize_issues(event.issues),
         "context": redact_actual(event.context),
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/bmad-story-automator/src/story_automator/core/diagnostics.py` around
lines 65 - 73, serialize_event currently includes event.message unredacted which
can leak sensitive data; update serialize_event (in the serialize_event
function) to pass the message through the existing redact_actual function (e.g.,
use redact_actual(event.message)) before putting it in the returned dict,
ensuring the "message" field is redacted like "context" and keeping other fields
(name, source, severity, issues) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_diagnostics.py`:
- Around line 4-5: Add an explicit import for the unittest.mock module so the
test file is self-contained; specifically, at the top of
tests/test_diagnostics.py add "import unittest.mock" (or otherwise ensure
unittest.mock is imported) because later code calls
unittest.mock.patch.dict(...) — update the imports block that currently contains
"import tempfile" and "import unittest" to include the explicit unittest.mock
import.

---

Outside diff comments:
In `@skills/bmad-story-automator/src/story_automator/core/diagnostics.py`:
- Around line 65-73: serialize_event currently includes event.message unredacted
which can leak sensitive data; update serialize_event (in the serialize_event
function) to pass the message through the existing redact_actual function (e.g.,
use redact_actual(event.message)) before putting it in the returned dict,
ensuring the "message" field is redacted like "context" and keeping other fields
(name, source, severity, issues) unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 65372596-1476-4c8f-9f61-ae663e010077

📥 Commits

Reviewing files that changed from the base of the PR and between 385c22a and bf2740a.

📒 Files selected for processing (31)
  • docs/cli-reference.md
  • docs/plans/observability-validation/07-review-remediation.md
  • docs/plans/observability-validation/08-diagnostic-redaction-completion.md
  • docs/plans/observability-validation/README.md
  • docs/plans/observability-validation/TODO.md
  • docs/plans/observability-validation/TODO/phase-08.md
  • docs/plans/observability-validation/gate-map.md
  • docs/plans/observability-validation/handoff-log.md
  • docs/plans/observability-validation/implementation-notes.md
  • skills/bmad-story-automator/src/story_automator/cli.py
  • skills/bmad-story-automator/src/story_automator/commands/orchestrator.py
  • skills/bmad-story-automator/src/story_automator/commands/orchestrator_epic_agents.py
  • skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py
  • skills/bmad-story-automator/src/story_automator/commands/state.py
  • skills/bmad-story-automator/src/story_automator/commands/tmux.py
  • skills/bmad-story-automator/src/story_automator/commands/validate_story_creation.py
  • skills/bmad-story-automator/src/story_automator/core/agent_plan.py
  • skills/bmad-story-automator/src/story_automator/core/diagnostics.py
  • skills/bmad-story-automator/src/story_automator/core/monitoring.py
  • skills/bmad-story-automator/src/story_automator/core/orchestration_events.py
  • skills/bmad-story-automator/src/story_automator/core/parse_contracts.py
  • skills/bmad-story-automator/src/story_automator/core/runtime_policy.py
  • skills/bmad-story-automator/src/story_automator/core/state_validation.py
  • tests/test_agent_plan.py
  • tests/test_cli_contracts.py
  • tests/test_diagnostics.py
  • tests/test_diagnostics_e2e.py
  • tests/test_orchestrator_parse.py
  • tests/test_state_policy_metadata.py
  • tests/test_state_validation.py
  • tests/test_success_verifiers.py
✅ Files skipped from review due to trivial changes (6)
  • docs/plans/observability-validation/07-review-remediation.md
  • docs/plans/observability-validation/08-diagnostic-redaction-completion.md
  • docs/plans/observability-validation/gate-map.md
  • docs/plans/observability-validation/README.md
  • docs/plans/observability-validation/TODO.md
  • docs/plans/observability-validation/implementation-notes.md

Comment thread tests/test_diagnostics.py
@bma-d bma-d force-pushed the bma-d/e2e-tests branch from fddf414 to cb3f5dd Compare May 25, 2026 09:21
@bma-d
Copy link
Copy Markdown
Collaborator Author

bma-d commented May 25, 2026

augment review

@bma-d
Copy link
Copy Markdown
Collaborator Author

bma-d commented May 25, 2026

@coderabbitai review please

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 25, 2026

@bma-d Sure! I'll kick off a full review of PR #21 right away.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@skills/bmad-story-automator/src/story_automator/commands/agent_config_cmd.py`:
- Around line 96-104: The wrapper _load_presets_or_report currently catches
json.JSONDecodeError and OSError but misses UnicodeDecodeError (a separate
exception) so UTF-8 decode failures aren’t returned as the structured
presets_file_error; update _load_presets_or_report to catch UnicodeDecodeError
(e.g., add an except UnicodeDecodeError as exc:) and call print_json({"ok":
False, "error": "presets_file_error", "reason": str(exc)}) and return None (or
alternatively combine it with the OSError handling) so decode errors from
load_presets_file are reported consistently.

In `@skills/bmad-story-automator/src/story_automator/commands/orchestrator.py`:
- Around line 305-308: The current re.subn call (using
re.subn(rf"(?m)^{re.escape(key)}:.*$", ... ) on variable text) updates the
entire markdown, causing body lines to be overwritten; restrict replacements to
the YAML frontmatter only by extracting the frontmatter block (delimited by
leading and trailing '---' or '---\\n'), run the re.subn and updated.append(key)
against that frontmatter string (using the existing key/value logic and
re.subn), then reassemble text = frontmatter + rest_of_document; update
references to text, key, value, updated and the re.subn call in orchestrator.py
so only the frontmatter is modified.

In
`@skills/bmad-story-automator/src/story_automator/core/agent_config_frontmatter.py`:
- Around line 10-12: The code currently uses stripped = raw_line.strip() then
stripped.startswith("agentConfig:"), which treats indented keys like " 
agentConfig:" as top-level; change the detection to be left-whitespace sensitive
by checking raw_line.startswith("agentConfig:") (i.e., do not remove leading
whitespace) before calling _extract_agent_config_block(frontmatter.splitlines(),
index), so only truly top-level unindented "agentConfig:" lines trigger
extraction; update the raw_line handling accordingly while keeping the call to
_extract_agent_config_block with frontmatter and index.
- Around line 38-45: The loop over enumerate(lines) in
agent_config_frontmatter.py allows lines like "  \tkey: value" because it only
checks for tabs at line start earlier; update the loop that calls
_strip_inline_yaml_comment and computes indent to also reject any tabs in the
indentation portion (or anywhere in the line after stripping inline comments).
Concretely, after computing indent = len(line) - len(line.lstrip(" ")), assert
that line[:indent] contains only spaces (no '\t') and/or if '\t' in line: raise
the same ValueError (or a clearer one) so lines with tabs after spaces are
rejected consistently by this function.

In `@skills/bmad-story-automator/src/story_automator/core/diagnostics.py`:
- Around line 21-25: The current regexes ABSOLUTE_PATH_WITH_EXT_RE and
ABSOLUTE_PATH_RE only match Unix-style paths and must be extended to also detect
Windows absolute paths (e.g., "C:\Users\..." or "C:/Users/...") to avoid leaking
local Windows paths; update the definitions for ABSOLUTE_PATH_WITH_EXT_RE and
ABSOLUTE_PATH_RE to include alternative branches that match a drive letter plus
colon and backslash/forward slash patterns and ensure backslashes are correctly
escaped in the raw string, then apply the same updated regexes where those
symbols are redefined/used later (the other occurrence referenced in the
comment) so both the with-extension and generic absolute-path redaction cover
Windows paths.

In `@skills/bmad-story-automator/src/story_automator/core/state_validation.py`:
- Around line 145-148: The parsed "--set" handler currently returns the raw key
which allows whitespace-padded keys to bypass validation; update the block
around "key, value = raw.split('=', 1)" to normalize the key by calling key =
key.strip() (and optionally strip value as well) before returning so the
function (and downstream validation in state transition checks) receives a
trimmed key; keep the existing error branch that calls
state_update_argument_error_payload(raw).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d9bbb554-a626-414a-8257-d556cc99fa2e

📥 Commits

Reviewing files that changed from the base of the PR and between bf2740a and cb3f5dd.

📒 Files selected for processing (58)
  • docs/agents-and-monitoring.md
  • docs/cli-reference.md
  • docs/how-it-works.md
  • docs/plans/observability-validation/00-baseline-and-plan-reconciliation.md
  • docs/plans/observability-validation/01-diagnostics-contract.md
  • docs/plans/observability-validation/02-state-validation-and-transitions.md
  • docs/plans/observability-validation/03-parser-and-contract-boundaries.md
  • docs/plans/observability-validation/04-agent-complexity-and-story-boundaries.md
  • docs/plans/observability-validation/05-session-runtime-diagnostics.md
  • docs/plans/observability-validation/06-e2e-docs-and-release-readiness.md
  • docs/plans/observability-validation/07-review-remediation.md
  • docs/plans/observability-validation/08-diagnostic-redaction-completion.md
  • docs/plans/observability-validation/README.md
  • docs/plans/observability-validation/TODO.md
  • docs/plans/observability-validation/TODO/phase-08.md
  • docs/plans/observability-validation/gate-map.md
  • docs/plans/observability-validation/handoff-log-archive-phase-00-04.md
  • docs/plans/observability-validation/handoff-log.md
  • docs/plans/observability-validation/implementation-notes.md
  • docs/state-and-resume.md
  • docs/troubleshooting.md
  • scripts/smoke-test.sh
  • skills/bmad-story-automator/data/crash-recovery.md
  • skills/bmad-story-automator/data/monitoring-pattern.md
  • skills/bmad-story-automator/data/tmux-commands.md
  • skills/bmad-story-automator/src/story_automator/cli.py
  • skills/bmad-story-automator/src/story_automator/commands/agent_config_cmd.py
  • skills/bmad-story-automator/src/story_automator/commands/orchestrator.py
  • skills/bmad-story-automator/src/story_automator/commands/orchestrator_epic_agents.py
  • skills/bmad-story-automator/src/story_automator/commands/orchestrator_parse.py
  • skills/bmad-story-automator/src/story_automator/commands/state.py
  • skills/bmad-story-automator/src/story_automator/commands/tmux.py
  • skills/bmad-story-automator/src/story_automator/commands/validate_story_creation.py
  • skills/bmad-story-automator/src/story_automator/core/agent_config.py
  • skills/bmad-story-automator/src/story_automator/core/agent_config_frontmatter.py
  • skills/bmad-story-automator/src/story_automator/core/agent_plan.py
  • skills/bmad-story-automator/src/story_automator/core/diagnostics.py
  • skills/bmad-story-automator/src/story_automator/core/monitoring.py
  • skills/bmad-story-automator/src/story_automator/core/orchestration_events.py
  • skills/bmad-story-automator/src/story_automator/core/parse_contracts.py
  • skills/bmad-story-automator/src/story_automator/core/review_verify.py
  • skills/bmad-story-automator/src/story_automator/core/runtime_policy.py
  • skills/bmad-story-automator/src/story_automator/core/session_state.py
  • skills/bmad-story-automator/src/story_automator/core/state_validation.py
  • skills/bmad-story-automator/src/story_automator/core/tmux_runtime.py
  • skills/bmad-story-automator/steps-c/step-02b-preflight-finalize.md
  • skills/bmad-story-automator/steps-v/step-v-01-check.md
  • tests/__init__.py
  • tests/test_agent_plan.py
  • tests/test_cli_contracts.py
  • tests/test_diagnostics.py
  • tests/test_diagnostics_e2e.py
  • tests/test_orchestrator_parse.py
  • tests/test_retro_agent.py
  • tests/test_state_policy_metadata.py
  • tests/test_state_validation.py
  • tests/test_success_verifiers.py
  • tests/test_tmux_runtime.py
✅ Files skipped from review due to trivial changes (18)
  • docs/plans/observability-validation/gate-map.md
  • docs/state-and-resume.md
  • skills/bmad-story-automator/data/monitoring-pattern.md
  • docs/plans/observability-validation/03-parser-and-contract-boundaries.md
  • docs/agents-and-monitoring.md
  • docs/plans/observability-validation/08-diagnostic-redaction-completion.md
  • docs/plans/observability-validation/00-baseline-and-plan-reconciliation.md
  • docs/plans/observability-validation/07-review-remediation.md
  • docs/cli-reference.md
  • docs/plans/observability-validation/README.md
  • docs/troubleshooting.md
  • tests/init.py
  • docs/how-it-works.md
  • docs/plans/observability-validation/06-e2e-docs-and-release-readiness.md
  • skills/bmad-story-automator/data/crash-recovery.md
  • docs/plans/observability-validation/04-agent-complexity-and-story-boundaries.md
  • docs/plans/observability-validation/02-state-validation-and-transitions.md
  • docs/plans/observability-validation/TODO.md

Comment thread skills/bmad-story-automator/src/story_automator/commands/orchestrator.py Outdated
Comment thread skills/bmad-story-automator/src/story_automator/core/agent_config_frontmatter.py Outdated
Comment thread skills/bmad-story-automator/src/story_automator/core/diagnostics.py
Comment thread skills/bmad-story-automator/src/story_automator/core/state_validation.py Outdated
@bma-d
Copy link
Copy Markdown
Collaborator Author

bma-d commented May 25, 2026

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 25, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@bma-d
Copy link
Copy Markdown
Collaborator Author

bma-d commented May 25, 2026

@augmentcode review

Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 5 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread skills/bmad-story-automator/src/story_automator/core/state_validation.py Outdated
Comment thread skills/bmad-story-automator/src/story_automator/core/agent_plan.py
Comment thread skills/bmad-story-automator/src/story_automator/commands/tmux.py Outdated
@bma-d bma-d requested a review from alexeyv May 26, 2026 08:36
@bma-d
Copy link
Copy Markdown
Collaborator Author

bma-d commented May 26, 2026

@augmentcode review
@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 26, 2026

@bma-d Sure, I'll kick off a review of PR #21 right away!

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 5 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread skills/bmad-story-automator/src/story_automator/core/agent_plan.py
@bma-d
Copy link
Copy Markdown
Collaborator Author

bma-d commented May 26, 2026

@augmentcode review
@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 26, 2026

@bma-d Sure! I'll kick off a review of PR #21 right away.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 6 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

print_json(updates)
return 1

pending_status = str(fields.get("status") or "")
Copy link
Copy Markdown

@augmentcode augmentcode Bot May 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse_simple_frontmatter reads from the full document (not just the frontmatter section), so fields.get("status") could pick up a status: line from the body if the body happens to start with one. Consider parsing from the split frontmatter instead of the full text to stay consistent with the replacement scope restricted by _split_frontmatter.

Severity: medium

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

complexity, issue = _story_complexity(story, field)
if issue:
raise AgentPlanInputError("complexity-file", ValueError(legacy_issue_message(issue)))
return str(complexity.get("level") or "medium").strip().lower() or "medium"
Copy link
Copy Markdown

@augmentcode augmentcode Bot May 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

complexity.get("level") or "medium" coerces explicit falsy level values like "" or 0 to "medium" silently, bypassing the COMPLEXITY_LEVELS check. The preceding validate_complexity_payload does the same coercion on line 43 (or "medium"), so an empty-string level passes validation and defaults to medium without any diagnostic. Consider rejecting empty-string levels explicitly in the validator.

Severity: medium

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

def verifier_exception_payload(reason: str, exc: Exception, *, source: str, field: str = "", **extra: object) -> dict[str, object]:
issues = issues_from_exception(exc, source=source, field=field)
redacted_extra = redact_actual(extra)
return {"verified": False, "reason": reason, "error": redact_actual(str(exc)), **redacted_extra, "structuredIssues": serialize_issues(issues)}
Copy link
Copy Markdown

@augmentcode augmentcode Bot May 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The **redacted_extra spread can silently overwrite the preceding "reason" or "error" keys if the caller passes reason= or error= as extra kwargs, since dict unpacking resolves left-to-right. Consider placing **redacted_extra before the explicit keys, or validating that extra doesn't contain reserved field names.

Severity: medium

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

start = time.time()
last_done = 0
last_total = 0
session_state_issue = monitor_session_state_issue(session, project_root) if json_output else None
Copy link
Copy Markdown

@augmentcode augmentcode Bot May 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

monitor_session_state_issue(session, project_root) reads and parses the session state file eagerly before the poll loop begins. If the file doesn't exist yet (runner hasn't created it), this always returns None, and on the not_found path at line 443 it re-checks. However, if the state file was valid at poll start and becomes corrupted mid-loop, the pre-cached None is used and the re-check at line 443 catches it. This is fine, but the pre-poll I/O is wasted for sessions that complete/crash/stuck without reaching not_found.

Severity: low

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

raw_value = _strip_inline_yaml_comment(raw_value)
if raw_value:
parsed = _parse_scalar(raw_value)
return parsed if isinstance(parsed, dict) else {"agentConfig": parsed}
Copy link
Copy Markdown

@augmentcode augmentcode Bot May 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When agentConfig: has a non-dict inline value (e.g. agentConfig: bad), this returns {"agentConfig": "bad"} which wraps the value under an agentConfig key. Downstream parse_agent_config_json will then see a nested agentConfig key and raise "unexpected nested agentConfig key", which is a confusing error message for what is actually a scalar value on the header line.

Severity: low

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

key, value = raw.split("=", 1)
if not key.strip():
return state_update_argument_error_payload(raw)
return key.strip(), value.strip()
Copy link
Copy Markdown

@augmentcode augmentcode Bot May 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value is .strip()-ped here, but this means --set status= IN_PROGRESS writes status: IN_PROGRESS to frontmatter (stripped). While the key stripping is clearly necessary, stripping the value changes the semantics — a value like --set customInstructions= keep leading spaces would lose its intentional leading/trailing whitespace. Consider stripping only for status transitions and preserving the raw value for non-status updates.

Severity: low

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant