
Add RecordingSubprocessRunner for multi-session scenario recording #672

@Trecek

Summary

Add a RecordingSubprocessRunner that wraps DefaultSubprocessRunner and records each headless session as a numbered cassette using api-simulator's ScenarioRecorder. This is the AutoSkillit-side adapter for the multi-session recording infrastructure added in TalonT-Org/api-simulator#81 and TalonT-Org/api-simulator#82.

Background: api-simulator Scenario Types

The api-simulator package (post-merge of #81–#86) exposes these types from api_simulator.claude:

from api_simulator.claude import ScenarioRecorder, ScenarioPlayer, ScenarioBuilder

# ScenarioRecorder — wraps CassetteRecorder for multi-session recording
recorder = ScenarioRecorder(output_dir="/path/to/scenario", recipe_name="smoke-test")

# Record a headless session (spawns real subprocess under PTY, captures NDJSON)
step_result = recorder.record_step(
    step_name="investigate",      # dispatch key for replay
    tool="run_skill",             # tool type (run_skill, run_cmd, test_check, etc.)
    cmd="claude",                 # command
    args=["--print", "...", "--output-format", "stream-json"],
)
# step_result has: exit_code, stdout, duration_ms, session_dir

# Record a non-session step (run_cmd, test_check, classify_fix — no subprocess capture)
recorder.record_non_session_step(
    step_name="test",
    tool="test_check",
    result_summary={"exit_code": 1, "stderr": "FAILED test_feature.py"},
)

# Finalize — writes scenario.json manifest
scenario = recorder.finalize()

Each record_step call creates a numbered cassette subdirectory: sessions/001_investigate/, sessions/002_implement/, etc. Each subdirectory contains stdout.jsonl, metadata.json, input.json, and session_log.jsonl.
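
The layout described above can be sketched as a small path helper. This is an illustration only: cassette_dir and missing_files are hypothetical names, not part of api-simulator, and the sketch assumes a 1-based index zero-padded to three digits, as the example directory names suggest. Whether non-session steps consume an index is not stated here, so the helper takes the index as an argument.

```python
from __future__ import annotations

from pathlib import Path

# Files the issue lists inside every cassette subdirectory.
CASSETTE_FILES = ("stdout.jsonl", "metadata.json", "input.json", "session_log.jsonl")


def cassette_dir(output_dir: Path, index: int, step_name: str) -> Path:
    """Hypothetical helper: build sessions/<NNN>_<step_name> under output_dir."""
    return output_dir / "sessions" / f"{index:03d}_{step_name}"


def missing_files(cassette: Path) -> list[str]:
    """Return the expected cassette files that are absent from the directory."""
    return [name for name in CASSETTE_FILES if not (cassette / name).is_file()]
```

A check like missing_files could back the acceptance criterion that every recorded session is a complete cassette before Cassette.load() is attempted.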

Scope

1. RecordingSubprocessRunner class

Location: src/autoskillit/execution/recording.py (new file)

Wraps DefaultSubprocessRunner. On each call:

  1. Determines whether this is a run_skill call. The sketch below treats a call as run_skill when pty_mode=True and SCENARIO_STEP_NAME is set; checking for claude in cmd is an alternative signal.
  2. For run_skill calls: delegates to ScenarioRecorder.record_step() instead of the inner runner, then constructs a SubprocessResult from the step result
  3. For non-session calls (run_cmd, test_check): delegates to the inner DefaultSubprocessRunner, then records the result summary via recorder.record_non_session_step()
  4. Reads SCENARIO_STEP_NAME=<step> from the subprocess environment, where run_headless_core injects it (used for replay dispatch keying)

from pathlib import Path
from typing import Any

from autoskillit.core.types import SubprocessRunner, SubprocessResult, TerminationReason
from autoskillit.execution.process import DefaultSubprocessRunner
from api_simulator.claude import ScenarioRecorder

class RecordingSubprocessRunner:
    """Wraps DefaultSubprocessRunner, records each session via ScenarioRecorder."""

    def __init__(self, recorder: ScenarioRecorder, inner: SubprocessRunner | None = None):
        self._recorder = recorder
        self._inner = inner or DefaultSubprocessRunner()

    async def __call__(
        self,
        cmd: list[str],
        *,
        cwd: Path,
        timeout: float,
        env: dict[str, str] | None = None,
        stale_threshold: float = 1200,
        completion_marker: str = "",
        session_log_dir: Path | None = None,
        pty_mode: bool = False,
        input_data: str | None = None,
        completion_drain_timeout: float = 5.0,
        linux_tracing_config: Any | None = None,
    ) -> SubprocessResult:
        # Read step_name from env (injected by run_headless_core); empty means "do not record"
        step_name = (env or {}).get("SCENARIO_STEP_NAME", "")

        if pty_mode and step_name:
            # This is a run_skill call — record via ScenarioRecorder
            step_result = self._recorder.record_step(
                step_name=step_name,
                tool="run_skill",
                cmd=cmd[0],
                args=cmd[1:],
            )
            return SubprocessResult(
                returncode=step_result.exit_code,
                stdout=step_result.stdout,
                stderr="",
                termination=TerminationReason.NATURAL_EXIT,
                pid=0,
            )
        else:
            # Non-session call — delegate to inner runner, record summary
            result = await self._inner(
                cmd, cwd=cwd, timeout=timeout, env=env,
                stale_threshold=stale_threshold,
                completion_marker=completion_marker,
                session_log_dir=session_log_dir,
                pty_mode=pty_mode, input_data=input_data,
                completion_drain_timeout=completion_drain_timeout,
                linux_tracing_config=linux_tracing_config,
            )
            if step_name:
                self._recorder.record_non_session_step(
                    step_name=step_name,
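                    # Recorded as run_cmd for every non-session call; test_check
                    # and classify_fix calls are not distinguished in this sketch.
                    tool="run_cmd",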
                    result_summary={
                        "exit_code": result.returncode,
                        "stdout_head": result.stdout[:500],
                    },
                )
            return result
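
The branching in __call__ has three outcomes, one of which (no step name, so nothing is recorded) is only implicit. A minimal standalone sketch of that decision follows; Route and route_call are hypothetical names introduced for illustration and do not appear in AutoSkillit.

```python
from __future__ import annotations

from enum import Enum


class Route(Enum):
    RECORD_SESSION = "record_session"            # pty_mode and a step name
    DELEGATE_AND_RECORD = "delegate_and_record"  # step name, no PTY
    DELEGATE_ONLY = "delegate_only"              # no step name: plain passthrough


def route_call(pty_mode: bool, env: dict[str, str] | None) -> Route:
    """Mirror the if/else structure of RecordingSubprocessRunner.__call__."""
    step_name = (env or {}).get("SCENARIO_STEP_NAME", "")
    if pty_mode and step_name:
        return Route.RECORD_SESSION
    if step_name:
        return Route.DELEGATE_AND_RECORD
    return Route.DELEGATE_ONLY
```

Note that a PTY call without SCENARIO_STEP_NAME falls through to the inner runner unrecorded, which is why the env wiring in the next section matters.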

2. Wire SCENARIO_STEP_NAME into run_headless_core

Location: src/autoskillit/execution/headless.py

run_headless_core already receives step_name as a parameter but currently uses it only for logging/telemetry. It needs to inject SCENARIO_STEP_NAME=step_name into the subprocess environment dict passed to ctx.runner().

This is a small change: when building the env dict for the runner call, add:

run_env = {**(env or {})}
if step_name:
    run_env["SCENARIO_STEP_NAME"] = step_name
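
The same snippet, wrapped as a function so the copy-then-inject behaviour can be tested in isolation. build_run_env is a hypothetical name; the actual change may simply inline these lines in run_headless_core.

```python
from __future__ import annotations


def build_run_env(env: dict[str, str] | None, step_name: str) -> dict[str, str]:
    """Copy env and add SCENARIO_STEP_NAME when a step name is given."""
    run_env = {**(env or {})}  # shallow copy, so the caller's dict is not mutated
    if step_name:
        run_env["SCENARIO_STEP_NAME"] = step_name
    return run_env
```

Copying rather than mutating keeps the caller's env reusable across steps, and an empty step_name leaves the environment unchanged.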

This env var is:

3. Recording invocation via Taskfile

Location: Taskfile.yml

Add a test-smoke-record task:

test-smoke-record:
  desc: Record smoke test as a scenario fixture
  deps: [_tmpdir-setup]
  env:
    SMOKE_TEST: "1"
    RECORD_SCENARIO: "1"
    RECORD_SCENARIO_DIR: "tests/fixtures/scenarios/smoke-happy"
  cmds:
    - ...  # same as test-smoke but with RECORD_SCENARIO env vars

4. Activation via environment variable

In run_headless_core or the make_context factory, check RECORD_SCENARIO=1:

  • If set, create a ScenarioRecorder and wrap DefaultSubprocessRunner in RecordingSubprocessRunner
  • RECORD_SCENARIO_DIR specifies output directory
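
One way the activation check might look, as a sketch under stated assumptions: recording_output_dir is a hypothetical helper, and raising when RECORD_SCENARIO_DIR is missing is an assumption, since the issue does not say what should happen in that case. Constructing the ScenarioRecorder and wrapping the runner is left to the caller.

```python
from __future__ import annotations

import os
from collections.abc import Mapping
from pathlib import Path


def recording_output_dir(environ: Mapping[str, str] = os.environ) -> Path | None:
    """Return the scenario output directory if recording is enabled, else None."""
    if environ.get("RECORD_SCENARIO") != "1":
        return None  # recording disabled: existing behaviour is unchanged
    raw = environ.get("RECORD_SCENARIO_DIR", "")
    if not raw:
        # Assumption: fail loudly rather than pick a default directory.
        raise ValueError("RECORD_SCENARIO=1 requires RECORD_SCENARIO_DIR")
    return Path(raw)
```

Taking the environment as a parameter keeps the check testable without touching the real process environment.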

Dependencies

Acceptance Criteria

  • RecordingSubprocessRunner implements the SubprocessRunner protocol
  • Each run_skill call produces a numbered cassette subdirectory loadable by Cassette.load()
  • Non-session steps recorded in manifest with result_summary
  • SCENARIO_STEP_NAME injected into subprocess env from run_headless_core
  • task test-smoke-record records a complete scenario to tests/fixtures/scenarios/smoke-happy/
  • scenario.json manifest written on finalize with correct step sequence
  • Existing task test-smoke unaffected (no env vars → no recording)

Labels

recipe:implementation (Route: proceed directly to implementation)
staged (Implementation staged and waiting for promotion to main)
