diff --git a/packages/software-factory/.agents/skills/boxel-development/references/dev-qunit-testing.md b/packages/software-factory/.agents/skills/boxel-development/references/dev-qunit-testing.md
index 72671ed3fd..8ca7abca0d 100644
--- a/packages/software-factory/.agents/skills/boxel-development/references/dev-qunit-testing.md
+++ b/packages/software-factory/.agents/skills/boxel-development/references/dev-qunit-testing.md
@@ -134,7 +134,7 @@ Add `data-test-*` attributes to card templates for stable test selectors:

 When tests fail, the orchestrator feeds test failure details back to the agent. For more detail:

-- **TestRun cards** live in the target realm's `Validations/` folder with a `test_` prefix (e.g., `Validations/test_issue-slug-1.json`). To find all test runs, search by the TestRun card type in the target realm. Each TestRun has a `sequenceNumber` that increases with each iteration. Use `read_file` on a specific TestRun for full details.
+- **TestRun cards** live in the target realm's `Validations/` folder with a `test_` prefix (e.g., `Validations/test_issue-slug-1.json`). To find all test runs, run `Glob` over `Validations/test_*.json` or shell out via `Bash` to `boxel search --realm ` filtered on the TestRun card type. Each TestRun has a `sequenceNumber` that increases with each iteration. Use native `Read` on a specific TestRun for full details — paths are workspace-relative.

 ## Rules

diff --git a/packages/software-factory/.agents/skills/boxel-development/references/dev-realm-search.md b/packages/software-factory/.agents/skills/boxel-development/references/dev-realm-search.md
index e4b1ec7563..349278c2dc 100644
--- a/packages/software-factory/.agents/skills/boxel-development/references/dev-realm-search.md
+++ b/packages/software-factory/.agents/skills/boxel-development/references/dev-realm-search.md
@@ -1,6 +1,12 @@
 # Realm Search Query Reference

-How to use the `search_realm` tool to query cards in a realm. The query object follows the Boxel realm search API format.
+How to construct queries for the Boxel realm search index. Run them from `Bash` against your **target realm** with:
+
+```
+boxel search --realm --query ''
+```
+
+The query JSON below is what goes into `--query`. Do not query other realms (base, software-factory, experiments, catalog) — the skills you've loaded are authoritative for patterns; cross-realm exploration burns tokens without helping.

 ## Basic Structure

@@ -216,27 +222,18 @@ Descending order:

 ## Discovering Available Fields

-You can only filter/sort on fields that exist on the card type. To find which fields a card type has:
+You can only filter/sort on fields that exist on the card type. To find which fields a card type has, call the `get_card_schema` factory tool:

-1. Use `run_command` to fetch the JSON schema for a card type:
-
-```json
-{
-  "command": "@cardstack/boxel-host/commands/get-card-type-schema/default",
-  "commandInput": {
-    "codeRef": {
-      "module": "http://localhost:4201/software-factory/darkfactory",
-      "name": "Issue"
-    }
-  }
-}
+```
+get_card_schema({
+  module: 'http://localhost:4201/software-factory/darkfactory',
+  name: 'Issue'
+})
 ```

-2. The result contains `attributes.properties` listing all searchable fields (e.g., `status`, `summary`, `priority`).
-
-3. Use those field names in your `eq`, `contains`, `range`, or `sort` with the matching `on` type.
+The result contains `schema.attributes.properties` listing all searchable fields (e.g., `status`, `summary`, `priority`) plus their types and any enum values. Use those field names in your `eq`, `contains`, `range`, or `sort` with the matching `on` type.

-The card tools (`update_project`, `update_issue`, `create_knowledge`, `create_catalog_spec`) also have dynamic JSON schemas in their parameters that list available fields.
+`get_card_schema` is also how you learn the shape for writing tracker (Project / Issue / KnowledgeArticle) and Spec card JSON files — call it before writing the JSON so what you write matches the live `CardDef`.

 ### Inheritance

diff --git a/packages/software-factory/.agents/skills/boxel-development/references/dev-spec-usage.md b/packages/software-factory/.agents/skills/boxel-development/references/dev-spec-usage.md
index 4f19b8ac33..08ccf42bda 100644
--- a/packages/software-factory/.agents/skills/boxel-development/references/dev-spec-usage.md
+++ b/packages/software-factory/.agents/skills/boxel-development/references/dev-spec-usage.md
@@ -1,24 +1,52 @@
 # Catalog Spec Card Instances

-For each top-level card definition, create a Catalog Spec card instance in the target realm's `Spec/` folder using the `create_catalog_spec` tool. This makes the card discoverable in the Boxel catalog.
+For each top-level card definition, write a Catalog Spec card instance in the target realm's `Spec/` folder. This makes the card discoverable in the Boxel catalog.

-The `create_catalog_spec` tool has the authoritative JSON schema for Spec card fields — use its parameter definitions to know which attributes and relationships are available. The tool auto-constructs the document with the correct `adoptsFrom` (`https://cardstack.com/base/spec#Spec`).
+Specs adopt from `https://cardstack.com/base/spec#Spec` — that module lives in the base realm, not your target realm. Fetch the authoritative schema by calling the `get_card_schema` factory tool:

-## Usage
+```
+get_card_schema({ module: 'https://cardstack.com/base/spec', name: 'Spec' })
+```

-Use the `create_catalog_spec` tool to create a Spec card. The tool's parameters define the available fields dynamically from the card definition — consult the tool schema for the exact field names and types.
+The result gives you the exact `attributes` and `relationships` shape. Write the JSON file with native `Write` (paths are workspace-relative, e.g. `Spec/sticky-note.json`); `boxel sync` pushes it to the realm between iterations.
+
+## Required Shape
+
+```json
+{
+  "data": {
+    "type": "card",
+    "attributes": {
+      "specType": "card",
+      "ref": { "module": "../sticky-note", "name": "StickyNote" },
+      "readMe": "...",
+      "cardInfo": { "name": "Sticky Note", "summary": "..." }
+    },
+    "relationships": {
+      "linkedExamples.0": { "links": { "self": "../StickyNote/welcome-note" } }
+    },
+    "meta": {
+      "adoptsFrom": {
+        "module": "https://cardstack.com/base/spec",
+        "name": "Spec"
+      }
+    }
+  }
+}
+```

 Key concepts:

 - `ref` — a CodeRef pointing to the card definition (module path + exported class name). The module path is relative from the Spec card to the `.gts` file (e.g., `../sticky-note` from `Spec/sticky-note.json`).
 - `specType` — `"card"` for CardDef, `"field"` for FieldDef, `"component"` for standalone components.
-- `linkedExamples` — a relationship pointing to sample card instances. Create at least one sample instance and link it here.
+- `linkedExamples` — a `linksToMany` relationship pointing to sample card instances. Use dotted keys (`linkedExamples.0`, `linkedExamples.1`, …) — the array form is rejected by the indexer. Create at least one sample instance and link it here.
+- **Do NOT call `run_instantiate` on the Spec file itself.** Spec's module lives in the base realm; the prerender enforces same-origin module loads and the call always fails. To validate Specs, call `run_instantiate` WITHOUT a `path`; it discovers Specs in the target realm and exercises their `linkedExamples` against the card classes you wrote.

 ## Sample Card Instances

 Create at least one sample instance with realistic data for each top-level card. Sample instances serve as both catalog examples and test fixtures.

-Place sample instances in a folder named after the card type (e.g., `StickyNote/welcome-note.json`). Use `write_file` to create them. The `linkedExamples` relationship in the Spec card points to these using a relative path (e.g., `../StickyNote/welcome-note`).
+Place sample instances in a folder named after the card type (e.g., `StickyNote/welcome-note.json`) and write them with native `Write`. The `linkedExamples` relationship in the Spec card points to these using a relative path without the `.json` suffix (e.g., `../StickyNote/welcome-note`).

 ---

diff --git a/packages/software-factory/src/factory-agent/claude-code.ts b/packages/software-factory/src/factory-agent/claude-code.ts
index 1dd6067915..83bfd0ac87 100644
--- a/packages/software-factory/src/factory-agent/claude-code.ts
+++ b/packages/software-factory/src/factory-agent/claude-code.ts
@@ -66,11 +66,10 @@ const MAX_TOOL_USE_TURNS = 50;

 /**
  * Built-in Claude Code tools the factory exposes to the model on the
- * Claude backend. These replace the custom `read_file` / `write_file`
- * factory tools — they operate on the SDK query's `cwd` (the factory
- * workspace), so the model uses native semantics for fs work and we
- * keep MCP focused on operations that genuinely need realm runtime
- * access (search_realm, validators, structured updates, signals).
+ * Claude backend. They operate on the SDK query's `cwd` (the factory
+ * workspace), so the model handles workspace files natively while MCP
+ * stays focused on what needs realm runtime access (`get_card_schema`,
+ * validators, control signals).
  */
 const NATIVE_FS_TOOLS = ['Read', 'Write', 'Edit', 'Bash', 'Glob', 'Grep'];

@@ -298,14 +297,12 @@ export class ClaudeCodeFactoryAgent implements LoopAgent {
     // Two tool surfaces are visible to the model on the Claude backend:
     //   1. Native Claude Code tools (Read / Write / Edit / Bash / Glob /
     //      Grep) — anchored to the factory workspace via the SDK query's
-    //      `cwd`. These replace the factory's old `read_file` /
-    //      `write_file` shims; the model works on the local mirror of the
-    //      target realm directly.
+    //      `cwd`. The model works on the local mirror of the target realm
+    //      directly; `boxel sync` pushes between iterations.
     //   2. Factory tools exposed via an in-process MCP server, prefixed
-    //      with `mcp____`. Used for everything that needs realm
-    //      runtime access (search, validators, host commands, structured
-    //      updates) and for control signals (signal_done /
-    //      request_clarification).
+    //      with `mcp____`. Used for realm-runtime operations
+    //      (`get_card_schema`, the five validators) and for control
+    //      signals (`signal_done`, `request_clarification`).
     //
     // The shared prompt template / skills reference factory operations by
     // their plain names (e.g. `signal_done`). Append a short rename map
diff --git a/packages/software-factory/src/factory-agent/index.ts b/packages/software-factory/src/factory-agent/index.ts
index df5ab728ba..41e5f432b0 100644
--- a/packages/software-factory/src/factory-agent/index.ts
+++ b/packages/software-factory/src/factory-agent/index.ts
@@ -16,4 +16,4 @@ export * from './types';
 export { OpencodeFactoryAgent } from './opencode';
 export type { OpencodeAgentConfig } from './opencode';
 export { ClaudeCodeFactoryAgent } from './claude-code';
-export { MockFactoryAgent, MockLoopAgent } from './mocks';
+export { MockLoopAgent } from './mocks';
diff --git a/packages/software-factory/src/factory-agent/mocks.ts b/packages/software-factory/src/factory-agent/mocks.ts
index 68a8ec3271..4c0395db52 100644
--- a/packages/software-factory/src/factory-agent/mocks.ts
+++ b/packages/software-factory/src/factory-agent/mocks.ts
@@ -1,51 +1,14 @@
 /**
  * Mock agent implementations for testing.
  *
- * These are deterministic agents that return pre-scripted responses,
- * used by unit tests and smoke tests to verify orchestration logic
- * without calling a real LLM.
+ * Deterministic agents that return pre-scripted responses, used by unit
+ * tests and smoke tests to verify orchestration logic without calling
+ * a real LLM.
  */

-import type { AgentAction, AgentContext, FactoryAgent } from './types';
-import type { LoopAgent, AgentRunResult } from './types';
+import type { AgentContext, LoopAgent, AgentRunResult } from './types';
 import type { FactoryTool } from '../factory-tool-builder';

-// ---------------------------------------------------------------------------
-// MockFactoryAgent — deterministic FactoryAgent for declarative model tests
-// ---------------------------------------------------------------------------
-
-export class MockFactoryAgent implements FactoryAgent {
-  private responses: AgentAction[][];
-  private callIndex = 0;
-
-  /** All AgentContext inputs received, in order. */
-  readonly receivedContexts: AgentContext[] = [];
-
-  constructor(responses: AgentAction[][]) {
-    this.responses = responses;
-  }
-
-  async plan(context: AgentContext): Promise {
-    this.receivedContexts.push(context);
-
-    if (this.callIndex >= this.responses.length) {
-      throw new Error(
-        `MockFactoryAgent exhausted: called ${this.callIndex + 1} times ` +
-          `but only ${this.responses.length} response(s) were configured`,
-      );
-    }
-
-    let response = this.responses[this.callIndex];
-    this.callIndex++;
-    return response;
-  }
-
-  /** Number of times plan() has been called. */
-  get callCount(): number {
-    return this.callIndex;
-  }
-}
-
 // ---------------------------------------------------------------------------
 // MockLoopAgent — deterministic LoopAgent for tool-use model tests
 // ---------------------------------------------------------------------------
diff --git a/packages/software-factory/src/factory-agent/types.ts b/packages/software-factory/src/factory-agent/types.ts
index b9f011d580..96dcf1d764 100644
--- a/packages/software-factory/src/factory-agent/types.ts
+++ b/packages/software-factory/src/factory-agent/types.ts
@@ -1,9 +1,10 @@
 /**
  * Shared types, interfaces, and constants for the factory agent system.
  *
- * This module contains all the data types used across the declarative agent
- * (factory-agent.ts), the tool-use agent (factory-agent-tool-use.ts), and
- * their consumers (loop, context builder, prompt loader, etc.).
+ * The runtime agents (`ClaudeCodeFactoryAgent` in `claude-code.ts`,
+ * `OpencodeFactoryAgent` in `opencode.ts`) implement the `LoopAgent`
+ * interface declared here; orchestration consumers (issue loop, context
+ * builder, prompt loader) share the data types declared below.
  */

 // ---------------------------------------------------------------------------
@@ -17,7 +18,7 @@
  * Pinned to `claude-opus-4-7` rather than the unversioned `claude-opus-4`
  * alias. The alias route exhibited a deterministic mid-stream truncation
  * on large tool-call arguments (`finish_reason: null`, `completion=1`)
- * that broke every `write_file` for full `.gts` card definitions. Opus
+ * that broke every native `Write` for full `.gts` card definitions. Opus
  * 4.7 on the pinned route returned clean `finish_reason: tool_calls`
  * responses with completions up to ~4.7K tokens in a single turn, and
  * ran an end-to-end factory loop to `outcome=all_issues_done` with no
@@ -54,14 +55,6 @@ export const VALID_ACTION_TYPES = [

 export const VALID_REALMS = ['target', 'test'] as const;

-// Action types that require path + content
-export const FILE_ACTION_TYPES: ReadonlySet = new Set([
-  'create_file',
-  'update_file',
-  'create_test',
-  'update_test',
-]);
-
 // ---------------------------------------------------------------------------
 // Types
 // ---------------------------------------------------------------------------
@@ -93,8 +86,9 @@ export interface ClaudeCodeAgentConfig {
    * query's `cwd` so the model's native Read / Write / Edit / Bash / Glob /
    * Grep tools operate against the factory workspace by default — paths like
    * `sticky-note.gts` resolve inside the workspace, with no surprise hits
-   * against the user's filesystem. Realm I/O still goes through factory
-   * MCP tools (search_realm, run_command, validators, …).
+   * against the user's filesystem. Realm-runtime operations go through the
+   * factory MCP tools (get_card_schema, run_lint / run_parse / run_evaluate
+   * / run_instantiate / run_tests, signal_done, request_clarification).
    */
   workspaceDir?: string;
 }
@@ -259,14 +253,6 @@ export interface AgentAction {
   toolArgs?: Record;
 }

-// ---------------------------------------------------------------------------
-// FactoryAgent interface (declarative model)
-// ---------------------------------------------------------------------------
-
-export interface FactoryAgent {
-  plan(context: AgentContext): Promise;
-}
-
 // ---------------------------------------------------------------------------
 // Message types (for LLM communication)
 // ---------------------------------------------------------------------------
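The Spec document shape that the `dev-spec-usage.md` hunk describes (dotted `linkedExamples.N` relationship keys, links without the `.json` suffix, `adoptsFrom` pointing at the base realm) can be sketched as a small TypeScript helper. This is only an illustration under stated assumptions: `buildSpecDocument`, `SpecInput`, and `exampleLinks` are hypothetical names that do not appear in this patch, and the field set is limited to what the "Required Shape" example shows, so the output of `get_card_schema` remains the authority on available fields.

```typescript
// Hypothetical helper, not part of this patch. It mirrors the "Required
// Shape" JSON from dev-spec-usage.md and nothing more.

type SpecType = 'card' | 'field' | 'component';

interface CodeRef {
  module: string; // path relative from the Spec card to the .gts module
  name: string; // exported class name
}

interface SpecInput {
  specType: SpecType;
  ref: CodeRef;
  readMe: string;
  cardInfo: { name: string; summary: string };
  // Relative links to sample instances, e.g. '../StickyNote/welcome-note'.
  exampleLinks: string[];
}

interface Relationship {
  links: { self: string };
}

function buildSpecDocument(input: SpecInput) {
  // linksToMany entries use dotted keys (linkedExamples.0, linkedExamples.1, ...)
  // rather than an array, as the doc requires.
  const relationships: Record<string, Relationship> = {};
  input.exampleLinks.forEach((link, i) => {
    // Links are written without the .json suffix.
    const self = link.replace(/\.json$/, '');
    relationships[`linkedExamples.${i}`] = { links: { self } };
  });

  return {
    data: {
      type: 'card',
      attributes: {
        specType: input.specType,
        ref: input.ref,
        readMe: input.readMe,
        cardInfo: input.cardInfo,
      },
      relationships,
      meta: {
        adoptsFrom: {
          module: 'https://cardstack.com/base/spec',
          name: 'Spec',
        },
      },
    },
  };
}

// Usage: build the document for Spec/sticky-note.json and print it.
const stickyNoteSpec = buildSpecDocument({
  specType: 'card',
  ref: { module: '../sticky-note', name: 'StickyNote' },
  readMe: '...',
  cardInfo: { name: 'Sticky Note', summary: '...' },
  exampleLinks: ['../StickyNote/welcome-note.json'],
});
console.log(JSON.stringify(stickyNoteSpec, null, 2));
```

Generating the relationship keys from an index keeps them contiguous, which is one way to avoid hand-numbering mistakes when a Spec links several examples.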