diff --git a/.github/llm-based-quality-gate/checklist.md b/.github/llm-based-quality-gate/checklist.md index e8794da..0dd8994 100644 --- a/.github/llm-based-quality-gate/checklist.md +++ b/.github/llm-based-quality-gate/checklist.md @@ -25,6 +25,6 @@ Author guidance: - [ ] **Canonical-emission surface completeness**: If this PR adds or changes a tracked-edit surface in `packages/docx-core/src/primitives/**` or `packages/docx-mcp/src/tools/**`, are the paired artifacts updated together — `packages/docx-core/src/integration/canonical-emission-regression.test.ts`, `packages/docx-mcp/src/integration/canonical-emission-mcp.test.ts`, and the documented emitter surface (Table A) — or is the rollout only partially wired? The primitive change looks done before the MCP path, regression matrix, and documented contract are wired through; partial rollouts ship undocumented surface that drifts. References: #172 (`RevisionContext` threaded through every Table A MCP tool), #175 (24-test regression suite + verified write-time emitter rows), #215 (re-enabled `rPrChange` regression + updated support surface for `replaceParagraphTextRange`). - [ ] **Lean predicate drift against engine semantics (asymmetric)**: If this PR changes field-wrapper semantics, the proof boundary, or atomizer behavior — `packages/docx-core/src/baselines/atomizer/**`, `verification/lean/LeanSpike/Spec.lean`, `verification/lean/Tier2/**`, or `packages/docx-core/src/integration/lean-spec-bridge.test.ts` — and if the TS engine semantics shifted, did the PR also update the Lean residual predicate and bridge tests, or is the proof now pinned to a stale stronger/weaker assumption? Asymmetric: a TS change without a corresponding Lean update is WARN; a Lean-only change without a TS update should not fire. The Lean side can still compile while the abstraction boundary is subtly wrong for the next engine refactor. References: #208 (closed `inv_field_001` using stronger `recursivelyWellformed`), #220 (weakened the axiom to document-level `preservationFriendly` to avoid breakage when field fragmentation lands). - [ ] **Unit-test quality (avoid tautological / change-detector tests)**: If this PR adds or modifies any `**/*.test.ts` (or other test files), are the test assertions independent of the system under test — expected values constructed from first principles rather than re-derived from the function under test, mocks limited to external boundaries (filesystem, network, clocks) rather than mocking the SUT itself, assertions making concrete semantic claims rather than just snapshotting current behavior or asserting non-null, and any test added alongside a bug fix actually exercising the bug? Tests that re-implement the production code as the "expected" value, or mock out the system under test, pass green while providing no regression protection. -- [ ] **Re-derived facts vs canonical sources**: If this PR adds logic that re-computes a fact the codebase already derives elsewhere — another fldChar begin/separate/end visible-content walk (canonical: getParagraphRuns in packages/docx-core/src/primitives/text.ts), a second footnote display-number map (buildFootnoteDisplayMap vs getFootnotes().displayNumber), or any sibling re-implementation of an existing helper instead of importing or exporting it — does the PR consume the canonical source, or explicitly justify the new derivation and add a test pinning the copies in agreement? Independent derivations of the same fact drift apart; #382's doubled footnote markers were two passes disagreeing about one fact, and the fix initially added a fifth copy of the field-state walk. References: #382, #386, #393. +- [ ] **Re-derived facts vs canonical sources**: If this PR adds logic that re-computes a fact the codebase already derives elsewhere — another fldChar begin/separate/end visible-content walk (canonical: getParagraphRuns in packages/docx-core/src/primitives/text.ts), a second pass computing which footnotes a paragraph visibly references and their display numbers (canonical: DocumentViewNode.footnote_refs, the single derivation in getFootnoteMarkersForParagraph in packages/docx-core/src/primitives/document_view.ts — consumers like read_file's clean_text suffix read this field rather than re-walking the DOM), or any sibling re-implementation of an existing helper instead of importing or exporting it — does the PR consume the canonical source, or explicitly justify the new derivation and add a test pinning the copies in agreement? Independent derivations of the same fact drift apart; #382's doubled footnote markers were two passes disagreeing about one fact, and the fix initially added a fifth copy of the field-state walk before #393 consolidated the footnote-reference derivation behind DocumentViewNode.footnote_refs. References: #382, #386, #393. - [ ] **`.openspec` tag ↔ test-assertion drift**: If this PR adds, moves, or changes any `.openspec('[ID] …')` tag on a `test(...)`/`it(...)`, does the tagged test body actually exercise that scenario's GIVEN/WHEN/THEN — establishing the scenario's precondition and asserting its THEN observable — or is the tag attached without a covering assertion (tag-stuffing)? Judge each added/changed `[ID]` independently against the body of the test it sits on. Pay special attention to (a) tags moved onto generic clean/round-trip/property tests that never set up the scenario's specific precondition, (b) a single test absorbing several distinct scenario IDs it does not separately assert, and (c) "empirical proxy" / repo-convention justifications standing in for a real covering assertion. Deterministic coverage checks confirm the tag exists and (with the THEN-keyword guard) that a scenario token is mentioned; they cannot judge whether the assertion actually covers the scenario — this asymmetric, oversells-coverage drift is exactly what the gate is for. References: #513 (first pass tag-stuffed multiple docx-comparison scenarios onto tests that didn't assert them; only peer review caught it), #469. - [ ] **Library stays general (no downstream-domain leakage)**: If this PR adds or renames public API, types, recipes, functions, or identifiers in `packages/*/src/**`, does it avoid baking in names or concepts specific to a single downstream consumer's domain — agreements especially (`signature`, `signatory`, `party`, `coverTerms`, print-name/title/date signing semantics, or other agreement field names)? safe-docx is a general OOXML library: its surface should be Word/OOXML vocabulary (table, row, cell, border, run) plus a small allowlist of genuinely LLM-driven affordances that are not in Word but exist because LLMs are the primary consumers (e.g. an outline/TOC view sized for a context window). Domain-shaped composition (a "signature block" is just a `w:tbl` built from `TableSpec`/`BorderSpec`/`RunProps`) belongs in the consumer, not here. WARN if a new symbol is named for, or shaped around, one product rather than the general OOXML/LLM surface. See CONTRIBUTING "Library scope & domain boundaries". Reference: `remove-agreement-domain-recipes` removed `coverTermsTable`/`signatureBlock` for this reason.