feat(docx-mcp): add opt-in fingerprint ordinal disambiguation to read_file #525
stevenobiajulu wants to merge 1 commit into
Downstream citation/ingest pipelines that reference a specific paragraph by
content_fingerprint cannot disambiguate duplicate normalized text without
re-implementing document-order grouping. This adds an additive, opt-in
include_fingerprint_ordinal flag on read_file(format="json") that, together
with include_fingerprint, attaches content_fingerprint_ordinal (1-based
document-order position among paragraphs sharing a fingerprint),
content_fingerprint_count_in_document (document-wide total, stable under
pagination), and the convenience composite portable_paragraph_ref
("<fingerprint>#<ordinal>").
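A minimal sketch of how the three fields described above could be derived, assuming paragraphs arrive as an array already in document order. The `ParagraphNode` shape, the `attachFingerprintOrdinals` name, and the `p1`-style ids are hypothetical and are not taken from `read_file.ts`.

```typescript
// Hypothetical node shape: only the fields this sketch needs.
interface ParagraphNode {
  id: string;
  content_fingerprint: string;
  content_fingerprint_ordinal?: number;
  content_fingerprint_count_in_document?: number;
  portable_paragraph_ref?: string;
}

// Annotate paragraphs (assumed to be in document order) with ordinal, count, and ref.
function attachFingerprintOrdinals(paragraphs: ParagraphNode[]): ParagraphNode[] {
  // First pass: document-wide total per fingerprint.
  const totals = new Map<string, number>();
  for (const p of paragraphs) {
    totals.set(p.content_fingerprint, (totals.get(p.content_fingerprint) ?? 0) + 1);
  }

  // Second pass: 1-based running position among paragraphs sharing a fingerprint.
  const seen = new Map<string, number>();
  return paragraphs.map((p) => {
    const ordinal = (seen.get(p.content_fingerprint) ?? 0) + 1;
    seen.set(p.content_fingerprint, ordinal);
    return {
      ...p,
      content_fingerprint_ordinal: ordinal,
      content_fingerprint_count_in_document: totals.get(p.content_fingerprint) ?? ordinal,
      portable_paragraph_ref: `${p.content_fingerprint}#${ordinal}`,
    };
  });
}

// Two paragraphs share "fp_a"; the second occurrence gets ordinal 2 of 2.
const annotated = attachFingerprintOrdinals([
  { id: "p1", content_fingerprint: "fp_a" },
  { id: "p2", content_fingerprint: "fp_b" },
  { id: "p3", content_fingerprint: "fp_a" },
]);
```

Two passes keep the count independent of where a paragraph sits, which is what lets every occurrence report the same document-wide total.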
Ordinals and counts are computed over the full document so a paginated or
node_ids-filtered read still reports stable, document-wide values. The flag
no-ops without include_fingerprint, on TOON/simple output, and on Google
Docs/ODT sessions. `_bk_*` ids, edit-anchor semantics, and the fingerprint
algorithm are unchanged; the ordinal is a read-only disambiguator, never an
edit anchor.
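The gating and pagination behaviour described above might look roughly like the sketch below. It assumes hypothetical `offset`/`limit` pagination parameters and a simplified `Para` input type; the real `read_file` options and the Google Docs/ODT session check are not modelled here.

```typescript
// Hypothetical option shape; the real read_file parameters may differ.
interface ReadOptions {
  format: "json" | "toon" | "simple";
  include_fingerprint?: boolean;
  include_fingerprint_ordinal?: boolean;
  offset?: number; // assumed pagination parameter
  limit?: number;  // assumed pagination parameter
}

interface Para { id: string; fingerprint: string; }

interface ParaOut {
  id: string;
  content_fingerprint?: string;
  content_fingerprint_ordinal?: number;
  content_fingerprint_count_in_document?: number;
}

function readPage(doc: Para[], opts: ReadOptions): ParaOut[] {
  // The ordinal flag only takes effect on JSON output with fingerprints enabled.
  const withFp = opts.format === "json" && opts.include_fingerprint === true;
  const withOrdinal = withFp && opts.include_fingerprint_ordinal === true;

  // Walk the FULL document before slicing, so values do not depend on the page.
  const counts = new Map<string, number>();
  const ordinals: number[] = [];
  if (withOrdinal) {
    for (const p of doc) {
      const n = (counts.get(p.fingerprint) ?? 0) + 1;
      counts.set(p.fingerprint, n); // after the loop this holds the document-wide total
      ordinals.push(n);
    }
  }

  const start = opts.offset ?? 0;
  const end = opts.limit === undefined ? doc.length : start + opts.limit;
  return doc.slice(start, end).map((p, i) => {
    const out: ParaOut = { id: p.id };
    if (withFp) out.content_fingerprint = p.fingerprint;
    if (withOrdinal) {
      out.content_fingerprint_ordinal = ordinals[start + i];
      out.content_fingerprint_count_in_document = counts.get(p.fingerprint);
    }
    return out;
  });
}

const sampleDoc: Para[] = [
  { id: "p1", fingerprint: "fp_a" },
  { id: "p2", fingerprint: "fp_b" },
  { id: "p3", fingerprint: "fp_a" },
  { id: "p4", fingerprint: "fp_a" },
];

// A one-paragraph page starting at index 2 still reports ordinal 2 of 3.
const page = readPage(sampleDoc, {
  format: "json",
  include_fingerprint: true,
  include_fingerprint_ordinal: true,
  offset: 2,
  limit: 1,
});

// Without include_fingerprint, the ordinal flag has no effect.
const noop = readPage(sampleDoc, { format: "json", include_fingerprint_ordinal: true });
```

Computing before slicing costs one pass over the whole document per read, in exchange for values that stay the same regardless of which page or subset is requested.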
Ref: #205
Review verdict: APPROVE (with nits)
Reviewed the diff, spec delta, tests, and issue #205 scope. This is a clean, well-scoped, additive change that matches the (re-scoped, peer-reviewed) ask in #205 exactly. No blocking findings.
What I verified
Nits (non-blocking)
LLM-Based Quality Gate
Overall: ✅ PASS (8 pass · 0 warn · 9 skipped · 17 total)
Estimated cost (this run): $0.0215 (67,226 input + 509 output tokens, ≈4 chars/token) on
Codecov Report
❌ Patch coverage is
Summary
Adds an opt-in, additive duplicate-disambiguation surface to
`read_file(format="json")` for DOCX sessions, on top of the existing
`content_fingerprint`. Closes the remaining gap from #205: when the same
normalized paragraph text appears multiple times in one document, every
occurrence shares one fingerprint, so consumers can't reference "the second
occurrence of WHEREAS …" without re-implementing document-order grouping.
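On the consumer side, resolving "the second occurrence" could then reduce to a lookup on the composite ref. The sketch below assumes the `"<fingerprint>#<ordinal>"` format described in this PR; `parsePortableRef`, `findByRef`, the `ReadNode` shape, and the sample ids and fingerprints are hypothetical and not part of the package.

```typescript
// Hypothetical helper: split "<fingerprint>#<ordinal>" at the last "#".
function parsePortableRef(ref: string): { fingerprint: string; ordinal: number } | null {
  const idx = ref.lastIndexOf("#");
  if (idx <= 0) return null; // no separator, or empty fingerprint
  const ordinal = Number(ref.slice(idx + 1));
  if (!Number.isInteger(ordinal) || ordinal < 1) return null; // ordinals are 1-based
  return { fingerprint: ref.slice(0, idx), ordinal };
}

// Hypothetical subset of a paragraph node returned by a JSON read.
interface ReadNode {
  id: string;
  text: string;
  portable_paragraph_ref?: string;
}

// Look up the paragraph a stored citation points at.
function findByRef(nodes: ReadNode[], ref: string): ReadNode | undefined {
  return nodes.find((n) => n.portable_paragraph_ref === ref);
}

const nodes: ReadNode[] = [
  { id: "p1", text: "WHEREAS the parties agree", portable_paragraph_ref: "fp_whereas#1" },
  { id: "p2", text: "Definitions", portable_paragraph_ref: "fp_defs#1" },
  { id: "p3", text: "WHEREAS the parties agree", portable_paragraph_ref: "fp_whereas#2" },
];

const secondWhereas = findByRef(nodes, "fp_whereas#2");
const parsed = parsePortableRef("fp_whereas#2");
```

As the PR states, the ordinal is a read-only disambiguator, so a lookup like this is suited to citation, not to anchoring edits.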
When `include_fingerprint=true` and the new `include_fingerprint_ordinal=true` are set with
`format="json"`, each paragraph node gains:
- `content_fingerprint_ordinal`: 1-based document-order position among paragraphs sharing the same `content_fingerprint`
- `content_fingerprint_count_in_document`: total paragraphs sharing that fingerprint (document-wide, stable under pagination)
- `portable_paragraph_ref`: convenience composite `"<content_fingerprint>#<ordinal>"`
Ordinals/counts are computed over the full document in document order, so a
paginated or `node_ids`-filtered read still reports stable, document-wide values.
Out of scope / unchanged (per #205)
- `_bk_*` ids, edit-anchor semantics, and the `content_fingerprint` algorithm are untouched.
- The flag no-ops without `include_fingerprint`, on TOON/simple output, or on Google Docs / ODT sessions.
OpenSpec
New capability → added change proposal `add-read-file-fingerprint-ordinal`
(`openspec validate --strict` passes) with an `mcp-server` ADDED requirement and
10 scenarios, all covered by the new traceability test.
Files
- `openspec/changes/add-read-file-fingerprint-ordinal/**`: proposal, tasks, mcp-server spec delta
- `packages/docx-mcp/src/tools/read_file.ts`: compute + attach ordinal/count/`portable_paragraph_ref`
- `packages/docx-mcp/src/tool_catalog.ts`: `include_fingerprint_ordinal` schema field
- `packages/docx-mcp/src/tools/read_file_fingerprint_ordinal.test.ts`: new traceability test (`TEST_FEATURE=add-read-file-fingerprint-ordinal`)
- `docs/tool-reference.generated.md` and `SAFE_DOCX_OPENSPEC_TRACEABILITY.md`
Gates run (all green by exit code)
- `npm run build`
- `npm run lint:workspaces` (0 errors; pre-existing warnings only)
- `npm run check:spec-coverage -w @usejunior/docx-mcp -- --strict` → PASS add-read-file-fingerprint-ordinal: 10/10
- `read_file*` (88 passed) incl. new + existing fingerprint suites (19 passed)
- `npm run check:tool-docs` (after commit)
- `npm run check:conformance-citations`, `npm run check:conformance-doc`
- `npm run check:allure-labels`, `check:allure-filenames`, `check:allure-quality`
Reviewer notes
`Ref: #205` (not `Closes`) so issue closure stays under maintainer control.