feat(docx-mcp): add read-only get_document_outline tool #527
stevenobiajulu wants to merge 1 commit into
Agents working long documents (e.g. a 60-page MSA) currently have to read_file/grep through the whole body to locate a section, burning context window and risking edits to the wrong clause. get_document_outline projects the existing document-view heading detection into a compact structural map: one entry per heading carrying text, outline level, source, and the stable _bk_* paragraph_id, so an agent can read the cheap outline first and then scope a targeted read_file/replace_text to the right region. Style-based (Word HeadingN) headings only by default to keep the map low-noise; heuristic titles/run-in headers are opt-in via include_heuristic_headings. Output is JSON by default with an optional markdown rendering. DOCX sessions only (ODF has no equivalent heading projection yet). Read-only. Ref: #493
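The map-first workflow described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the field names `text`, `level`, `source`, and `paragraph_id` come from the commit message, while the `"style"`/`"heuristic"` values, the helper names `selectHeadings` and `findHeading`, and the sample `_bk_*` ids are hypothetical.

```typescript
// Hypothetical shapes: field names follow the commit message; the literal
// values of `source` and the sample ids below are assumed for illustration.
type HeadingSource = "style" | "heuristic";

interface OutlineEntry {
  text: string;
  level: number; // outline level, 1 = top-level heading
  source: HeadingSource;
  paragraph_id: string; // stable _bk_* id
}

// Keep style-based headings only unless heuristic ones are requested,
// mirroring the described opt-in behaviour of include_heuristic_headings.
function selectHeadings(
  entries: OutlineEntry[],
  includeHeuristicHeadings = false,
): OutlineEntry[] {
  return includeHeuristicHeadings
    ? entries
    : entries.filter((e) => e.source === "style");
}

// Return the first heading whose text contains the query (case-insensitive).
function findHeading(
  entries: OutlineEntry[],
  query: string,
): OutlineEntry | undefined {
  const needle = query.toLowerCase();
  return entries.find((e) => e.text.toLowerCase().includes(needle));
}

const sampleOutline: OutlineEntry[] = [
  { text: "MASTER SERVICES AGREEMENT", level: 1, source: "heuristic", paragraph_id: "_bk_0001" },
  { text: "4. Liability", level: 1, source: "style", paragraph_id: "_bk_0412" },
  { text: "4.2 Indemnity", level: 2, source: "style", paragraph_id: "_bk_0431" },
];

const styleOnly = selectHeadings(sampleOutline);
const target = findHeading(styleOnly, "indemnity");
console.log(target?.paragraph_id); // prints "_bk_0431"
```

An agent would then pass the returned `paragraph_id` to a scoped read or edit rather than scanning the whole body.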
Verdict: approve-with-nits — clean, well-scoped projection layer over existing No blocking findings. Nits (non-blocking):
LLM-Based Quality Gate
Overall: ✅ PASS (8 pass · 0 warn · 9 skipped · 17 total)
Estimated cost (this run): $0.0146 (43,864 input + 515 output tokens, ≈4 chars/token) on
Codecov Report
❌ Patch coverage is
What
Adds a read-only get_document_outline MCP tool that returns a compact structural map of a document's headings instead of full text: a projection over the existing docx-core document-view heading detection.

Each outline entry carries the heading text, outline level, heading source, and the stable _bk_* paragraph_id, so an agent can read the cheap outline first (hundreds of tokens), spot that §4.2 holds the indemnity clause, then scope a targeted read_file/replace_text to that region instead of scanning a 60-page body and burning context window.

- Style-based (Word HeadingN) headings only by default to keep the map low-noise; heuristic titles/run-in/centered-caps headers are opt-in via include_heuristic_headings.
- format: "json" (default; structured outline array plus total_headings/total_paragraphs) or format: "markdown" (indented ATX outline under content).

Why
docx-core already has the DOM traversal, heading-style detection, and stable _bk_* ids; this is a thin projection layer plus a tool_catalog.ts entry. Map first, then targeted read.

OpenSpec
New MCP capability, so this follows the OpenSpec workflow: change proposal add-document-outline-tool (proposal + tasks + mcp-server spec delta), validated with openspec validate --strict.

Files
- packages/docx-mcp/src/tools/get_document_outline.ts (new projection)
- packages/docx-mcp/src/tool_catalog.ts, src/server.ts (catalog entry + dispatch)
- packages/docx-mcp/src/tools/get_document_outline.test.ts (traceability tests, TEST_FEATURE='add-document-outline-tool')
- packages/safe-docx-mcpb/manifest.json (canonical-tool contract)
- packages/docx-mcp/docs/tool-reference.generated.md (regenerated)
- openspec/changes/add-document-outline-tool/**

Gates run (all green by exit code)
- npm run build: pass
- npm run lint:workspaces: pass (0 errors; pre-existing warnings only, none in touched files)
- node vitest run (full docx-mcp suite): 849/849 pass, incl. 3 new outline tests
- npm run check:spec-coverage: pass (add-document-outline-tool: 3 scenarios covered by 3 story mappings)
- npm run check:tool-docs: pass (regenerated doc committed)
- npm run check:mcpb-manifest: pass
- npm run check:allure-labels / check:allure-filenames / check:allure-quality: pass
- npm run check:conformance-citations / check:conformance-doc: pass
- openspec validate add-document-outline-tool --strict: valid

Reviewer notes
file_path reuses the shared optional field whose description says "DOCX or ODT"; the tool itself is DOCX-only (made explicit in the tool description and spec). Left as-is to avoid a one-off field variant; flagging in case you want it tightened.

Ref: #493
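For reviewers who want a concrete picture of the two output formats, here is a rough sketch of how the JSON and markdown responses might be assembled. The keys `outline`, `total_headings`, `total_paragraphs`, and `content` come from the description; everything else is an assumption, including the helper names, the sample `_bk_*` ids, and the exact markdown layout (the PR says only "indented ATX outline", so the two-space indent and bracketed id are guesses).

```typescript
// Hypothetical entry shape; field names follow the PR description.
interface OutlineEntry {
  text: string;
  level: number;
  source: string;
  paragraph_id: string;
}

type OutlineFormat = "json" | "markdown";

interface JsonOutlineResponse {
  outline: OutlineEntry[];
  total_headings: number;
  total_paragraphs: number;
}

interface MarkdownOutlineResponse {
  content: string;
}

// Assumed layout: one ATX heading per entry, indented two spaces per
// level below the first, with the paragraph id appended in brackets.
function renderMarkdownOutline(entries: OutlineEntry[]): string {
  return entries
    .map((e) => {
      const depth = Math.max(1, e.level);
      const indent = "  ".repeat(depth - 1);
      return `${indent}${"#".repeat(depth)} ${e.text} [${e.paragraph_id}]`;
    })
    .join("\n");
}

// JSON is the default; markdown is returned only when requested.
function buildOutlineResponse(
  entries: OutlineEntry[],
  totalParagraphs: number,
  format: OutlineFormat = "json",
): JsonOutlineResponse | MarkdownOutlineResponse {
  if (format === "markdown") {
    return { content: renderMarkdownOutline(entries) };
  }
  return {
    outline: entries,
    total_headings: entries.length,
    total_paragraphs: totalParagraphs,
  };
}

// Made-up sample data for illustration only.
const demoEntries: OutlineEntry[] = [
  { text: "4. Liability", level: 1, source: "style", paragraph_id: "_bk_0412" },
  { text: "4.2 Indemnity", level: 2, source: "style", paragraph_id: "_bk_0431" },
];

console.log(JSON.stringify(buildOutlineResponse(demoEntries, 120), null, 2));
console.log(renderMarkdownOutline(demoEntries));
```

Keeping the paragraph id on every markdown line preserves the anchor an agent needs for a follow-up targeted read, whichever format it asks for.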