Add Crossref no-auth chat tools app#7486
Conversation
Greptile SummaryThis PR adds a new standalone no-auth Crossref integration plugin under
Confidence Score: 2/5Not safe to merge — the tool endpoints will fail on every real Omi call, the response model is mismatched, and there is an unhandled crash path plus a path-traversal vector in the DOI lookup. The three core tool handlers will never execute successfully when called by the Omi platform: parameters are expected in the JSON body but are declared as URL query strings, so FastAPI rejects every request with 422. Even if that were fixed, the ChatToolResponse schema uses
|
| Filename | Overview |
|---|---|
| plugins/omi-crossref-app/main.py | Core FastAPI app with four bugs: POST tool endpoints use Query params instead of JSON body (won't work with Omi), unhandled IndexError in date extraction, unsanitised DOI path interpolation, and non-standard tools manifest URL. |
| plugins/omi-crossref-app/models.py | ChatToolResponse defines only a message field; the Omi platform expects result and error fields, so responses will be unrecognised by the client. |
| plugins/omi-crossref-app/requirements.txt | Pinned dependencies (fastapi, uvicorn, httpx, pydantic) with no obvious issues. |
| plugins/omi-crossref-app/Procfile | Standard Procfile for Heroku/Railway deployment; no issues. |
| plugins/omi-crossref-app/railway.toml | Standard Railway deployment config using Nixpacks; no issues. |
| plugins/omi-crossref-app/README.md | Minimal README covering local run and deploy; accurate given the current implementation. |
Reviews (1): Last reviewed commit: "Add Crossref no-auth chat tools plugin" | Re-trigger Greptile
| @app.post("/tools/search_crossref_works", response_model=ChatToolResponse) | ||
| async def search_crossref_works( | ||
| query: str = Query(..., min_length=2), | ||
| max_results: int = Query(5, ge=1, le=10), | ||
| ): | ||
| limited = clamp_max_results(max_results) | ||
| try: | ||
| payload = await crossref_get( | ||
| "/works", | ||
| {"query": query, "rows": limited, "sort": "relevance", "order": "desc"}, | ||
| ) | ||
| except Exception as exc: | ||
| return ChatToolResponse(message=f"Crossref request failed: {exc}") |
There was a problem hiding this comment.
POST tool endpoints read from query string, not request body
All three tool endpoints are declared as @app.post(...) but bind parameters with Query(...), so FastAPI reads them from the URL query string. The Omi platform calls chat-tool endpoints with a POST request and a JSON body (e.g., {"query": "...", "max_results": 5}). Every request will return a 422 Unprocessable Entity because the required parameters are absent from the query string. The established pattern in this repo (see omi-twitter-chat-tools-app/main.py) is body = await request.json() with manual extraction — or a Pydantic body model — not Query.
| class ChatToolResponse(BaseModel): | ||
| message: str |
There was a problem hiding this comment.
ChatToolResponse schema diverges from the Omi contract
Every other Omi chat-tools app in this repo (Twitter, GitHub, etc.) defines ChatToolResponse with result: Optional[str] and error: Optional[str] fields. This model exposes only a message: str field, so the Omi client will receive a payload whose field name it does not recognise. Success and error conditions are also collapsed into the same field, making it impossible for the caller to distinguish between them programmatically.
| lines = [f"Top {len(items)} Crossref results for '{query}':"] | ||
| for idx, item in enumerate(items, 1): | ||
| title = clean((item.get("title") or ["Untitled"])[0]) | ||
| doi = clean(item.get("DOI")) | ||
| year = clean(((item.get("published-print") or item.get("published-online") or {}).get("date-parts", [[""]]))[0][0]) | ||
| lines.append(f"{idx}. {title} ({year})") | ||
| lines.append(f" DOI: {doi}") | ||
| return ChatToolResponse(message="\n".join(lines)) |
There was a problem hiding this comment.
Unhandled
IndexError in date-parts extraction
The year extraction (...get("date-parts", [[""]]))[0][0] is placed outside any try/except block (the try only wraps the HTTP call). If Crossref returns "date-parts": [] (an empty outer list) or "date-parts": [[]] (an empty inner list), Python raises an IndexError that propagates as an unhandled 500 response. The same pattern is repeated identically at lines 126 and 166.
| return ChatToolResponse(message="Invalid DOI format. Example: 10.1038/nphys1170") | ||
|
|
||
| try: | ||
| payload = await crossref_get(f"/works/{normalized}", {}) |
There was a problem hiding this comment.
Unsanitised DOI inserted into URL path allows path traversal within Crossref API
normalized is inserted directly into f"/works/{normalized}". If a caller passes anything/../funders/10.13039/501100000780, httpx normalises the path and the request resolves to https://api.crossref.org/funders/10.13039/501100000780 — a completely different Crossref endpoint. The only guard ("/" not in normalized) actually makes this worse: it blocks valid single-segment inputs but allows any value that already contains a slash, including ../../members. The fix is to URL-encode the DOI component before interpolation (e.g., urllib.parse.quote(normalized, safe="")), or use a dedicated path-parameter encoding step.
| @app.get("/tools") | ||
| async def tools(): | ||
| return { | ||
| "tools": [ | ||
| { | ||
| "name": "search_crossref_works", | ||
| "description": "Search scholarly works by keyword via Crossref", | ||
| "parameters": { | ||
| "query": {"type": "string", "description": "Search keyword(s)"}, | ||
| "max_results": { | ||
| "type": "integer", | ||
| "description": "Number of results (1-10)", | ||
| "default": 5, | ||
| }, | ||
| }, | ||
| }, | ||
| { | ||
| "name": "get_crossref_work", | ||
| "description": "Get details for a specific work by DOI", | ||
| "parameters": { | ||
| "doi": { | ||
| "type": "string", | ||
| "description": "DOI, e.g. 10.1038/nphys1170", | ||
| } | ||
| }, | ||
| }, | ||
| { | ||
| "name": "get_crossref_works_by_author", | ||
| "description": "Find recent works for an author name", | ||
| "parameters": { | ||
| "author": {"type": "string", "description": "Author name"}, | ||
| "max_results": { | ||
| "type": "integer", | ||
| "description": "Number of results (1-10)", | ||
| "default": 5, | ||
| }, | ||
| }, | ||
| }, | ||
| ] | ||
| } |
There was a problem hiding this comment.
Tools manifest at
/tools deviates from the Omi convention
The existing omi-twitter-chat-tools-app exposes its manifest at GET /.well-known/omi-tools.json, which matches the Omi documentation for chat-tool discovery. Using a non-standard path (/tools) means the Omi backend may not be able to discover or register these tools automatically.
| def clamp_max_results(value: int) -> int: | ||
| return max(1, min(10, value)) |
There was a problem hiding this comment.
Redundant
clamp_max_results helper
max_results is already constrained by FastAPI's Query(5, ge=1, le=10), so clamp_max_results is dead code — it can never receive a value outside 1–10 by the time it's called.
| def clamp_max_results(value: int) -> int: | |
| return max(1, min(10, value)) | |
| def clamp_max_results(value: int) -> int: | |
| # FastAPI Query(ge=1, le=10) already enforces bounds; kept as a safety no-op. | |
| return max(1, min(10, value)) |
|
Addressed the Greptile findings in commit 3fd644d:
Validation run: Please re-run review on latest head. |
/claim #3120
Adds a standalone no-auth Crossref integration app under
plugins/omi-crossref-app.Included chat tools:
search_crossref_works(keyword search, 1-10 results)get_crossref_work(DOI lookup with basic DOI format validation)get_crossref_works_by_author(recent-author works, 1-10 results)Implementation notes:
httpx.AsyncClientin FastAPI handlers (no blockingrequestsin async paths).Procfile,railway.toml) and README for local/deploy usage.Validation run:
python3 -m py_compile plugins/omi-crossref-app/main.py plugins/omi-crossref-app/models.pyIf accepted as one of the #3120 integration-app bounties, I can follow the payout flow after merge.