Skip to content

Add Crossref no-auth chat tools app#7486

Open
liangtovi-debug wants to merge 2 commits into
BasedHardware:mainfrom
liangtovi-debug:feat/crossref-chat-tools-3120
Open

Add Crossref no-auth chat tools app#7486
liangtovi-debug wants to merge 2 commits into
BasedHardware:mainfrom
liangtovi-debug:feat/crossref-chat-tools-3120

Conversation

@liangtovi-debug
Copy link
Copy Markdown

/claim #3120

Adds a standalone no-auth Crossref integration app under plugins/omi-crossref-app.

Included chat tools:

  • search_crossref_works (keyword search, 1-10 results)
  • get_crossref_work (DOI lookup with basic DOI format validation)
  • get_crossref_works_by_author (recent-author works, 1-10 results)

Implementation notes:

  • Uses async HTTP via httpx.AsyncClient in FastAPI handlers (no blocking requests in async paths).
  • Includes defensive input validation and API error handling.
  • Includes deploy files (Procfile, railway.toml) and README for local/deploy usage.

Validation run:

  • python3 -m py_compile plugins/omi-crossref-app/main.py plugins/omi-crossref-app/models.py

If accepted as one of the #3120 integration-app bounties, I can follow the payout flow after merge.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 25, 2026

Greptile Summary

This PR adds a new standalone no-auth Crossref integration plugin under plugins/omi-crossref-app that exposes three chat tools for searching scholarly works, looking up a DOI, and listing works by author.

  • The three @app.post tool endpoints bind parameters with FastAPI Query(...), meaning FastAPI reads them from the URL query string. The Omi platform delivers tool parameters in the POST JSON body (as every other plugin in this repo demonstrates), so all tool calls will receive 422 errors and never execute.
  • ChatToolResponse declares a message: str field, while the Omi-standard model uses result: Optional[str] / error: Optional[str]; the response payload won't be parsed correctly by the client.
  • The date-parts extraction (date-parts[0][0]) sits outside the try/except block and raises an unhandled IndexError on any Crossref item that returns an empty outer or inner list — a real edge case in the API.
  • The DOI value is interpolated unsanitised into the HTTP path; a crafted value such as foo/../funders/10.13039 causes httpx to resolve to a completely different Crossref endpoint.

Confidence Score: 2/5

Not safe to merge — the tool endpoints will fail on every real Omi call, the response model is mismatched, and there is an unhandled crash path plus a path-traversal vector in the DOI lookup.

The three core tool handlers will never execute successfully when called by the Omi platform: parameters are expected in the JSON body but are declared as URL query strings, so FastAPI rejects every request with 422. Even if that were fixed, the ChatToolResponse schema uses message rather than the result/error fields the client parses. The date extraction code has an uncaught IndexError on real-world Crossref data, and the DOI handler allows path traversal to unintended Crossref endpoints. Deployment files and README are clean, but the two key Python files need substantial rework before the integration can function.

main.py and models.py both need rework before any tool call can succeed.

Security Review

  • Path traversal (SSRF within Crossref domain)plugins/omi-crossref-app/main.py line 117: the caller-supplied doi parameter is inserted verbatim into the URL path (f"/works/{normalized}"). A value containing ../ sequences (e.g., foo/../funders/...) causes httpx to resolve to an unintended Crossref API endpoint. While the impact is scoped to api.crossref.org, it allows callers to reach arbitrary Crossref endpoints beyond /works/{doi}.

Important Files Changed

Filename Overview
plugins/omi-crossref-app/main.py Core FastAPI app with four bugs: POST tool endpoints use Query params instead of JSON body (won't work with Omi), unhandled IndexError in date extraction, unsanitised DOI path interpolation, and non-standard tools manifest URL.
plugins/omi-crossref-app/models.py ChatToolResponse defines only a message field; the Omi platform expects result and error fields, so responses will be unrecognised by the client.
plugins/omi-crossref-app/requirements.txt Pinned dependencies (fastapi, uvicorn, httpx, pydantic) with no obvious issues.
plugins/omi-crossref-app/Procfile Standard Procfile for Heroku/Railway deployment; no issues.
plugins/omi-crossref-app/railway.toml Standard Railway deployment config using Nixpacks; no issues.
plugins/omi-crossref-app/README.md Minimal README covering local run and deploy; accurate given the current implementation.

Reviews (1): Last reviewed commit: "Add Crossref no-auth chat tools plugin" | Re-trigger Greptile

Comment thread plugins/omi-crossref-app/main.py Outdated
Comment on lines +83 to +95
@app.post("/tools/search_crossref_works", response_model=ChatToolResponse)
async def search_crossref_works(
query: str = Query(..., min_length=2),
max_results: int = Query(5, ge=1, le=10),
):
limited = clamp_max_results(max_results)
try:
payload = await crossref_get(
"/works",
{"query": query, "rows": limited, "sort": "relevance", "order": "desc"},
)
except Exception as exc:
return ChatToolResponse(message=f"Crossref request failed: {exc}")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 POST tool endpoints read from query string, not request body

All three tool endpoints are declared as @app.post(...) but bind parameters with Query(...), so FastAPI reads them from the URL query string. The Omi platform calls chat-tool endpoints with a POST request and a JSON body (e.g., {"query": "...", "max_results": 5}). Every request will return a 422 Unprocessable Entity because the required parameters are absent from the query string. The established pattern in this repo (see omi-twitter-chat-tools-app/main.py) is body = await request.json() with manual extraction — or a Pydantic body model — not Query.

Comment thread plugins/omi-crossref-app/models.py Outdated
Comment on lines +4 to +5
class ChatToolResponse(BaseModel):
message: str
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 ChatToolResponse schema diverges from the Omi contract

Every other Omi chat-tools app in this repo (Twitter, GitHub, etc.) defines ChatToolResponse with result: Optional[str] and error: Optional[str] fields. This model exposes only a message: str field, so the Omi client will receive a payload whose field name it does not recognise. Success and error conditions are also collapsed into the same field, making it impossible for the caller to distinguish between them programmatically.

Comment thread plugins/omi-crossref-app/main.py Outdated
Comment on lines +100 to +107
lines = [f"Top {len(items)} Crossref results for '{query}':"]
for idx, item in enumerate(items, 1):
title = clean((item.get("title") or ["Untitled"])[0])
doi = clean(item.get("DOI"))
year = clean(((item.get("published-print") or item.get("published-online") or {}).get("date-parts", [[""]]))[0][0])
lines.append(f"{idx}. {title} ({year})")
lines.append(f" DOI: {doi}")
return ChatToolResponse(message="\n".join(lines))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Unhandled IndexError in date-parts extraction

The year extraction (...get("date-parts", [[""]]))[0][0] is placed outside any try/except block (the try only wraps the HTTP call). If Crossref returns "date-parts": [] (an empty outer list) or "date-parts": [[]] (an empty inner list), Python raises an IndexError that propagates as an unhandled 500 response. The same pattern is repeated identically at lines 126 and 166.

Comment thread plugins/omi-crossref-app/main.py Outdated
return ChatToolResponse(message="Invalid DOI format. Example: 10.1038/nphys1170")

try:
payload = await crossref_get(f"/works/{normalized}", {})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 security Unsanitised DOI inserted into URL path allows path traversal within Crossref API

normalized is inserted directly into f"/works/{normalized}". If a caller passes anything/../funders/10.13039/501100000780, httpx normalises the path and the request resolves to https://api.crossref.org/funders/10.13039/501100000780 — a completely different Crossref endpoint. The only guard ("/" not in normalized) actually makes this worse: it blocks valid single-segment inputs but allows any value that already contains a slash, including ../../members. The fix is to URL-encode the DOI component before interpolation (e.g., urllib.parse.quote(normalized, safe="")), or use a dedicated path-parameter encoding step.

Comment on lines +41 to +80
@app.get("/tools")
async def tools():
return {
"tools": [
{
"name": "search_crossref_works",
"description": "Search scholarly works by keyword via Crossref",
"parameters": {
"query": {"type": "string", "description": "Search keyword(s)"},
"max_results": {
"type": "integer",
"description": "Number of results (1-10)",
"default": 5,
},
},
},
{
"name": "get_crossref_work",
"description": "Get details for a specific work by DOI",
"parameters": {
"doi": {
"type": "string",
"description": "DOI, e.g. 10.1038/nphys1170",
}
},
},
{
"name": "get_crossref_works_by_author",
"description": "Find recent works for an author name",
"parameters": {
"author": {"type": "string", "description": "Author name"},
"max_results": {
"type": "integer",
"description": "Number of results (1-10)",
"default": 5,
},
},
},
]
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Tools manifest at /tools deviates from the Omi convention

The existing omi-twitter-chat-tools-app exposes its manifest at GET /.well-known/omi-tools.json, which matches the Omi documentation for chat-tool discovery. Using a non-standard path (/tools) means the Omi backend may not be able to discover or register these tools automatically.

Comment on lines +19 to +20
def clamp_max_results(value: int) -> int:
return max(1, min(10, value))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Redundant clamp_max_results helper

max_results is already constrained by FastAPI's Query(5, ge=1, le=10), so clamp_max_results is dead code — it can never receive a value outside 1–10 by the time it's called.

Suggested change
def clamp_max_results(value: int) -> int:
return max(1, min(10, value))
def clamp_max_results(value: int) -> int:
# FastAPI Query(ge=1, le=10) already enforces bounds; kept as a safety no-op.
return max(1, min(10, value))

@liangtovi-debug
Copy link
Copy Markdown
Author

Addressed the Greptile findings in commit 3fd644d:

  1. Switched all tool endpoints to accept JSON request bodies via Pydantic input models (instead of query params), matching Omi POST tool-call behavior.
  2. Updated ChatToolResponse to Omi-standard result / error fields.
  3. Added resilient extract_year() logic to avoid IndexError on sparse/empty date-parts.
  4. Hardened DOI lookup by URL-encoding DOI path segments and rejecting .. traversal patterns.
  5. Added /.well-known/omi-tools.json manifest endpoint with endpoint/method/parameter metadata.

Validation run: python3 -m py_compile plugins/omi-crossref-app/main.py plugins/omi-crossref-app/models.py (pass).

Please re-run review on latest head.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant