Add PubMed no-auth chat tools app #7485

Open
liangtovi-debug wants to merge 3 commits into BasedHardware:main from liangtovi-debug:feat/pubmed-chat-tools-3120

@liangtovi-debug liangtovi-debug commented May 24, 2026

Summary

Adds a new no-auth Omi integration app: plugins/omi-pubmed-app.

This app provides practical PubMed chat tools using NCBI E-utilities:

  • search_pubmed (query -> top PMIDs with citations)
  • get_pubmed_article (PMID -> citation + abstract)
  • get_related_pubmed (PMID -> related articles)

Why this fits #3120

  • New integration app bounty candidate
  • No OAuth/API key required
  • Includes manifest, tool endpoints, deploy files, and docs
  • Different scope from the recent #3120 ("create new app bounties") examples (HN, Wikipedia, StackOverflow, arXiv, weather, currency, holidays, food, earthquake)

Validation

  • python3 -m py_compile plugins/omi-pubmed-app/main.py plugins/omi-pubmed-app/models.py
  • Live API smoke checks against NCBI E-utilities:
    • esearch returns PMIDs for machine learning
    • esummary returns metadata for returned PMID

Bounty claim

/claim #3120

greptile-apps Bot commented May 24, 2026

Greptile Summary

Adds a new self-contained Omi plugin app that exposes three PubMed chat tools (search_pubmed, get_pubmed_article, get_related_pubmed) backed by the unauthenticated NCBI E-utilities API, deployable to Railway or Heroku-style hosts.

  • All three tool endpoints use the synchronous requests library inside async def FastAPI handlers, blocking the event loop on every NCBI call (up to 20 s per TIMEOUT); under concurrent load the server will stall.
  • The search_pubmed manifest advertises max_results as 1–10 but the implementation allows up to 20, and PMID inputs are not validated as numeric before being forwarded to NCBI.

Confidence Score: 3/5

The app works correctly for single sequential requests but will degrade significantly under any concurrent load due to blocking HTTP calls in async handlers.

Every tool endpoint makes one or two synchronous requests.get() calls inside async def handlers, tying up the event loop thread for up to 20 seconds per call. No other request can be processed during that window. The search_pubmed and get_related_pubmed endpoints each make two sequential blocking calls, so their worst-case stall is 40 seconds. This is a structural issue in main.py that affects all production traffic, not just an edge case.

plugins/omi-pubmed-app/main.py needs the most attention — the blocking I/O pattern and the manifest/implementation inconsistency for max_results are both there.

Important Files Changed

| Filename | Overview |
|---|---|
| plugins/omi-pubmed-app/main.py | Core FastAPI app with three PubMed tool endpoints; uses synchronous requests library inside async handlers, which blocks the event loop under concurrent load. Also has a manifest/implementation mismatch on max_results cap and no PMID format validation. |
| plugins/omi-pubmed-app/models.py | Minimal Pydantic response model; straightforward and correct. |
| plugins/omi-pubmed-app/requirements.txt | Pins FastAPI 0.104.1, uvicorn 0.24.0, requests 2.34.2, pydantic 2.5.2. No httpx dependency, which is needed if blocking I/O is replaced with async HTTP calls. |
| plugins/omi-pubmed-app/railway.toml | Standard Railway deployment config; looks correct. |
| plugins/omi-pubmed-app/Procfile | Valid Procfile for Heroku-style deployment. |
| plugins/omi-pubmed-app/README.md | Basic README with tool descriptions and local run instructions; adequate. |

Sequence Diagram

sequenceDiagram
    participant Omi as Omi Chat
    participant App as PubMed App (FastAPI)
    participant NCBI as NCBI E-utilities

    Omi->>App: POST /tools/search_pubmed
    App->>NCBI: GET esearch.fcgi (blocking)
    NCBI-->>App: idlist
    App->>NCBI: GET esummary.fcgi (blocking)
    NCBI-->>App: summaries
    App-->>Omi: ChatToolResponse

    Omi->>App: POST /tools/get_pubmed_article
    App->>NCBI: GET esummary.fcgi (blocking)
    NCBI-->>App: record
    App-->>Omi: ChatToolResponse

    Omi->>App: POST /tools/get_related_pubmed
    App->>NCBI: GET elink.fcgi (blocking)
    NCBI-->>App: linksets
    App->>NCBI: GET esummary.fcgi (blocking)
    NCBI-->>App: summaries
    App-->>Omi: ChatToolResponse

Reviews (1): Last reviewed commit: "Add PubMed no-auth chat tools plugin"

Comment thread plugins/omi-pubmed-app/main.py Outdated
"retmax": max(1, min(retmax, 20)),
"sort": "relevance",
}
resp = requests.get(f"{EUTILS}/esearch.fcgi", params=params, timeout=TIMEOUT)

P1 Synchronous HTTP calls block the async event loop

_search_ids, _fetch_summaries, and the direct requests.get call inside get_related_pubmed (line 211) are synchronous, but they are called from async def handlers. FastAPI runs on an asyncio event loop, so a synchronous requests.get that waits up to 20 seconds (TIMEOUT) blocks all concurrent requests from being processed. Under any concurrent load the server becomes unresponsive. Replace requests with httpx.AsyncClient and await the calls, or run the blocking helpers via asyncio.to_thread().
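
As a rough illustration of the asyncio.to_thread() option, the sketch below offloads a blocking call to a worker thread so that several handlers can wait at the same time. The names blocking_fetch, handler, and run_concurrently are hypothetical, and time.sleep stands in for the synchronous requests.get call; none of this is taken from the PR's main.py.

```python
import asyncio
import time


def blocking_fetch(delay: float) -> str:
    """Hypothetical stand-in for a synchronous HTTP call such as requests.get."""
    time.sleep(delay)  # simulates waiting on the network
    return "ok"


async def handler(delay: float) -> str:
    # Run the blocking function in a worker thread; the event loop stays
    # free to serve other requests while this one waits.
    return await asyncio.to_thread(blocking_fetch, delay)


async def run_concurrently(n: int, delay: float) -> tuple[list[str], float]:
    """Run n handlers at once and return their results plus wall-clock time."""
    start = time.perf_counter()
    results = await asyncio.gather(*(handler(delay) for _ in range(n)))
    return list(results), time.perf_counter() - start
```

Calling asyncio.run(run_concurrently(3, 0.2)) should finish in roughly the time of one call rather than three, because the waits overlap. Calling blocking_fetch directly inside handler would serialize them instead. asyncio.to_thread requires Python 3.9 or later.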

Comment thread plugins/omi-pubmed-app/main.py Outdated
"db": "pubmed",
"term": query,
"retmode": "json",
"retmax": max(1, min(retmax, 20)),

P2 max_results cap is inconsistent between manifest and implementation

The manifest advertises max_results as "1-10" for search_pubmed, but _search_ids clamps to min(retmax, 20), allowing up to 20. A caller who passes 20 therefore receives up to 20 results, twice the documented maximum. The cap should match the documented range.

Suggested change
- "retmax": max(1, min(retmax, 20)),
+ "retmax": max(1, min(retmax, 10)),

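One way to keep the cap in a single place is a small clamp helper shared by every tool that accepts max_results. The sketch below is only an illustration: the name clamp_max_results and the fallback default of 5 are assumptions, not values taken from the PR.

```python
def clamp_max_results(value, default: int = 5, low: int = 1, high: int = 10) -> int:
    """Coerce value to an int within [low, high].

    Falls back to `default` (an assumed value) when the input is missing
    or cannot be converted to an integer.
    """
    try:
        number = int(value)
    except (TypeError, ValueError, OverflowError):
        number = default
    return max(low, min(number, high))
```

With the manifest range of 1-10, clamp_max_results(20) yields 10 and clamp_max_results(0) yields 1, so the documented and enforced limits cannot drift apart.
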
Comment thread plugins/omi-pubmed-app/main.py Outdated
Comment on lines +179 to +183
pmid = (body.get("pmid") or "").strip()
if not pmid:
return ChatToolResponse(error="pmid is required")

summaries = _fetch_summaries([pmid])

P2 No PMID format validation

pmid is accepted as any arbitrary string and forwarded directly to the NCBI API. A non-numeric value (e.g. "abc") results in an NCBI API error that surfaces only as a generic "Failed to fetch" message. Validating that the value is numeric before the network call gives the caller a clearer error and avoids a wasted round-trip.

Suggested change
  pmid = (body.get("pmid") or "").strip()
  if not pmid:
      return ChatToolResponse(error="pmid is required")
+ if not pmid.isdigit():
+     return ChatToolResponse(error="pmid must be a numeric PubMed ID")
  summaries = _fetch_summaries([pmid])
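
A possible stricter variant of this check is sketched below. It is not part of the suggested change: the helper name normalize_pmid is hypothetical. It addresses two edge cases. str.isdigit() accepts non-ASCII digits such as "²", and calling .strip() on the raw value fails if the caller sends the PMID as a JSON number rather than a string.

```python
import re
from typing import Optional

# ASCII digits only; str.isdigit() would also accept characters like "²".
_PMID_PATTERN = re.compile(r"[0-9]+")


def normalize_pmid(raw: object) -> Optional[str]:
    """Return the PMID as an ASCII digit string, or None if missing or malformed."""
    if raw is None or isinstance(raw, bool):
        return None
    text = str(raw).strip()  # tolerates ints as well as strings
    return text if _PMID_PATTERN.fullmatch(text) else None
```

A handler could then return the "pmid must be a numeric PubMed ID" error whenever normalize_pmid(body.get("pmid")) is None, before making any network call.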

Comment thread plugins/omi-pubmed-app/main.py Outdated
Comment on lines +211 to +229
link_resp = requests.get(
f"{EUTILS}/elink.fcgi",
params={
"dbfrom": "pubmed",
"db": "pubmed",
"id": pmid,
"linkname": "pubmed_pubmed",
"retmode": "json",
},
timeout=TIMEOUT,
)
link_resp.raise_for_status()
data = link_resp.json()
linksets = data.get("linksets", [])
related = []
if linksets:
dbs = linksets[0].get("linksetdbs", [])
if dbs:
related = [str(x) for x in dbs[0].get("links", [])[: max(1, min(max_results, 10))]]

P2 NCBI rate-limit errors are silently swallowed

NCBI E-utilities limits unauthenticated callers to 3 requests per second. Each call to get_related_pubmed makes two sequential NCBI requests (elink + esummary), and search_pubmed makes two as well. When the limit is exceeded, NCBI returns HTTP 429, raise_for_status() raises an exception, and the broad except clause returns a generic "Failed to fetch related PubMed articles: …" with no retry guidance. Adding a specific check for requests.exceptions.HTTPError with status 429 and returning a user-friendly "rate limited, try again" message would give callers actionable feedback.
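
The status-specific message could be produced by a small pure function, sketched below under stated assumptions. The name describe_ncbi_failure and the message wording are hypothetical, and whether NCBI includes a Retry-After header on a 429 response is an assumption, so the header is treated as optional.

```python
from typing import Optional


def describe_ncbi_failure(status_code: int, retry_after: Optional[str] = None) -> str:
    """Map an HTTP status from NCBI to a message a chat user can act on."""
    if status_code == 429:
        # Only the delay-in-seconds form of Retry-After is handled here;
        # an HTTP-date value falls through to the generic hint.
        if retry_after and retry_after.isdigit():
            hint = f" Retry after {retry_after} seconds."
        else:
            hint = " Please try again in a few seconds."
        return "PubMed rate limit reached." + hint
    if 500 <= status_code < 600:
        return "PubMed is temporarily unavailable. Please try again later."
    return f"PubMed request failed with HTTP {status_code}."
```

In a handler, the status code would come from the caught exception's response attribute, which both requests.exceptions.HTTPError and httpx.HTTPStatusError expose, so the function works with either client.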

@liangtovi-debug
Author

Addressed the review points in follow-up commits:

  • Replaced blocking requests calls in async handlers with httpx.AsyncClient to avoid event-loop blocking under concurrency.
  • Aligned max_results handling to the advertised 1-10 range via a shared clamp helper.
  • Added numeric PMID validation (pmid.isdigit()) for get_pubmed_article and get_related_pubmed.
  • Updated dependencies accordingly (requests -> httpx).

Commits pushed to this PR branch:

Validation rerun: python3 -m py_compile on updated main.py.
