
feat: add Tavily Extract as content extraction option in web_fetch #3

Open
tavily-integrations wants to merge 1 commit into sonpiaz:main from Tavily-FDE:feat/tavily-migration/web-fetch-tavily-extract

@tavily-integrations

Summary

  • Adds the Tavily Extract API as an optional content extraction provider in web_fetch, selected via the FETCH_PROVIDER environment variable
  • When FETCH_PROVIDER=tavily, Tavily Extract handles content extraction directly and falls back to Readability on failure
  • With the default Readability provider, Tavily acts as a middle-tier fallback for low-quality extractions (fewer than 200 characters) before Apify, reducing reliance on the paid Apify fallback
  • Uses extractDepth: "advanced" and format: "markdown" for best results on JS-heavy pages
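
As a rough illustration of the low-quality check and the extract options described above, here is a minimal sketch. The names MIN_CONTENT_CHARS, TAVILY_EXTRACT_OPTIONS, and isLowQuality are hypothetical and do not come from the PR; only the 200-character threshold and the two option values are taken from the summary. Trimming whitespace before measuring length is also an assumption.

```typescript
// Hypothetical names; only the threshold (200) and the two option values
// below are taken from the PR summary.
const MIN_CONTENT_CHARS = 200;

interface TavilyExtractOptions {
  extractDepth: "basic" | "advanced";
  format: "markdown" | "text";
}

// Options the summary says are passed to Tavily Extract.
const TAVILY_EXTRACT_OPTIONS: TavilyExtractOptions = {
  extractDepth: "advanced",
  format: "markdown",
};

// Treat missing or very short extractions as low quality.
// Assumption: whitespace is trimmed before the length is measured.
function isLowQuality(content: string | null | undefined): boolean {
  return (content ?? "").trim().length < MIN_CONTENT_CHARS;
}
```

Under these assumptions, a Readability result for which isLowQuality returns true would be the trigger for trying Tavily before Apify.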

Files changed

  • tools/web-fetch/index.ts — Added fetchViaTavily() function, Tavily extract branch in execute(), low-quality Readability fallback to Tavily, updated tool description
  • .env.example — Documented TAVILY_API_KEY and FETCH_PROVIDER variables
  • package.json — Added @tavily/core ^0.7.2 dependency

Dependency changes

  • Added @tavily/core ^0.7.2 to dependencies

Environment variable changes

  • TAVILY_API_KEY — Shared with web-search-tavily unit (required for Tavily extract)
  • FETCH_PROVIDER — Optional; values: readability (default) | tavily | apify
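
One way these two variables might be read is sketched below. The function resolveFetchConfig and the FetchConfig shape are hypothetical, and falling back to readability for an unrecognized FETCH_PROVIDER value is an assumption; the PR only states that readability is the default.

```typescript
type FetchProvider = "readability" | "tavily" | "apify";

const PROVIDERS: readonly FetchProvider[] = ["readability", "tavily", "apify"];

// Hypothetical config shape, not taken from the PR.
interface FetchConfig {
  provider: FetchProvider;
  tavilyApiKey?: string;
}

function resolveFetchConfig(env: Record<string, string | undefined>): FetchConfig {
  const raw = (env["FETCH_PROVIDER"] ?? "").trim().toLowerCase();
  // Assumption: unknown or empty values fall back to the documented default.
  const provider = PROVIDERS.find((p) => p === raw) ?? "readability";
  // An empty or whitespace-only key is treated as "not configured".
  const tavilyApiKey = (env["TAVILY_API_KEY"] ?? "").trim() || undefined;
  return { provider, tavilyApiKey };
}
```

In a Node.js process this could be called as resolveFetchConfig(process.env); the Tavily paths would then be skipped whenever tavilyApiKey is undefined.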

Notes for reviewers

  • All existing providers (Mozilla Readability, GetXAPI for X/Twitter, Apify) remain fully intact
  • Tavily import is lazy (await import(...)) to avoid loading the SDK when not needed
  • The fallback chain is: X/Twitter → Tavily (if provider=tavily) → Readability → Tavily (if low-quality + key available) → Apify (if js_render)
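
The lazy loading and fallback order in these notes can be sketched roughly as follows. This is not the code in tools/web-fetch/index.ts: every name here (Extractors, lazyOnce, attempt, fetchContent) is hypothetical, the extractors are injected as plain functions so the ordering can be read in isolation, and the X/Twitter URL check is a simplified guess. The sketch also assumes the middle-tier Tavily step is skipped when Tavily was already tried as the primary provider, and it leaves out FETCH_PROVIDER=apify because the description does not say where that value sits in the chain.

```typescript
type Extractor = (url: string) => Promise<string | null>;

interface Extractors {
  x: Extractor;           // X/Twitter path
  tavily?: Extractor;     // undefined when no Tavily key is configured
  readability: Extractor;
  apify: Extractor;
}

interface FetchOptions {
  provider: "readability" | "tavily";
  jsRender: boolean;
}

interface FetchResult {
  content: string | null;
  source: "x" | "tavily" | "readability" | "apify";
}

const MIN_CHARS = 200; // low-quality threshold from the PR summary

// Memoize an async loader so a costly dynamic import runs at most once.
function lazyOnce<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) cached = load();
    return cached;
  };
}

// Simplified host check; the real detection logic is not shown in the PR.
const isXUrl = (url: string): boolean =>
  /^https?:\/\/(www\.)?(x|twitter)\.com(\/|$)/i.test(url);

// Run an extractor, treating a missing extractor or a thrown error as "no content".
async function attempt(fn: Extractor | undefined, url: string): Promise<string | null> {
  if (!fn) return null;
  try {
    return await fn(url);
  } catch {
    return null;
  }
}

async function fetchContent(url: string, opts: FetchOptions, ex: Extractors): Promise<FetchResult> {
  if (isXUrl(url)) return { content: await attempt(ex.x, url), source: "x" };

  if (opts.provider === "tavily") {
    const primary = await attempt(ex.tavily, url);
    if (primary) return { content: primary, source: "tavily" };
  }

  const readable = await attempt(ex.readability, url);
  if (readable !== null && readable.length >= MIN_CHARS) {
    return { content: readable, source: "readability" };
  }

  // Middle tier: only when Tavily was not already tried as the primary provider.
  if (opts.provider !== "tavily") {
    const viaTavily = await attempt(ex.tavily, url);
    if (viaTavily) return { content: viaTavily, source: "tavily" };
  }

  if (opts.jsRender) {
    const rendered = await attempt(ex.apify, url);
    if (rendered) return { content: rendered, source: "apify" };
  }

  // Nothing better was found: return whatever Readability produced.
  return { content: readable, source: "readability" };
}
```

In a setup like this, the tavily extractor could wrap something like lazyOnce(() => import("@tavily/core")), so the SDK is loaded only the first time a Tavily path is actually reached. Note that this simple memoizer also caches a rejected promise, so a failed import would not be retried.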

🤖 Generated with Claude Code

Automated Review

  • Passed after 1 attempt
  • Final review: The implementation correctly adds Tavily Extract as an optional content extraction provider in web_fetch. The Tavily SDK is properly integrated with dynamic import (consistent with the existing Apify pattern), fallback logic is sound, dependencies are updated, and environment variables are documented. No critical or major issues found.
