Microservices that feed structured knowledge into GBrain (or any Obsidian-compatible vault).
GBrain handles storage, search, enrichment, and back-linking. These feeders watch external sources and push structured markdown into the Obsidian vault that GBrain syncs from.
Originally built to track an on-chain protocol's programs, SDKs, GitHub activity, Twitter/X handles, and internal Notion docs. The same pattern works for any team that wants a single searchable corpus over scattered sources.
    External Sources      Feeders               Obsidian Vault        GBrain
    ─────────────────     ─────────────────     ─────────────────     ─────────────
    Solana Programs   ──→ idl-watcher       ──→ ants/programs/    ──→ Search
    GitHub Repos      ──→ github-feed       ──→ ants/changelog/   ──→ Enrichment
    SDKs              ──→ sdk-feed          ──→ ants/sdks/        ──→ Back-links
    On-chain Data     ──→ analytics-feed    ──→ ants/analytics/   ──→ Skills
    Twitter / X       ──→ twitter-feed      ──→ ants/twitter/     ──→ Cron Jobs
    Notion Docs       ──→ notion-feed       ──→ ants/notion/      ──→ Search
    Public Docs       ──→ docs-feed         ──→ ants/docs/        ──→ ...
    Meetings          ──→ meeting-feed      ──→ ants/meetings/    ──→ ...

Each feeder writes Obsidian-compatible markdown with YAML frontmatter. GBrain's sync picks up changes, chunks content, embeds vectors, and makes everything searchable via MCP.
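
As a rough illustration of what "Obsidian-compatible markdown with YAML frontmatter" can look like, here is a minimal sketch of a page renderer. The `FeederPage` shape and its field names (`title`, `entity`, `source`, `updated`, `tags`) are assumptions made for this example, not the schema the feeders actually use.

```typescript
// Hypothetical page shape; the real feeders may use different fields.
interface FeederPage {
  title: string;
  entity: string;
  source: string; // which feeder produced the page
  updated: string; // ISO 8601 date
  tags: string[];
  body: string; // markdown body
}

// JSON string literals are valid YAML double-quoted scalars,
// so JSON.stringify is a cheap way to quote values safely.
function yamlString(value: string): string {
  return JSON.stringify(value);
}

// Render a page as YAML frontmatter followed by the markdown body.
function renderPage(page: FeederPage): string {
  const frontmatter = [
    "---",
    `title: ${yamlString(page.title)}`,
    `entity: ${yamlString(page.entity)}`,
    `source: ${yamlString(page.source)}`,
    `updated: ${yamlString(page.updated)}`,
    `tags: [${page.tags.map(yamlString).join(", ")}]`,
    "---",
  ].join("\n");
  return `${frontmatter}\n\n${page.body.trimEnd()}\n`;
}
```

Quoting every scalar keeps titles containing colons or hashes from being misread as YAML syntax.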
| Feeder | Status | Description |
|---|---|---|
| idl-watcher | ✅ | Watches on-chain Anchor programs, renders IDL to markdown, detects changes |
| github-feed | ✅ | Tracks merged PRs, releases, and issues across configured repos |
| sdk-feed | ✅ | Parses SDK public API surface, tracks breaking changes, maps methods → IDL instructions |
| docs-feed | ✅ | Syncs public documentation, flags stale docs |
| twitter-feed | ✅ | Tracks tweets from configured handles plus mentions |
| notion-feed | ✅ | Indexes internal Notion design docs, QA plans, FE/BE specs, comms, and milestones. Needs `NOTION_TOKEN` plus `[notion]` allowlists. |
| analytics-feed | ✅ | Pulls product metrics from Amplitude via declarative event-segmentation queries (DAU, volume, on-chain txs) per entity. Needs `AMPLITUDE_API_KEY` + `AMPLITUDE_SECRET_KEY`. |
| meeting-feed | 🚧 | Transcribes meetings via whisper.cpp, extracts decisions |

    cp .env.example .env                           # fill in keys + paths
    cp ants.config.example.toml ants.config.toml   # define what to track
    bun install

Two files drive everything:

- `.env`: secrets and per-machine paths (vault root, cache dir, API keys)
- `ants.config.toml`: the registry of what to track

Both files are gitignored; commit only the `.example` versions.

The config layout is hybrid: real products group their program + SDKs + tagged repos under one block; standalone repos and handles stay flat.

    [products.<slug>]                 # product metadata (title, description)
    [products.<slug>.program]         # → idl-watcher (on-chain program)
    [products.<slug>.sdks.<role>]     # → sdk-feed (one entry per SDK lang)
    [products.<slug>.repos.<role>]    # → github-feed (one entry per tagged repo)
    [repos.<slug>]                    # → github-feed (standalone repos)
    [handles.<slug>]                  # → twitter-feed (handle snapshots)
    [notion]                          # → notion-feed allowlists + default queries

Adding a new product is one block: paste `[products.foo]`, then any combination of `[products.foo.program]`, `[products.foo.sdks.ts]`, and `[products.foo.repos.<role>]`. Repos that aren't tagged to a product (infra, services, third-party forks) live at the top level under `[repos.<slug>]`.

Auto-derived entity slugs: <product> for the program, <product>-sdk for the TS SDK, <product>-<role> for everything else. Override per-subsection with entity = "..." to keep an existing vault page name. Operational config (metrics, TTLs, whisper glossary) and secrets (.env) stay where they are. Long-running daemons pick up edits on restart. Override the config path with ANTS_CONFIG.
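
The slug rules above can be sketched as a small pure function. This is an illustration only: the `SlugInput` type and `deriveEntitySlug` name are hypothetical, and the sketch assumes the TS SDK is the `sdks` entry whose role key is `ts` (as in `[products.foo.sdks.ts]`).

```typescript
// Which subsection of a [products.<slug>] block an entry came from.
type Section = "program" | "sdks" | "repos";

// Hypothetical input shape for this sketch.
interface SlugInput {
  product: string; // the <slug> in [products.<slug>]
  section: Section;
  role?: string; // the <role> key, e.g. "ts" or "frontend"
  entity?: string; // explicit `entity = "..."` override, if set
}

function deriveEntitySlug(input: SlugInput): string {
  // An explicit override always wins, so existing vault page names survive.
  if (input.entity) return input.entity;
  // The program takes the bare product slug.
  if (input.section === "program") return input.product;
  // Assumption: the TS SDK is identified by the role key "ts".
  if (input.section === "sdks" && input.role === "ts") return `${input.product}-sdk`;
  // Everything else is <product>-<role>.
  if (!input.role) throw new Error(`missing role for ${input.product}.${input.section}`);
  return `${input.product}-${input.role}`;
}
```

For a product `foo`, this yields `foo` for the program, `foo-sdk` for the TS SDK, and `foo-frontend` for a repo tagged with the role `frontend`.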
notion-feed is deny-by-default. Set NOTION_TOKEN to a Notion integration token with read access to the workspace pages/databases you want indexed, then add those IDs to [notion].allowed_teamspaces or [notion].allowed_databases in ants.config.toml. Empty allowlists return a clean no-op. The orchestrator also runs a startup handshake; if the token is present but invalid or unreachable, Notion is disabled for that session instead of wedging later calls.
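
A deny-by-default allowlist check might look like the sketch below. The `NotionHit` shape is a simplified assumption for this example (real Notion search results carry a richer `parent` object), and stripping dashes from IDs is a convenience added here because Notion IDs appear both with and without them.

```typescript
// Mirrors the two [notion] allowlist keys named above.
interface NotionAllowlists {
  allowed_teamspaces: string[];
  allowed_databases: string[];
}

// Hypothetical, simplified view of a search hit.
interface NotionHit {
  id: string;
  teamspaceId?: string;
  databaseId?: string;
}

// Compare IDs case-insensitively and without dashes.
function normalizeId(id: string): string {
  return id.replace(/-/g, "").toLowerCase();
}

// Deny by default: a hit passes only if one of its parents is listed.
function isAllowed(hit: NotionHit, lists: NotionAllowlists): boolean {
  const teamspaces = new Set(lists.allowed_teamspaces.map(normalizeId));
  const databases = new Set(lists.allowed_databases.map(normalizeId));
  if (hit.teamspaceId && teamspaces.has(normalizeId(hit.teamspaceId))) return true;
  if (hit.databaseId && databases.has(normalizeId(hit.databaseId))) return true;
  return false;
}

// With empty allowlists this returns an empty array: a clean no-op.
function filterHits(hits: NotionHit[], lists: NotionAllowlists): NotionHit[] {
  return hits.filter((hit) => isAllowed(hit, lists));
}
```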
| Command | Description |
|---|---|
| `bun run idl-watcher:sync` | Snapshot every configured program's IDL from on-chain and write to the vault |
| `bun run idl-watcher:pr <url\|owner/repo#N>` | Compare a PR's IDL against main and render the diff |
| `bun run github-feed:sync [entity]` | Pull merged PRs, releases, and issues for all repos (or one) |
| `bun run sdk-feed:sync` | Snapshot every SDK's public API surface from main |
| `bun run sdk-feed:pr <url\|owner/repo#N>` | Compare an SDK PR's surface against base and flag breaking changes |
| `bun run docs-feed` | Sync public docs into the vault (incremental, uses cache) |
| `bun run docs-feed:full` | Full docs sync: ignore cache, fetch everything |
| `bun run docs-feed:dry-run` | Report docs changes without writing to the vault |
| `bun run analytics-feed:sync` | Pull product metrics from Amplitude and snapshot into the vault |
| `bun run analytics-feed:dry-run` | List registered metrics without hitting Amplitude or writing |
| `bun run meeting-feed:file <path>` | Transcribe a Craig recording (.zip or directory of .flac) into a meeting page |
| `bun run meeting-feed:watch [dir]` | Watch `$ANTS_RECORDINGS/` (or arg) and auto-ingest every new .zip |
| `bun run meeting-feed:eval` | Score transcription quality (WER) against hand-corrected goldens. See feeders/meeting-feed/eval-README.md |
| `bun run twitter-feed:sync [entity]` | Snapshot tweets for all tracked handles (or one) plus configured mentions |
| `bun run twitter-feed:mentions` | Refresh just the mentions aggregator |
| `bun run notion-feed:sync [entity]` | Search configured Notion queries (or one entity term) and write scoped summaries under `ants/notion/` |
| `bun run notion-feed:diag [query...]` | Dump raw Notion /search results with title + parent type/ID so you can see what UUIDs to add to `[notion].allowed_teamspaces` / `allowed_databases`. Does not write to the vault. |
| `bun run notion-feed:clean [--yes\|--dry-run]` | One-shot reset of `<vault>/ants/notion/`. Run when scope inference or the slugifier changes and stale files would otherwise pile up. |

When transcripts come back wrong, you have five knobs:
- Glossary: add domain terms (product names, partner names, team handles) to `feeders/shared/whisper-prompt.toml`. Whisper biases toward words it has seen in the prompt.
- Per-meeting context: pass `--context "discussion of X"` to `meeting-feed:file`. Layered on top of the glossary, just for that meeting.
- Team handles: uncomment and fill in real Discord/Slack display names in `whisper-prompt.toml`'s glossary list.
- Model swap: point `WHISPER_MODEL` at a larger model (`ggml-large-v3.bin` instead of turbo) for higher accuracy at slower speed.
- Eval before/after: `bun run meeting-feed:eval` re-runs the pipeline against fixed samples and reports WER deltas. Without this you're tuning blind. See feeders/meeting-feed/eval-README.md.

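
For readers unfamiliar with WER (word error rate), the metric is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The sketch below shows the standard computation; it is not the eval harness's actual code, and the ASCII-only tokenizer and the handling of an empty reference are simplifying assumptions.

```typescript
// Lowercase, strip punctuation, and split on whitespace.
// ASCII-only for simplicity; a real tokenizer would handle Unicode.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9'\s]/g, " ")
    .split(/\s+/)
    .filter(Boolean);
}

// WER = (substitutions + deletions + insertions) / reference word count,
// computed as a word-level Levenshtein distance.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = tokenize(reference);
  const hyp = tokenize(hypothesis);
  // WER is undefined for an empty reference; this sketch picks 0 or 1.
  if (ref.length === 0) return hyp.length === 0 ? 0 : 1;

  // prev[j] holds the distance between the reference words consumed
  // so far and the first j hypothesis words.
  let prev: number[] = [];
  for (let j = 0; j <= hyp.length; j++) prev.push(j);

  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr.push(
        Math.min(
          prev[j] + 1, // deletion: reference word missing from hypothesis
          curr[j - 1] + 1, // insertion: extra word in hypothesis
          prev[j - 1] + cost, // match or substitution
        ),
      );
    }
    prev = curr;
  }
  return prev[hyp.length] / ref.length;
}
```

A WER delta is then the difference between this value before and after a change such as a glossary edit or a model swap, measured on the same fixed samples.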
The orchestrator (feeders/orchestrator/main.ts) is the MCP server that
exposes research plus per-feeder tools to any MCP client (Claude Code,
Claude Desktop, ChatGPT custom GPTs, scripts).
It runs in two modes:
| Mode | When to use | Auth |
|---|---|---|
| stdio (default) | Local clients on the host, or remote clients spawned via SSH | SSH key |
| http | Remote clients reaching the host over the internet via a tunnel: no inbound SSH, no shared keys | `Authorization: Bearer $ANTS_MCP_TOKEN` |

    # 1. Generate a strong token and add it to .env
    echo "ANTS_MCP_TOKEN=$(openssl rand -hex 32)" >> .env
    echo "ANTS_MCP_TRANSPORT=http" >> .env

    # 2. Install launchd agents (macOS): feeders + orchestrator
    ./scripts/launchd/install.sh

    # 3. Confirm the orchestrator is listening locally
    curl -s http://127.0.0.1:7777/health
    # → ok

    # 4. Expose it publicly. Pick one:

    # 4a. Tailscale Funnel: backed by your tailnet
    tailscale funnel --bg --https=443 http://localhost:7777
    tailscale funnel status
    # → https://<machine>.<tailnet>.ts.net is now publicly reachable

    # 4b. ngrok: stable reserved domain, useful when clients can't join the
    #     tailnet (ChatGPT Enterprise, Claude.ai web, etc.)
    #     Sign up at https://dashboard.ngrok.com, reserve a static domain, then:
    echo "NGROK_AUTHTOKEN=<from-dashboard>" >> .env
    echo "NGROK_DOMAIN=<your-subdomain>.ngrok.app" >> .env
    ./scripts/launchd/install.sh --label ngrok
    # → https://<your-subdomain>.ngrok.app/mcp is now publicly reachable

Funnel runs as part of tailscaled, so it persists across reboots without its own launchd agent. Tear it down with `tailscale funnel --https=443 off`.

ngrok runs under its own launchd agent (`KeepAlive=true`). Tear it down with `./scripts/launchd/install.sh --uninstall --label ngrok`. Logs at `~/.ants/logs/ngrok.log`.

Drop this block into your MCP client config:

    {
      "mcpServers": {
        "ants": {
          "url": "https://<machine>.<tailnet>.ts.net/mcp",
          "headers": {
            "Authorization": "Bearer <your ANTS_MCP_TOKEN>"
          }
        }
      }
    }

Smoke test from any laptop:

    curl -s -X POST https://<machine>.<tailnet>.ts.net/mcp \
      -H "Authorization: Bearer $ANTS_MCP_TOKEN" \
      -H "Content-Type: application/json" \
      -H "Accept: application/json, text/event-stream" \
      -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'

To rotate the token:

    # 1. Edit .env on the host, set a new ANTS_MCP_TOKEN
    # 2. Restart the daemon so it picks up the new value
    launchctl kickstart -k gui/$(id -u)/<your-launchd-label>
    # 3. Update every client's config with the new token

- Anyone with the token can call every tool the orchestrator exposes, including triggering feeder runs, ingesting recordings, and writing into the vault. They cannot read arbitrary files outside the vault, and they cannot run shell commands. Treat the token like an API key with full vault write access.
- The orchestrator binds `127.0.0.1` by default. With Funnel, Tailscale's ingress servers relay the still-encrypted connection to the host, where tailscaled terminates TLS and proxies to `localhost:7777`. Without Funnel, only processes on the host can reach `/mcp`.
- `/health` is unauthenticated by design (Funnel and uptime probes need it). It returns no information about the vault or the token.
- Rotate the token regularly, or immediately if a client laptop is lost. There is no per-client revocation: one token, all clients.

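
The access rules in these notes (open `/health`, bearer token everywhere else) can be sketched as a single decision function. This is an illustration under assumptions, not the orchestrator's actual code: `authorize` and `safeEqual` are hypothetical names, and the hand-rolled comparison is only a best-effort stand-in for a constant-time primitive such as Node's `crypto.timingSafeEqual`.

```typescript
type AuthDecision = "allow" | "deny";

// Compare two strings without returning early on the first mismatch.
// Best-effort only; JavaScript gives no hard timing guarantees.
function safeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    // charCodeAt returns NaN past the end of a string; treat that as 0.
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}

function authorize(
  path: string,
  authorization: string | undefined,
  expectedToken: string,
): AuthDecision {
  // /health is intentionally open so tunnels and uptime probes can reach it.
  if (path === "/health") return "allow";
  // Refuse everything if no token is configured, rather than accepting "".
  if (!expectedToken) return "deny";
  if (!authorization) return "deny";
  const match = /^Bearer (.+)$/.exec(authorization);
  if (!match) return "deny";
  return safeEqual(match[1], expectedToken) ? "allow" : "deny";
}
```

Because there is a single shared token, a "deny" here cannot distinguish a revoked client from a mistyped one, which is why rotation is the only revocation mechanism.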
Every page uses GBrain's compiled truth + timeline pattern:
- Compiled truth (above `---`): Current synthesized state. Gets overwritten on each run.
- Timeline (below `---`): Append-only evidence trail. Existing entries are never edited; new entries are prepended, so the newest sits first.

This means a program page always shows the latest IDL state at the top, with a history of all changes below.
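
A minimal sketch of that update rule follows: replace everything above the separator, keep everything below it, and put the new entry first. It assumes the input is the page body with any YAML frontmatter already stripped (frontmatter also uses `---`), and the function names are hypothetical.

```typescript
// Split a page body at the first line that is exactly "---".
// Assumes YAML frontmatter has already been removed from `body`.
function splitPage(body: string): { truth: string; timeline: string } {
  const lines = body.split("\n");
  const idx = lines.indexOf("---");
  if (idx === -1) return { truth: body.trim(), timeline: "" };
  return {
    truth: lines.slice(0, idx).join("\n").trim(),
    timeline: lines.slice(idx + 1).join("\n").trim(),
  };
}

// Overwrite the compiled truth and prepend one timeline entry.
// Existing timeline text is carried over untouched.
function updatePage(body: string, compiledTruth: string, entry: string): string {
  const { timeline } = splitPage(body);
  const entries = timeline ? `${entry.trim()}\n${timeline}` : entry.trim();
  // Blank lines around the separator keep markdown from reading the
  // line above "---" as a setext heading.
  return `${compiledTruth.trim()}\n\n---\n\n${entries}\n`;
}
```

Calling `updatePage` twice on the same page leaves only the second compiled truth at the top, with both timeline entries below it, newest first.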
Skills are reusable agent workflows that live alongside the feeders. The skill system (loader, registry, resolver, runner, MCP tools skill_list / skill_create / skill_validate / skill_reload / skill_resolve) is generic — see skills/AUTHORING.md for how to author one. No skills ship with this repo; bring your own.
The original implementation (ClickHouse-based with custom search) is archived in v1/. v2 and later versions will be shipped to this repo in batches.
MIT — see LICENSE.