Build grounded AI subject-matter experts from multi-source corpora.
Give Peritus a topic. It discovers authoritative sources across the web, has Claude validate and score each one, ingests and embeds the survivors, extracts a concept graph over the content, and generates a named expert persona you can converse with — every answer cited back to the passages it came from.
Peritus is two components:
- `api/` — a Python 3.12 / FastAPI server that runs the build pipeline and the retrieval-augmented chat, streaming progress and tokens over Server-Sent Events.
- `cli/` — a Rust ratatui terminal UI that talks to the server: browse experts, kick off builds with a live log, and chat.
Storage is PostgreSQL + pgvector. There is no external vector database — the corpus, the concept graph, and the chunk embeddings all live in Postgres.
build "stoic philosophy" (tier: lite · standard · pro)
- Plan — Claude turns the topic into a tailored search query for each source fetcher and names the 5–8 core concepts the corpus must cover.
- Discover — every fetcher runs concurrently: Wikipedia, Project Gutenberg, ArXiv, PDFs (Mistral OCR), YouTube transcripts, Exa neural search, general web, Reddit, and curated thought-leaders. High-citation references from discovered ArXiv papers are snowballed in via Semantic Scholar.
- Validate — Claude scores each source for quality and relevance against a versioned rubric; sources below threshold are dropped, with the reason recorded.
- Chunk & embed — survivors are chunked, given Anthropic-style contextual prefixes, and embedded with OpenAI `text-embedding-3-large` (3072-dim).
- Graph extract — Claude reads the chunks in batches and extracts typed concept nodes and relationships (including `contradicts` edges). Semantically duplicate nodes are then merged via embedding similarity.
- Persona — Claude reads a digest of the accepted sources and the top concepts and writes a named expert persona: name, bio, and a concrete speaking/citation style.
Every passed and dropped source — with its quality/relevance scores, validator model, and rubric version — is persisted, so each expert carries a verifiable record of what it was built from.
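The duplicate-merging part of the Graph extract step can be sketched as follows. This is a minimal illustration, not the project's code: the function names, the greedy first-match strategy, and the 0.9 cosine threshold are all assumptions, and a real implementation would more likely compare embeddings inside Postgres with pgvector.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def merge_duplicates(
    nodes: dict[str, list[float]],
    threshold: float = 0.9,  # hypothetical cut-off, not taken from Peritus
) -> dict[str, str]:
    """Map every node name to the canonical node it is merged into.

    Greedy pass: a node joins the first earlier canonical node whose
    embedding is at least `threshold` similar, otherwise it becomes canonical.
    """
    canonical: list[str] = []
    mapping: dict[str, str] = {}
    for name, vector in nodes.items():
        for keep in canonical:
            if cosine(vector, nodes[keep]) >= threshold:
                mapping[name] = keep
                break
        else:
            canonical.append(name)
            mapping[name] = name
    return mapping
```

Edges that pointed at a merged node would then be rewritten to point at its canonical node, which is why the function returns a name-to-name mapping rather than a filtered list.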
Each question is answered through a grounded retrieval loop:
- Plan subqueries from the question.
- Hybrid search every subquery in parallel — semantic (pgvector) fused with keyword (Postgres full-text) via reciprocal-rank fusion, then optionally reranked (Cohere cross-encoder, or a windowed LLM fallback).
- Graph expand the hits with neighbouring concepts and relationships from the concept graph.
- Coverage check — Claude judges whether the retrieved passages answer the question and, if not, suggests follow-up queries for a second retrieval pass.
- Compose — the deduplicated passages are numbered and handed to Claude under a strict grounding contract: answer only from the passages, cite every claim with its `[n]`.
- Stream the answer token-by-token, then resolve the citation list down to only the passages the answer actually cited.
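Two steps of the loop above are easy to show in isolation: reciprocal-rank fusion of the semantic and keyword rankings, and resolving the citation list from the `[n]` markers in the finished answer. The sketch below is illustrative only. The function names are hypothetical, and `k = 60` is the constant commonly used for RRF, not a value confirmed for Peritus.

```python
import re


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of chunk ids with reciprocal-rank fusion.

    Each list contributes 1 / (k + rank) for every id it contains
    (rank starts at 1); ids are returned by descending total score,
    with ties broken by id so the output is deterministic.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def cited_passages(answer: str, passages: list[str]) -> dict[int, str]:
    """Keep only the numbered passages that the answer actually cites.

    Passages are numbered from 1, matching the [n] markers in the answer;
    markers that point outside the passage list are ignored.
    """
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    return {n: passages[n - 1] for n in sorted(cited) if 1 <= n <= len(passages)}
```

RRF only needs ranks, not scores, so it can combine pgvector distances and full-text ranks without normalising them onto a common scale.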
A tier sets the depth/cost trade-off for both build and chat (api/.../experts/domain.py):
| Tier | Sources | Subqueries | Graph hops | Context passages | Response tokens |
|---|---|---|---|---|---|
| lite | ~10 | 2 | 1 | 8 | 1024 |
| standard | ~20 | 4 | 1 | 15 | 2048 |
| pro | ~40 | 6 | 2 | 25 | 4096 |
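One way to picture the tier table in code is a small frozen dataclass per tier. The numbers below are copied from the table; the class, field, and function names are hypothetical and may not match what `domain.py` actually defines. The source counts are approximate targets (the table marks them with `~`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """Depth/cost settings for one tier (field names are illustrative)."""

    name: str
    target_sources: int       # approximate number of sources to keep
    subqueries: int           # subqueries planned per question
    graph_hops: int           # concept-graph expansion depth
    context_passages: int     # passages handed to the model
    max_response_tokens: int  # cap on the generated answer


TIERS: dict[str, Tier] = {
    "lite": Tier("lite", 10, 2, 1, 8, 1024),
    "standard": Tier("standard", 20, 4, 1, 15, 2048),
    "pro": Tier("pro", 40, 6, 2, 25, 4096),
}


def get_tier(name: str) -> Tier:
    """Look up a tier by name, failing loudly on an unknown one."""
    try:
        return TIERS[name]
    except KeyError:
        raise ValueError(
            f"unknown tier {name!r}; expected one of {sorted(TIERS)}"
        ) from None
```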
- Python 3.12+
- Rust (stable) — only to build the TUI client
- PostgreSQL with the `pgvector` extension
- `ANTHROPIC_API_KEY` — validation, graph extraction, persona, chat
- `OPENAI_API_KEY` — embeddings
- Optional: `EXA_API_KEY` (Exa neural search + YouTube discovery), `MISTRAL_API_KEY` (PDF OCR), `COHERE_API_KEY` (cross-encoder reranking)
# 1. Configure
cp api/.env.example api/.env # then fill in DATABASE_URL + API keys
# 2. Install the API and apply migrations
cd api
pip install -e .
python migrations/apply.py
# 3. Build the Rust TUI client
cd ../cli
cargo build --release

Key environment variables (api/src/peritus/core/config.py):
| Variable | Purpose | Default |
|---|---|---|
| `DATABASE_URL` | Postgres connection string | — |
| `ANTHROPIC_API_KEY` | Claude (validation, graph, persona, chat) | — |
| `OPENAI_API_KEY` | Embeddings | — |
| `CLAUDE_MODEL` | Chat + persona model | `claude-sonnet-4-6` |
| `FAST_MODEL` | Planning, contextualisation, coverage, validation | `claude-haiku-4-5-20251001` |
| `GRAPH_MODEL` | Graph extraction | `claude-haiku-4-5-20251001` |
| `EMBED_MODEL` / `EMBED_DIM` | OpenAI embedding model / dimension | `text-embedding-3-large` / `3072` |
| `EXA_API_KEY` | Exa + YouTube discovery (optional) | — |
| `MISTRAL_API_KEY` | PDF OCR (optional) | — |
| `COHERE_API_KEY` | Cross-encoder reranking (optional) | — |
| `SUPABASE_URL` | Supabase project URL; set it to require login | — |
| `SUPABASE_ANON_KEY` | Anon/publishable key (server-side only, for OTP proxy) | — |
| `SUPABASE_JWT_SECRET` | Legacy HS256 secret (only if not on JWKS signing keys) | — |
| `BOOTSTRAP_ADMIN_EMAIL` | Admin email — sees pre-auth (owner-less) experts | — |
| `PERITUS_ENV` | `production` refuses to start with auth disabled (fail-closed) | `development` |
| `AUTH_ALLOW_SIGNUP` | `false` = invite-only (unknown emails can't self-provision) | `true` |
| `AUTH_RATE_LIMIT` / `AUTH_RATE_WINDOW` | Per-IP cap on `/auth/otp` + `/auth/verify` (requests / seconds) | `10` / `60` |
| `CORS_ALLOW_ORIGINS` | Comma-separated browser origins allowed by CORS | `http://localhost:3000,http://localhost:8000` |
| `PERITUS_API_KEY_HASH` | SHA-256 of a legacy static API key (superseded by Supabase auth) | — |
The repo ships a Justfile with the common commands:
just dev-solo # API server + in-process build worker (single-process local dev)
just dev # API server only (uvicorn, :8000, --reload)
just worker # standalone build worker (production shape: run beside `just dev`)
just migrate # apply database migrations
just test # pytest
just lint # ruff + mypy
just build-cli # cargo build --release
just run-cli # cargo run (the TUI)
just docker-up # docker compose up --build -d (api + worker services)
just docker-down

Builds execute in a durable Postgres-backed job queue, so something must run a worker: either `just dev-solo` (worker inside the API process) or `just dev` plus `just worker` as two processes.
Typical flow: start the server (just dev-solo), then launch the TUI (just run-cli). On first run the TUI shows a config screen — point it at the server URL (default http://localhost:8000). If the server has auth enabled, the TUI then shows a sign-in screen (see below). From the home screen you can create an expert (topic + tier) and watch the build log live, then open it to chat.
Peritus uses Supabase Auth. Users sign in with an email one-time code — no passwords, no browser, works entirely in the terminal — and every expert is owned by the user who built it. Each user sees and chats with only their own experts.
How it fits together. The API is a backend-for-frontend: it holds the Supabase anon key and proxies the sign-in calls, so clients only ever handle the resulting session tokens. Access tokens (JWTs) are verified locally against the project's JWKS endpoint (asymmetric ES256/RS256 signing keys, the current Supabase default), falling back to the legacy HS256 shared secret if that's all the project has. See api/src/peritus/api/auth.py.
Enable it by setting SUPABASE_URL, SUPABASE_ANON_KEY, and BOOTSTRAP_ADMIN_EMAIL (plus SUPABASE_JWT_SECRET only if the project hasn't migrated to signing keys). With none of these set, the server runs in open dev mode — no login required, and requests act as the admin. In production, set PERITUS_ENV=production: the server then refuses to start with auth disabled, so a missing SUPABASE_URL can never silently drop every request into admin mode.
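The fail-closed behaviour described above amounts to a startup check along these lines. This is a sketch under assumptions: it treats a non-empty `SUPABASE_URL` as the signal that auth is enabled (the variable table says setting it requires login), and the function names and return values are hypothetical. How the real server treats the legacy static key in this check is not stated here.

```python
from collections.abc import Mapping


def auth_enabled(env: Mapping[str, str]) -> bool:
    """Assumption: auth counts as enabled when SUPABASE_URL is non-empty."""
    return bool(env.get("SUPABASE_URL", "").strip())


def check_startup(env: Mapping[str, str]) -> str:
    """Return the mode the server would run in, or refuse to start.

    'auth'     -> login required
    'open-dev' -> no login, requests act as the admin (development only)
    """
    if auth_enabled(env):
        return "auth"
    if env.get("PERITUS_ENV", "development") == "production":
        # Fail closed: never fall back to open admin mode in production.
        raise RuntimeError(
            "PERITUS_ENV=production but auth is disabled; set SUPABASE_URL"
        )
    return "open-dev"
```

In a real process the argument would be `os.environ`; taking a mapping keeps the check easy to test.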
One-time Supabase setup: in the dashboard, set the Magic Link email template to send a code — include {{ .Token }} in the template (a 6-digit code) rather than only a magic link, since the TUI verifies the code directly.
Open vs. invite-only. By default anyone who can reach the server can sign in and gets their own private workspace (AUTH_ALLOW_SIGNUP=true). To run Peritus as an invite-only workspace, set AUTH_ALLOW_SIGNUP=false and add users from the Supabase dashboard — unknown emails then can't self-provision. The /auth/otp and /auth/verify endpoints are also per-IP rate-limited (AUTH_RATE_LIMIT / AUTH_RATE_WINDOW).
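A per-IP cap of `AUTH_RATE_LIMIT` requests per `AUTH_RATE_WINDOW` seconds could be enforced with a sliding window like the one below. This is only a sketch: the README does not say whether Peritus uses a sliding or a fixed window, and the class and method names are made up for illustration. The defaults mirror the documented `10 / 60`.

```python
from collections import defaultdict, deque


class RateLimiter:
    """In-memory sliding-window limiter keyed by client IP (illustrative)."""

    def __init__(self, limit: int = 10, window: float = 60.0) -> None:
        self.limit = limit    # max requests per window
        self.window = window  # window length in seconds
        self._hits: dict[str, deque[float]] = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        """Record a request at time `now`; return False if the IP is over its cap."""
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

The caller passes the clock value (for example `time.monotonic()`) so the limiter stays deterministic under test. State held in process memory like this is per-process; several API processes would each keep their own counts.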
Sign in from the TUI: enter your email → receive a code by email → enter the code. The session (access + rotating refresh token) is saved to the client config with 0600 permissions and refreshed automatically; press L on the home screen to sign out. Signing out (TUI L or peritus logout) revokes the session server-side, so the refresh token can't be reused.
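Saving a session so that only the owning user can read it looks roughly like this on a POSIX system. The TUI is written in Rust, so this Python version is only an illustration of the idea; the JSON layout, field names, and function name are assumptions.

```python
import json
import os
from pathlib import Path


def save_session(path: Path, session: dict[str, str]) -> None:
    """Write the session as JSON with owner-only (0600) permissions."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with mode 0600 so it is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        # The mode passed to os.open only applies on creation, so tighten
        # permissions explicitly in case the file already existed.
        os.chmod(path, 0o600)
        json.dump(session, handle)
```

Calling `save_session(Path.home() / ".config" / "peritus" / "config.json", {...})` would be one plausible use; that path is hypothetical, not the client's documented location.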
Sign in from the Python CLI:
peritus login # prompts for email, then the 6-digit code
peritus whoami # show the signed-in user
peritus logout # revoke the session server-side + clear the local cache

Experts built with `peritus build` are owned by the signed-in user; experts that predate auth (owner-less rows) are visible to `BOOTSTRAP_ADMIN_EMAIL`.
A legacy static key (`PERITUS_API_KEY_HASH` + `Authorization: Bearer <key>`) is still accepted as a fallback credential, but Supabase login is the recommended path.
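If you still rely on the legacy key, a value for `PERITUS_API_KEY_HASH` can be produced with the standard library. The sketch assumes the server expects a lowercase hex digest of the UTF-8 key; the README only says "SHA-256", so check `config.py` before relying on the encoding. Both function names are hypothetical.

```python
import hashlib
import hmac


def hash_api_key(key: str) -> str:
    """SHA-256 of the key as a lowercase hex string (encoding is an assumption)."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def key_matches(presented: str, expected_hash: str) -> bool:
    """Compare a presented key against the stored hash in constant time."""
    return hmac.compare_digest(hash_api_key(presented), expected_hash.lower())
```

Storing only the hash means the server never needs the plaintext key on disk, and `hmac.compare_digest` avoids leaking how many leading characters matched.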
When auth is enabled, expert endpoints require a Supabase access token via Authorization: Bearer <token>; the /auth/* endpoints are public (they are the login flow).
| Method | Path | Description |
|---|---|---|
| `GET` | `/health`, `/ready` | Liveness / DB readiness |
| `GET` | `/auth/status` | Whether this server requires login |
| `POST` | `/auth/otp` | Send an email one-time code |
| `POST` | `/auth/verify` | Exchange a code for a session |
| `POST` | `/auth/refresh` | Rotate a refresh token for a new session |
| `POST` | `/auth/logout` | Revoke the caller's session (refresh tokens) |
| `GET` | `/auth/me` | The current authenticated user |
| `GET` | `/experts` | List the caller's experts |
| `GET` | `/experts/{slug}` | Expert detail (sources, counts, persona) |
| `POST` | `/experts/build` | Build an expert — SSE stream of progress |
| `GET` | `/experts/{slug}/build/events?after=N` | Reconnect to a build's progress from a cursor — SSE |
| `GET` | `/experts/{slug}/build/status` | Point-in-time build job status |
| `POST` | `/experts/{slug}/build/cancel` | Cancel the active build |
| `DELETE` | `/experts/{slug}` | Delete an expert (cancels any in-flight build) |
| `POST` | `/experts/{slug}/chat` | Ask a question — SSE stream of tokens + citations |
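The build and chat endpoints stream Server-Sent Events, so any client needs a small parser for the wire format. The sketch below follows the general SSE framing (`event:` and `data:` fields, a blank line ending each event, lines starting with `:` as comments). It says nothing about the event names or payloads Peritus actually emits; the `progress` name used in the usage note is hypothetical.

```python
from collections.abc import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (event_name, data) pairs from an iterable of SSE text lines.

    Events without an explicit `event:` field use the default name
    'message'; multiple `data:` lines are joined with newlines.
    """
    event = "message"
    data: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # Blank line: dispatch the buffered event, if it has any data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment / keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
```

Fed the lines `event: progress`, `data: {"stage": "discover"}`, and a blank line, the parser yields one `("progress", '{"stage": "discover"}')` pair. With an HTTP client that exposes the response body line by line, the same function can consume a live stream.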
api/
  src/peritus/
    api/             FastAPI app, routes (incl. /auth), schemas, JWT verification
    cli/             Python CLI (build/chat + login/logout/whoami)
    experts/         build pipeline coordinator, tiers, repository (owner-scoped)
    sources/         fetchers (wikipedia, arxiv, exa, web, …) + Claude validator
    ingestion/       chunking, contextualisation, embed pipeline
    graph/           concept-graph extraction, storage, retrieval
    search/          hybrid semantic + keyword search service
    chat/            grounded chat agent, grounding contract, faithfulness
    eval/            offline golden-set harness + retrieval/answer metrics
    infrastructure/  Postgres pool, embeddings, reranker, Anthropic client, PDF OCR
  migrations/        SQL migrations + apply.py
cli/
  src/
    api/             HTTP + SSE client
    tui/             ratatui screens (home, build, chat, config, login) and widgets
    config/          on-disk client config (server URL + saved session)