From 98af64ca18e750623ad950d7daa573065398f9d8 Mon Sep 17 00:00:00 2001
From: Jeremy Drouillard <jeremy@stacklok.com>
Date: Fri, 17 Apr 2026 14:34:01 -0700
Subject: [PATCH 1/4] RFC-0058: thv llm subcommand for LLM gateway
 authentication

Adds RFC proposing a `thv llm` command group that bridges AI coding tools
to OIDC-protected LLM gateways via a localhost reverse proxy and token
helper, with auto-wiring of client applications.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
 rfcs/THV-0058-thv-llm-subcommand.md | 658 ++++++++++++++++++++++++++++
 1 file changed, 658 insertions(+)
 create mode 100644 rfcs/THV-0058-thv-llm-subcommand.md

diff --git a/rfcs/THV-0058-thv-llm-subcommand.md b/rfcs/THV-0058-thv-llm-subcommand.md
new file mode 100644
index 0000000..66e6775
--- /dev/null
+++ b/rfcs/THV-0058-thv-llm-subcommand.md
@@ -0,0 +1,658 @@
+# RFC-0058: `thv llm` Subcommand for LLM Gateway Authentication
+
+- **Status**: Draft
+- **Author(s)**: Jeremy Drouillard (@jerm-dro)
+- **Created**: 2026-04-17
+- **Last Updated**: 2026-04-17
+- **Target Repository**: toolhive
+- **Related Issues**: N/A
+
+## Summary
+
+ToolHive gains a `thv llm` command group that bridges AI coding tools to
+OIDC-protected LLM gateways. Two authentication modes cover the full spectrum of
+tools: a localhost reverse proxy for tools that only accept static API keys, and a
+token helper for tools with native OIDC support. A single `thv llm setup` command
+detects installed tools, configures them, and handles the OIDC login — mirroring
+ToolHive's existing auto-wiring of MCP tools into client applications.
+
+## Problem Statement
+
+AI coding tools need API keys or tokens to make LLM requests. In enterprise
+environments, these requests route through a gateway that authenticates developers
+via OIDC. This creates a compatibility gap:
+
+- **OIDC-capable tools** (Claude Code, Codex CLI, avante.nvim) support dynamic
+  token commands — they can invoke an external program to fetch a fresh token before
+  each request. These tools just need a token helper wired in and pointed at the
+  gateway.
+- **Static-key-only tools** (Roo Code, Cline, Cursor, Continue, Aider, Zed) accept
+  a single API key and base URL at configuration time. They cannot perform OAuth
+  flows, store refresh tokens, or handle token expiry. When the token expires, the
+  tool breaks silently.
+
+Today there is no standard way for developers to bridge this gap. They resort to
+manual token copy-paste, custom scripts, or giving up on enterprise gateways
+entirely.
+
+ToolHive already auto-wires MCP tools into client applications — detecting installed
+clients, writing their configuration, and managing the connection lifecycle.
+Extending this to LLM gateway access is a natural next step. ToolHive also has OIDC
+and OAuth infrastructure, a secrets provider for credential storage, and a CLI that
+developers already use. It is the natural home for solving this problem.
+
+## Goals
+
+- Transparent OIDC authentication for static-key-only tools via a localhost reverse
+  proxy that injects fresh tokens on every request.
+- A token helper for OIDC-capable tools that prints a fresh access token to stdout,
+  suitable for use as an `apiKeyHelper` or `auth.command`.
+- Auto-detect installed AI coding tools and configure them to route through the
+  proxy or directly to the gateway in a single command.
+- A shared OIDC session across both modes — one browser login covers all tools.
+- Upstream-agnostic design that works with any LLM gateway accepting OIDC JWTs.
+
+## Non-Goals
+
+- Backend gateway implementation details — this RFC covers the client-side CLI only.
+  The upstream gateway is treated as an opaque HTTPS endpoint that accepts OIDC
+  bearer tokens.
+- Formal daemon lifecycle management — `proxy stop`, `proxy status`, auto-restart,
+  process supervision. The proxy runs in the background when started by `setup` and
+  is stopped by `teardown`, but richer daemon controls are future work.
+- Exhaustive tool coverage — Claude Code and Cursor are explored in this RFC to
+  illustrate the two authentication modes (direct token helper and proxy
+  respectively). Support for additional tools will be evaluated on a case-by-case
+  basis.
+
+## Proposed Solution
+
+### User Experience
+
+The primary entry point is `thv llm setup`. A developer configuring LLM gateway
+access for the first time runs:
+
+```
+$ thv llm setup \
+    --gateway-url https://llm.example.com \
+    --issuer https://auth.example.com \
+    --client-id my-client-id
+```
+
+This single command:
+
+1. **Saves the gateway connection config** — gateway URL, OIDC issuer, client ID,
+   and defaults are persisted in ToolHive's config file.
+2. **Detects installed AI coding tools** — scans for known tool config directories
+   (e.g., `~/.claude/`, `~/.cursor/`).
+3. **Configures each detected tool**:
+   - For OIDC-capable tools (Claude Code): writes a token helper that calls
+     `thv llm token`, sets the gateway as the base URL.
+   - For static-key-only tools (Cursor): sets the proxy's localhost address as the
+     base URL with a placeholder API key.
+4. **Starts the localhost proxy** in the background (if any static-key tool was
+   configured).
+5. **Triggers the OIDC browser login** — the developer authenticates once, and the
+   refresh token is stored securely for future sessions.
+
+After setup, all configured tools work immediately. Token refresh is automatic and
+transparent.
+
+To reverse the configuration:
+
+```
+$ thv llm teardown
+```
+
+This restores each tool's original configuration from backups, stops the background
+proxy if running, and optionally purges cached OIDC tokens with `--purge-tokens`.
+Teardown can also target specific tools: `thv llm teardown cursor`.
+
+### Command Structure
+
+```
+thv llm
+├── setup          # Detect tools, configure them, start proxy, trigger OIDC login
+├── teardown       # Reverse setup — restore configs, stop proxy
+├── config
+│   ├── set        # Manually adjust gateway URL, OIDC settings, ports
+│   ├── show       # Display current LLM config (text or JSON)
+│   └── reset      # Clear all LLM config and cached tokens
+├── proxy
+│   └── start      # Start the proxy in the foreground (for debugging)
+└── token          # (hidden) Print fresh access token to stdout
+```
+
+**Why each command exists:**
+
+- **`setup` / `teardown`** are the primary commands most users interact with. They
+  handle the full lifecycle.
+- **`config set/show/reset`** give advanced users manual control over gateway
+  connection settings without re-running setup. Useful for changing the gateway URL,
+  adjusting OIDC scopes, or switching environments.
+- **`proxy start`** runs the proxy in the foreground with full log output. This is a
+  debugging tool — when something isn't working, running the proxy interactively
+  lets the developer see request/response flow and token acquisition in real time.
+- **`token`** is a hidden subcommand. It is not intended to be called by users
+  directly — it exists as the target for tool-specific token helpers
+  (`apiKeyHelper`, `auth.command`). It prints a single JWT to stdout and exits.
+
+### High-Level Design
+
+```mermaid
+flowchart TD
+    subgraph "thv llm setup"
+        S[Setup] --> D[Detect installed tools]
+        D --> CD[Configure Direct tools<br/>Claude Code]
+        D --> CP[Configure Proxy tools<br/>Cursor]
+        CD --> HS[Write helper script<br/>→ thv llm token]
+        CP --> SP[Start background proxy]
+        S --> OL[Trigger OIDC login]
+    end
+
+    subgraph "Token Source"
+        TS[TokenSource]
+        TS --> |1| MC[In-memory cache]
+        TS --> |2| SR[Secrets provider<br/>restore refresh token]
+        TS --> |3| BF[Browser OIDC+PKCE flow]
+    end
+
+    subgraph "Direct Mode — OIDC-capable tools"
+        CC[Claude Code] --> |apiKeyHelper| TH[thv llm token]
+        TH --> TS
+        TH --> |JWT to stdout| CC
+        CC --> |Authorization: Bearer JWT| GW[LLM Gateway]
+    end
+
+    subgraph "Proxy Mode — Static-key tools"
+        CU[Cursor] --> |static key + request| PX[Localhost Proxy :14000]
+        PX --> |strip auth| PX
+        PX --> TS
+        PX --> |inject Bearer JWT| GW
+    end
+```
+
+Both modes share the same `TokenSource` and OIDC session. A single browser login
+covers all configured tools.
+
+### Token Source
+
+The token source manages OIDC token acquisition and refresh with a three-tier
+strategy:
+
+1. **In-memory cache** — If a valid (non-expired) access token exists in memory,
+   return it immediately. No I/O, no network.
+2. **Secrets provider restore** — If no in-memory token exists, attempt to load the
+   refresh token from the secrets provider (OS keyring or encrypted file, under the
+   `ScopeLLM` prefix). Use it to obtain a fresh access token from the OIDC provider.
+3. **Browser OIDC+PKCE flow** — If no refresh token is available (first login or
+   revoked), launch an authorization code flow with PKCE (S256) via the system
+   browser. The user authenticates, and both access and refresh tokens are returned.
+
+After any successful token acquisition:
+
+- The access token is cached in memory only — never written to disk.
+- The refresh token is stored in the secrets provider for future sessions.
+- The `oauth2.TokenSource` wrapper handles automatic refresh before expiry.
+
+**Preemptive refresh:** The token source refreshes proactively when the access token
+is within 30 seconds of expiry, avoiding request failures at the boundary.
+
+**Non-interactive mode:** The `thv llm token` command runs non-interactively — it
+will not launch a browser flow. If no cached or restorable token is available, it
+returns an error. The browser flow is only triggered during `setup` or `proxy start`
+(interactive commands).
+
+### Proxy
+
+The localhost reverse proxy accepts unauthenticated HTTP requests from static-key
+tools and forwards them to the upstream gateway with a fresh OIDC token injected.
+
+**Request flow:**
+
+1. AI tool sends request to `http://localhost:14000/v1/chat/completions` with a
+   placeholder API key.
+2. Proxy strips the incoming `Authorization` header.
+3. Proxy calls `TokenSource.Token()` to get a fresh access token.
+4. Proxy sets `Authorization: Bearer <access_token>` on the outgoing request.
+5. Proxy forwards to the upstream gateway configured at startup (the destination is
+   fixed in the config, never derived from the incoming request). The original
+   request path and query string are appended to the configured gateway URL.
+6. Response streams back to the tool — SSE streaming is preserved without buffering
+   (`FlushInterval: -1`).
+
+**Key properties:**
+
+- **Loopback-only binding** — The proxy validates at startup that the listen address
+  resolves to a loopback interface. Binding to `0.0.0.0` or non-loopback addresses
+  is rejected.
+- **Health endpoint** — `GET /healthz` returns `{"status":"ok","upstream":"..."}` for
+  liveness checks.
+- **OpenAI-compatible errors** — Token acquisition failures return HTTP 401 with an
+  error body matching the OpenAI API schema, so tools display meaningful messages.
+- **TLS skip verify** — An optional `--tls-skip-verify` flag for development
+  environments with self-signed certificates. Disabled by default.
+- **SSE streaming** — The proxy uses `httputil.ReverseProxy` with `FlushInterval: -1`
+  to stream Server-Sent Events without buffering, which is critical for chat
+  completion responses.
+
+### Token Helper
+
+`thv llm token` is a hidden subcommand that prints a single fresh JWT to stdout and
+exits. It is the integration point for OIDC-capable tools:
+
+- **Claude Code**: configured via `apiKeyHelper` in `~/.claude/settings.json` — Claude
+  Code calls the helper before each API request.
+- **Codex CLI**: configured via `auth.command` — same pattern.
+- **avante.nvim**: configured via a `cmd:` prefix in the API key field.
+
+The command runs non-interactively: it will use cached or refreshed tokens but will
+not launch a browser flow. During execution, stdout is reserved exclusively for the
+JWT — all other output (update checks, warnings, progress) is redirected to stderr
+to prevent corrupting the token value.
+
+### Setup and Teardown
+
+#### Tool Detection
+
+Setup scans for installed AI coding tools by checking for known configuration
+directories and files. Each supported tool defines:
+
+- **Detect** — How to determine if the tool is installed (e.g., does
+  `~/.claude/settings.json` or `~/.claude/` exist?).
+- **Apply** — How to configure the tool to use the gateway (what config keys to set,
+  what values).
+- **Revert** — How to restore the tool's original configuration on teardown.
+
+#### Two Tool Kinds
+
+**Direct (OIDC-capable tools like Claude Code):**
+
+- A helper script is generated at `~/.config/toolhive/bin/thv-llm-token` that
+  invokes `thv llm token`.
+- The tool's config is patched to set the helper script as its token command
+  (`apiKeyHelper` for Claude Code) and the gateway URL as its base URL
+  (`ANTHROPIC_BASE_URL`).
+- Requests go directly from the tool to the gateway — no proxy needed.
+
+**Proxy (static-key-only tools like Cursor):**
+
+- The tool's config is patched to set the localhost proxy address
+  (`http://localhost:14000/v1`) as its base URL and a placeholder key
+  (`sk-toolhive-proxy`) as its API key.
+- The background proxy must be running for these tools to function.
+
+#### Helper Script
+
+OIDC-capable tools like Claude Code don't call `thv llm token` directly — they
+invoke a helper script that wraps it. The indirection is needed because these tools
+expect a path to an executable in their config (e.g., `apiKeyHelper` in Claude
+Code's `settings.json`). The helper script provides a stable path that setup writes
+once; if the `thv` binary moves or the invocation needs to change, only the script
+is updated — no re-patching of tool configs required.
+
+Setup generates the script at `~/.config/toolhive/bin/thv-llm-token`:
+
+```bash
+#!/bin/bash
+exec thv llm token
+```
+
+#### Config Patching
+
+Tool configuration files (JSON, JSONC) are modified in place using ToolHive's
+existing config editing infrastructure (`pkg/client/config_editor.go`). This uses
+hujson for lenient JSON parsing (preserves comments and formatting) and RFC 6902
+JSON Patch operations for adds and removes. All writes are atomic (write to temp
+file, then rename) and protected by dual-layer file locking (in-process mutex +
+OS-level advisory lock).
+
+This is the same mechanism ToolHive uses to auto-wire MCP servers into client
+applications — no new patching approach is introduced.
+
+#### Config Modification and Removal
+
+Following ToolHive's existing convention for MCP auto-wiring, client config files
+are modified in place without backups. Setup applies RFC 6902 "add" patches to set
+the relevant keys, and teardown applies "remove" patches to undo them. This is
+consistent with how `pkg/client/` handles MCP server registration and removal
+today.
+
+#### Background Proxy Lifecycle
+
+When setup configures at least one proxy-mode tool, it starts the proxy as a
+background process:
+
+- The proxy is launched via `os/exec` as a detached process.
+- The PID is written to `~/.config/toolhive/llm-proxy.pid`.
+- Teardown sends SIGTERM to the process when no proxy-mode tools remain.
+
+### Adding New Tool Support
+
+Supporting a new AI coding tool requires:
+
+1. **Define detection** — How to check if the tool is installed (config directory,
+   binary path, etc.).
+2. **Define apply** — What config keys to set and what values (base URL, API key or
+   token helper path).
+3. **Define revert** — How to undo the apply step (remove the keys that were added).
+4. **Register the tool** — Add it to the tool registry so setup discovers it.
+
+No changes to the setup/teardown orchestration, proxy, token source, or CLI commands
+are needed. The orchestration iterates over all registered tools and calls their
+Detect/Apply/Revert functions uniformly.
+
+### Configuration Model
+
+LLM gateway settings are stored in ToolHive's existing config file
+(`~/.config/toolhive/config.yaml`) under a new `llm:` key, persisted via the same
+`UpdateConfig()` mechanism used by runtime configs and registry auth — atomic updates
+with file locking. The Go types live in `pkg/llm/config.go` to keep the LLM domain
+self-contained; `pkg/config/config.go` references them via a field on the top-level
+`Config` struct.
+
+```yaml
+llm:
+  gateway_url: https://llm.example.com
+  oidc:
+    issuer: https://auth.example.com
+    client_id: my-client-id
+    scopes:
+      - openid
+      - offline_access
+    audience: ""             # optional
+    callback_port: 0         # 0 = ephemeral (auto-assigned)
+  proxy:
+    listen_port: 14000
+  auth:
+    cached_refresh_token_ref: "__thv_llm_llm-refresh-token"
+    cached_token_expiry: "2026-04-17T12:00:00Z"
+  configured_tools:
+    - tool: claude-code
+      mode: direct
+      config_path: /home/user/.claude/settings.json
+    - tool: cursor
+      mode: proxy
+      config_path: /home/user/.cursor/settings.json
+```
+
+**Go types:**
+
+```go
+type LLMConfig struct {
+    GatewayURL      string          `yaml:"gateway_url"`
+    OIDC            LLMOIDCConfig   `yaml:"oidc"`
+    Proxy           LLMProxyConfig  `yaml:"proxy"`
+    Auth            LLMAuthState    `yaml:"auth"`
+    ConfiguredTools []LLMToolConfig `yaml:"configured_tools"`
+}
+
+type LLMOIDCConfig struct {
+    Issuer       string   `yaml:"issuer"`
+    ClientID     string   `yaml:"client_id"`
+    Scopes       []string `yaml:"scopes"`
+    Audience     string   `yaml:"audience,omitempty"`
+    CallbackPort int      `yaml:"callback_port,omitempty"`
+}
+
+type LLMProxyConfig struct {
+    ListenPort int `yaml:"listen_port"`
+}
+
+type LLMAuthState struct {
+    CachedRefreshTokenRef string    `yaml:"cached_refresh_token_ref"`
+    CachedTokenExpiry     time.Time `yaml:"cached_token_expiry"`
+}
+
+type LLMToolConfig struct {
+    Tool       string `yaml:"tool"`
+    Mode       string `yaml:"mode"`
+    ConfigPath string `yaml:"config_path"`
+}
+```
+
+`IsConfigured()` validates that the minimum required fields (gateway URL, issuer,
+client ID) are present. `Validate()` performs full validation including HTTPS
+enforcement, port range checks, and OIDC field requirements.
+
+### Package Layout
+
+```
+pkg/llm/
+├── doc.go                        # Package documentation
+├── llm.go                        # Public API surface (Setup, Teardown, NewProxy, NewTokenSource)
+├── config.go                     # LLMConfig, LLMOIDCConfig, LLMProxyConfig, LLMAuthState, LLMToolConfig
+└── internal/
+    ├── errors.go                 # OpenAI-compatible error responses
+    ├── errors_test.go
+    ├── proxy.go                  # Reverse proxy with token injection
+    ├── proxy_test.go
+    ├── setup.go                  # Tool registry, detection, apply/revert
+    ├── setup_test.go
+    ├── token_source.go           # 3-tier OIDC token acquisition
+    ├── token_source_test.go
+    └── validate.go               # Loopback address validation
+
+pkg/secrets/
+└── scoped.go                     # ScopeLLM constant (addition to existing file)
+
+cmd/thv/app/
+└── llm.go                        # CLI commands: setup, teardown, config, proxy start, token
+```
+
+Config types (`LLMConfig`, `LLMOIDCConfig`, etc.) live in `pkg/llm/config.go` rather
+than `pkg/config/` — this keeps the LLM domain self-contained and avoids duplicative
+types. Tests in `internal/` use these config types directly. The `internal/` package
+contains the implementation details — proxy, token source, tool registry, validation,
+and error formatting. The parent `pkg/llm/` exposes only the minimal public API
+needed by `cmd/thv/app/llm.go`. Most tests live in `internal/` alongside the code
+they exercise.
+
+## Security Considerations
+
+### Threat Model
+
+The primary threat is unauthorized access to the LLM gateway via stolen tokens or
+proxy misuse. The attack surface is:
+
+- **Local process access** — Any process on the developer's machine can reach the
+  localhost proxy.
+- **Token extraction** — If an attacker gains access to the secrets provider, they
+  could extract the refresh token.
+- **Network exposure** — If the proxy were bound to a non-loopback address, remote
+  hosts could use it as an authentication oracle.
+
+### Authentication and Authorization
+
+- OIDC Authorization Code flow with PKCE (S256) is used for all authentication.
+  This is the recommended flow for public clients (no client secret).
+- No client secret is required or stored.
+- The proxy does not authenticate incoming requests — it relies on the localhost
+  trust model (only local processes can reach it).
+
+### Data Security
+
+- **Access tokens** are held in memory only and never written to disk or logged.
+- **Refresh tokens** are stored in the OS keyring (macOS Keychain, Linux
+  Secret Service) via the secrets provider, falling back to AES-256-GCM encrypted
+  file storage. They are stored under the `ScopeLLM` prefix (`__thv_llm_`),
+  isolated from user-managed secrets.
+- **Token values are never logged** — log messages reference token metadata (expiry,
+  scope) but never the token string itself.
+
+### Input Validation
+
+- Gateway URL must use HTTPS (enforced by `Validate()`).
+- Listen address must resolve to a loopback interface — `ValidateLoopbackAddress()`
+  rejects `0.0.0.0`, non-loopback IPs, and hostnames that resolve to non-loopback
+  addresses.
+- OIDC configuration fields (issuer, client ID) are validated as non-empty and
+  well-formed.
+
+### Secrets Management
+
+- Refresh tokens are stored via ToolHive's existing secrets provider under the
+  `ScopeLLM` scope, leveraging the scoped secret store (RFC-0056).
+- Tokens are revocable — `thv llm config reset` and `thv llm teardown --purge-tokens`
+  delete cached tokens from the secrets provider.
+- The `thv llm token` command redirects all non-token output to stderr, preventing
+  accidental token leakage into log files or terminal scrollback.
+
+### Mitigations
+
+| Threat | Mitigation |
+|--------|------------|
+| Proxy accessible from network | Loopback-only binding enforced at startup |
+| Token theft from disk | Access tokens memory-only; refresh tokens in OS keyring |
+| Token leakage in logs | Token values never logged; stderr isolation in token helper |
+| Stale tokens after offboarding | `teardown --purge-tokens` clears all cached credentials |
+| Man-in-the-middle on gateway | HTTPS enforced for gateway URL; TLS verification on by default |
+
+## Alternatives Considered
+
+### Alternative 1: Standalone Binary
+
+Build the proxy and token helper as a separate binary, independent of ToolHive.
+
+- **Pros**: No dependency on ToolHive; simpler distribution for non-ToolHive users.
+- **Cons**: Duplicates ToolHive's OIDC, secrets provider, and config infrastructure.
+  Two binaries to install and update. Cannot leverage ToolHive's existing client
+  auto-wiring patterns.
+- **Why not chosen**: ToolHive already has every building block needed. A standalone
+  binary would mean maintaining parallel implementations of OAuth flows, secret
+  storage, and config management.
+
+### Alternative 2: Manual Configuration
+
+Do nothing — let developers manually configure each tool's base URL and API key to
+point at the gateway, and handle token refresh themselves.
+
+- **Pros**: No new code to write or maintain.
+- **Cons**: Error-prone and tedious. Developers must understand each tool's config
+  format, manually copy tokens, and deal with silent failures when tokens expire.
+  Does not scale across teams or tools.
+- **Why not chosen**: The manual approach is the status quo and is the problem this
+  RFC solves.
+
+## Compatibility
+
+### Backward Compatibility
+
+This is a purely additive change. The `thv llm` command group is new — no existing
+commands are modified. The `LLM` field in the config struct is optional and defaults
+to zero values (unconfigured).
+
+Existing ToolHive functionality (MCP tool management, secrets, registry) is
+unaffected.
+
+### Forward Compatibility
+
+- The tool registry is designed for extension — adding new tools requires
+  implementing Detect/Apply/Revert functions with no changes to orchestration.
+- The config model includes optional fields (audience, callback port) that
+  accommodate different IdP requirements.
+- Daemon mode, `proxy stop`/`status`, and richer tool lifecycle management can be
+  layered on in future RFCs without breaking the foundation established here.
+
+## Implementation Plan
+
+### Phase 1: Core Infrastructure
+
+Build the foundational components that all commands depend on:
+
+- **Config types** — `LLMConfig`, `LLMOIDCConfig`, `LLMProxyConfig`, `LLMAuthState`
+  in `pkg/llm/config.go`. Persisted via ToolHive's existing `UpdateConfig()` with
+  file locking, stored in `config.yaml` under the `llm:` key — the same persistence
+  pattern used by runtime configs and registry auth.
+- **Token source** — Three-tier acquisition in `pkg/llm/internal/token_source.go`.
+  Refresh tokens stored via the secrets provider under `ScopeLLM`.
+- **Reverse proxy** — `pkg/llm/internal/proxy.go` with token injection, SSE
+  streaming, loopback validation, health endpoint.
+- **CLI commands** — `config set/show/reset`, `proxy start` (foreground), `token`
+  (hidden).
+
+### Phase 2: Auto-Wiring
+
+Build the tool detection and auto-configuration layer on top of Phase 1:
+
+- **Tool detection and registry** — `pkg/llm/internal/setup.go` with Detect/Apply/Revert
+  functions for Claude Code (direct mode) and Cursor (proxy mode).
+- **Helper script generation** — Write `thv-llm-token` script for direct-mode tools.
+- **Client config patching** — Modify tool configs in place using ToolHive's existing
+  `pkg/client/config_editor.go` infrastructure. Teardown removes added keys via
+  RFC 6902 "remove" patches.
+- **Background proxy management** — Start/stop proxy as a detached process, PID
+  tracking.
+- **CLI commands** — `setup` and `teardown`.
+- **`configured_tools` tracking** — Record what setup configured in
+  `Config.LLM.ConfiguredTools` so teardown knows exactly what to reverse.
+
+### Dependencies
+
+- [RFC-0056: Scoped Secret Store](THV-0056-scoped-secret-store.md) — The `ScopeLLM`
+  secret scope for isolated token storage.
+- Existing `pkg/auth/oauth` — OIDC authorization code flow with PKCE.
+- Existing `pkg/secrets/` — Secrets provider (keyring + encrypted file fallback).
+- Existing `pkg/config/` — Config loading, saving, and file-locked updates.
+
+## Testing Strategy
+
+### Unit Tests
+
+- **Proxy** — `httptest` server as upstream. Verify token injection, header
+  stripping, SSE streaming passthrough, error handling (token failure → 401,
+  upstream failure → 502), health endpoint.
+- **Token source** — Mock OIDC provider and mock secrets provider. Verify the
+  three-tier fallback chain, preemptive refresh behavior, non-interactive mode
+  rejection.
+- **Config validation** — Test `IsConfigured()`, `Validate()`, HTTPS enforcement,
+  port range checks, default values.
+- **Setup/teardown** — Filesystem fixtures (temporary directories with mock tool
+  config files). Verify detection, config patching, key removal on teardown,
+  helper script content.
+
+### Adding Tests for New Tools
+
+Each tool's Detect/Apply/Revert functions are independently testable. When adding
+support for a new tool, the test pattern is:
+
+1. Create a fixture directory mimicking the tool's config layout.
+2. Test **Detect** — returns true when the fixture exists, false otherwise.
+3. Test **Apply** — verify the config file is patched with the expected keys/values.
+4. Test **Revert** — verify the added keys are removed from the config file.
+
+This pattern is self-contained per tool. No changes to proxy, token source, or
+orchestration tests are needed when adding a new tool.
+
+### Integration Tests
+
+- End-to-end proxy flow with a mock upstream server that validates the injected
+  bearer token and returns streamed responses.
+- Setup/teardown round-trip: run setup, verify tool configs are patched, run
+  teardown, verify added keys are removed.
+
+## Open Questions
+
+1. What is the priority of additional tool support beyond Claude Code and Cursor?
+
+## References
+
+- [RFC-0056: Scoped Secret Store](THV-0056-scoped-secret-store.md)
+
+---
+
+## RFC Lifecycle
+
+<!-- This section is maintained by RFC reviewers -->
+
+### Review History
+
+| Date | Reviewer | Decision | Notes |
+|------|----------|----------|-------|
+| 2026-04-17 | @jerm-dro | Draft | Initial submission |
+
+### Implementation Tracking
+
+| Repository | PR | Status |
+|------------|-----|--------|
+| toolhive | TBD | Not started |

From 9e0e200afbe2cd1668d89b67cddea6d919dfbc70 Mon Sep 17 00:00:00 2001
From: Jeremy Drouillard <jeremy@stacklok.com>
Date: Fri, 17 Apr 2026 14:54:28 -0700
Subject: [PATCH 2/4] Rename RFC to THV-0070 to match PR number

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
 ...058-thv-llm-subcommand.md => THV-0070-thv-llm-subcommand.md} | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
 rename rfcs/{THV-0058-thv-llm-subcommand.md => THV-0070-thv-llm-subcommand.md} (99%)

diff --git a/rfcs/THV-0058-thv-llm-subcommand.md b/rfcs/THV-0070-thv-llm-subcommand.md
similarity index 99%
rename from rfcs/THV-0058-thv-llm-subcommand.md
rename to rfcs/THV-0070-thv-llm-subcommand.md
index 66e6775..040a036 100644
--- a/rfcs/THV-0058-thv-llm-subcommand.md
+++ b/rfcs/THV-0070-thv-llm-subcommand.md
@@ -1,4 +1,4 @@
-# RFC-0058: `thv llm` Subcommand for LLM Gateway Authentication
+# RFC-0070: `thv llm` Subcommand for LLM Gateway Authentication
 
 - **Status**: Draft
 - **Author(s)**: Jeremy Drouillard (@jerm-dro)

From c1c2f59393eba1aa662704b1eb797a758830cdfe Mon Sep 17 00:00:00 2001
From: Jeremy Drouillard <jeremy@stacklok.com>
Date: Fri, 17 Apr 2026 15:11:22 -0700
Subject: [PATCH 3/4] Address review feedback on RFC-0070

- Fix inconsistent teardown language (remove backup/restore references)
- Remove mermaid diagram
- Add Audit and Logging security subsection
- Remove all ScopeLLM references (implementation detail)
- Clarify token helper is preferred over background proxy
- Clarify config set vs setup relationship
- Add Documentation section (CLI help text only)
- Fix nitpick on Revert description

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
 rfcs/THV-0070-thv-llm-subcommand.md | 103 ++++++++++------------------
 1 file changed, 38 insertions(+), 65 deletions(-)

diff --git a/rfcs/THV-0070-thv-llm-subcommand.md b/rfcs/THV-0070-thv-llm-subcommand.md
index 040a036..bb9fc48 100644
--- a/rfcs/THV-0070-thv-llm-subcommand.md
+++ b/rfcs/THV-0070-thv-llm-subcommand.md
@@ -104,16 +104,17 @@ To reverse the configuration:
 $ thv llm teardown
 ```
 
-This restores each tool's original configuration from backups, stops the background
-proxy if running, and optionally purges cached OIDC tokens with `--purge-tokens`.
-Teardown can also target specific tools: `thv llm teardown cursor`.
+This removes the LLM gateway configuration from each tool's config file, stops the
+background proxy if running, and optionally purges cached OIDC tokens with
+`--purge-tokens`. Teardown can also target specific tools:
+`thv llm teardown cursor`.
 
 ### Command Structure
 
 ```
 thv llm
 ├── setup          # Detect tools, configure them, start proxy, trigger OIDC login
-├── teardown       # Reverse setup — restore configs, stop proxy
+├── teardown       # Reverse setup — remove added config keys, stop proxy
 ├── config
 │   ├── set        # Manually adjust gateway URL, OIDC settings, ports
 │   ├── show       # Display current LLM config (text or JSON)
@@ -128,8 +129,9 @@ thv llm
 - **`setup` / `teardown`** are the primary commands most users interact with. They
   handle the full lifecycle.
 - **`config set/show/reset`** give advanced users manual control over gateway
-  connection settings without re-running setup. Useful for changing the gateway URL,
-  adjusting OIDC scopes, or switching environments.
+  connection settings. Both `setup` and `config set` write to the same config —
+  they are two paths to configure the `thv llm` commands. `config set` does not
+  modify running proxy instances or already-wired tool configs.
 - **`proxy start`** runs the proxy in the foreground with full log output. This is a
   debugging tool — when something isn't working, running the proxy interactively
   lets the developer see request/response flow and token acquisition in real time.
@@ -137,44 +139,6 @@ thv llm
   directly — it exists as the target for tool-specific token helpers
   (`apiKeyHelper`, `auth.command`). It prints a single JWT to stdout and exits.
 
-### High-Level Design
-
-```mermaid
-flowchart TD
-    subgraph "thv llm setup"
-        S[Setup] --> D[Detect installed tools]
-        D --> CD[Configure Direct tools<br/>Claude Code]
-        D --> CP[Configure Proxy tools<br/>Cursor]
-        CD --> HS[Write helper script<br/>→ thv llm token]
-        CP --> SP[Start background proxy]
-        S --> OL[Trigger OIDC login]
-    end
-
-    subgraph "Token Source"
-        TS[TokenSource]
-        TS --> |1| MC[In-memory cache]
-        TS --> |2| SR[Secrets provider<br/>restore refresh token]
-        TS --> |3| BF[Browser OIDC+PKCE flow]
-    end
-
-    subgraph "Direct Mode — OIDC-capable tools"
-        CC[Claude Code] --> |apiKeyHelper| TH[thv llm token]
-        TH --> TS
-        TH --> |JWT to stdout| CC
-        CC --> |Authorization: Bearer JWT| GW[LLM Gateway]
-    end
-
-    subgraph "Proxy Mode — Static-key tools"
-        CU[Cursor] --> |static key + request| PX[Localhost Proxy :14000]
-        PX --> |strip auth| PX
-        PX --> TS
-        PX --> |inject Bearer JWT| GW
-    end
-```
-
-Both modes share the same `TokenSource` and OIDC session. A single browser login
-covers all configured tools.
-
 ### Token Source
 
 The token source manages OIDC token acquisition and refresh with a three-tier
@@ -183,8 +147,8 @@ strategy:
 1. **In-memory cache** — If a valid (non-expired) access token exists in memory,
    return it immediately. No I/O, no network.
 2. **Secrets provider restore** — If no in-memory token exists, attempt to load the
-   refresh token from the secrets provider (OS keyring or encrypted file, under the
-   `ScopeLLM` prefix). Use it to obtain a fresh access token from the OIDC provider.
+   refresh token from the secrets provider (OS keyring or encrypted file). Use it to
+   obtain a fresh access token from the OIDC provider.
 3. **Browser OIDC+PKCE flow** — If no refresh token is available (first login or
    revoked), launch an authorization code flow with PKCE (S256) via the system
    browser. The user authenticates, and both access and refresh tokens are returned.
@@ -246,10 +210,12 @@ exits. It is the integration point for OIDC-capable tools:
 - **Codex CLI**: configured via `auth.command` — same pattern.
 - **avante.nvim**: configured via a `cmd:` prefix in the API key field.
 
-The command runs non-interactively: it will use cached or refreshed tokens but will
-not launch a browser flow. During execution, stdout is reserved exclusively for the
-JWT — all other output (update checks, warnings, progress) is redirected to stderr
-to prevent corrupting the token value.
+The token helper is the preferred integration mode — it does not require the
+background proxy to be running and handles token lifecycle independently. The command
+runs non-interactively: it will use cached or refreshed tokens but will not launch a
+browser flow. During execution, stdout is reserved exclusively for the JWT — all
+other output (update checks, warnings, progress) is redirected to stderr to prevent
+corrupting the token value.
 
 ### Setup and Teardown
 
@@ -262,7 +228,7 @@ directories and files. Each supported tool defines:
   `~/.claude/settings.json` or `~/.claude/` exist?).
 - **Apply** — How to configure the tool to use the gateway (what config keys to set,
   what values).
-- **Revert** — How to restore the tool's original configuration on teardown.
+- **Revert** — How to remove the configuration added by setup on teardown.
 
 #### Two Tool Kinds
 
@@ -365,7 +331,6 @@ llm:
   proxy:
     listen_port: 14000
   auth:
-    cached_refresh_token_ref: "__thv_llm_llm-refresh-token"
     cached_token_expiry: "2026-04-17T12:00:00Z"
   configured_tools:
     - tool: claude-code
@@ -400,8 +365,7 @@ type LLMProxyConfig struct {
 }
 
 type LLMAuthState struct {
-    CachedRefreshTokenRef string    `yaml:"cached_refresh_token_ref"`
-    CachedTokenExpiry     time.Time `yaml:"cached_token_expiry"`
+    CachedTokenExpiry time.Time `yaml:"cached_token_expiry"`
 }
 
 type LLMToolConfig struct {
@@ -433,9 +397,6 @@ pkg/llm/
     ├── token_source_test.go
     └── validate.go               # Loopback address validation
 
-pkg/secrets/
-└── scoped.go                     # ScopeLLM constant (addition to existing file)
-
 cmd/thv/app/
 └── llm.go                        # CLI commands: setup, teardown, config, proxy start, token
 ```
@@ -475,8 +436,7 @@ proxy misuse. The attack surface is:
 - **Access tokens** are held in memory only and never written to disk or logged.
 - **Refresh tokens** are stored in the OS keyring (macOS Keychain, Linux
   Secret Service) via the secrets provider, falling back to AES-256-GCM encrypted
-  file storage. They are stored under the `ScopeLLM` prefix (`__thv_llm_`),
-  isolated from user-managed secrets.
+  file storage.
 - **Token values are never logged** — log messages reference token metadata (expiry,
   scope) but never the token string itself.
 
@@ -491,13 +451,22 @@ proxy misuse. The attack surface is:
 
 ### Secrets Management
 
-- Refresh tokens are stored via ToolHive's existing secrets provider under the
-  `ScopeLLM` scope, leveraging the scoped secret store (RFC-0056).
+- Refresh tokens are stored via ToolHive's existing secrets provider.
 - Tokens are revocable — `thv llm config reset` and `thv llm teardown --purge-tokens`
   delete cached tokens from the secrets provider.
 - The `thv llm token` command redirects all non-token output to stderr, preventing
   accidental token leakage into log files or terminal scrollback.
 
+### Audit and Logging
+
+Basic audit logging for security-relevant operations:
+
+- **Proxy**: Logs each proxied request (method, path, upstream status code) without
+  logging token values or request/response bodies.
+- **Token execution**: Logs token acquisition events (cache hit, refresh, browser
+  flow triggered) and failures, with token expiry metadata but never the token itself.
+- **Setup/teardown**: Logs which tools were detected, configured, and removed.
+
 ### Mitigations
 
 | Threat | Mitigation |
@@ -565,7 +534,7 @@ Build the foundational components that all commands depend on:
   file locking, stored in `config.yaml` under the `llm:` key — the same persistence
   pattern used by runtime configs and registry auth.
 - **Token source** — Three-tier acquisition in `pkg/llm/internal/token_source.go`.
-  Refresh tokens stored via the secrets provider under `ScopeLLM`.
+  Refresh tokens stored via the secrets provider.
 - **Reverse proxy** — `pkg/llm/internal/proxy.go` with token injection, SSE
   streaming, loopback validation, health endpoint.
 - **CLI commands** — `config set/show/reset`, `proxy start` (foreground), `token`
@@ -589,11 +558,10 @@ Build the tool detection and auto-configuration layer on top of Phase 1:
 
 ### Dependencies
 
-- [RFC-0056: Scoped Secret Store](THV-0056-scoped-secret-store.md) — The `ScopeLLM`
-  secret scope for isolated token storage.
 - Existing `pkg/auth/oauth` — OIDC authorization code flow with PKCE.
 - Existing `pkg/secrets/` — Secrets provider (keyring + encrypted file fallback).
 - Existing `pkg/config/` — Config loading, saving, and file-locked updates.
+- Existing `pkg/client/` — Client config editing infrastructure.
 
 ## Testing Strategy
 
@@ -631,13 +599,18 @@ orchestration tests are needed when adding a new tool.
 - Setup/teardown round-trip: run setup, verify tool configs are patched, run
   teardown, verify added keys are removed.
 
+## Documentation
+
+CLI help text for all `thv llm` commands. No additional user documentation in this
+initial iteration.
+
 ## Open Questions
 
 1. What is the priority of additional tool support beyond Claude Code and Cursor?
 
 ## References
 
-- [RFC-0056: Scoped Secret Store](THV-0056-scoped-secret-store.md)
+None.
 
 ---
 

From 8916b5942b5326797a53a286233e364874d1528d Mon Sep 17 00:00:00 2001
From: Jeremy Drouillard <jeremy@stacklok.com>
Date: Wed, 22 Apr 2026 14:48:34 -0700
Subject: [PATCH 4/4] Add non-goals for unified SSO and HTTP API exposure

Address review feedback from @JAORMX and self-review:
- Unified MCP/LLM single sign-on is out of scope due to flow differences
- HTTP API exposure is future work; core logic designed to be reusable

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
 rfcs/THV-0070-thv-llm-subcommand.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/rfcs/THV-0070-thv-llm-subcommand.md b/rfcs/THV-0070-thv-llm-subcommand.md
index bb9fc48..e338a39 100644
--- a/rfcs/THV-0070-thv-llm-subcommand.md
+++ b/rfcs/THV-0070-thv-llm-subcommand.md
@@ -64,6 +64,14 @@ developers already use. It is the natural home for solving this problem.
   illustrate the two authentication modes (direct token helper and proxy
   respectively). Support for additional tools will be evaluated on a case-by-case
   basis.
+- Unified single sign-on across MCP and LLM gateway authentication — the OIDC flows
+  for MCP tool authentication and LLM gateway access are fundamentally different.
+  Unifying them into a single login would add significant complexity. Two separate
+  logins is acceptable for now.
+- HTTP API exposure — the config model and core logic are designed to be reusable,
+  but this RFC does not define API endpoints for managing LLM gateway configuration.
+  Exposing these settings via the HTTP API (e.g., for UI-driven setup) is future
+  work.
 
 ## Proposed Solution