# LunaRoute

> Your AI Coding Assistant's Best Friend
A blazing-fast local proxy for AI coding assistants that gives you complete visibility into every LLM interaction. Zero configuration, sub-millisecond overhead, and powerful debugging capabilities.
```bash
eval $(lunaroute-server env)
```

That's it! This single command:

- Starts the LunaRoute server in the background
- Configures Claude Code (sets `ANTHROPIC_BASE_URL`)
- Configures Codex CLI (sets `OPENAI_BASE_URL`)
- Accepts both OpenAI and Anthropic formats simultaneously
- Tracks every token, tool call, and conversation

Start coding with your AI assistant immediately - both APIs are ready to use!
If you prefer to run the server manually:

```bash
# Terminal 1: Start the server
lunaroute-server

# Terminal 2: Point your AI tools to it
export ANTHROPIC_BASE_URL=http://localhost:8081   # For Claude Code
export OPENAI_BASE_URL=http://localhost:8081/v1   # For Codex CLI
```

That's it. No API keys to configure, no YAML files to write, nothing. LunaRoute automatically:

- Accepts both OpenAI and Anthropic formats simultaneously
- Runs in dual passthrough mode (zero normalization overhead)
- Uses your existing API keys (from env vars or client headers)
- Tracks every token, tool call, and conversation
- Routes requests based on model prefix (`gpt-*` → OpenAI, `claude-*` → Anthropic)
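The prefix rule above is simple enough to sketch in a few lines. This is an illustrative Python sketch of the routing decision only, not LunaRoute's actual (Rust) implementation:

```python
def route_by_model_prefix(model: str) -> str:
    """Pick a provider from the model name, mirroring the prefix rule
    described above (illustrative sketch, not LunaRoute's Rust code)."""
    if model.startswith("gpt-"):
        return "openai"
    if model.startswith("claude-"):
        return "anthropic"
    # Unknown prefixes fall through to LunaRoute's bypass path;
    # here we just flag them explicitly.
    return "unknown"

print(route_by_model_prefix("gpt-4o"))          # openai
print(route_by_model_prefix("claude-3-opus"))   # anthropic
```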
Stop flying blind. LunaRoute records every interaction with zero configuration:

- **Debug AI conversations** - see exactly what your assistant sends and receives
- **Track token usage** - input, output, and thinking tokens broken down by session
- **Analyze tool performance** - which tools are slow? Which get used most?
- **Measure overhead** - is it the LLM or your code that's slow?
- **Search past sessions** - "How did the AI solve that bug last week?"
```bash
# Literally just one command
eval $(lunaroute-server env)
```

What you get instantly:

- **Dual API support** - OpenAI `/v1/chat/completions` + Anthropic `/v1/messages`
- **Passthrough mode** - sub-millisecond overhead, 100% API fidelity
- **Automatic auth** - uses environment variables or client-provided keys
- **Session tracking** - SQLite database + JSONL logs for deep analysis
- **Web UI** - browse sessions at `http://localhost:8082`
- **Background server** - detached process that keeps running after the terminal closes
How it works:

```bash
$ lunaroute-server env
export ANTHROPIC_BASE_URL=http://127.0.0.1:8081
export OPENAI_BASE_URL=http://127.0.0.1:8081/v1
# LunaRoute server started on http://127.0.0.1:8081
# Web UI available at http://127.0.0.1:8082

$ eval $(lunaroute-server env)  # Sets env vars and starts server
$ # Now use your AI tools normally - they're automatically configured!
```

- **PII redaction** - auto-detects and redacts emails, SSNs, credit cards, and phone numbers
- **0.1-0.2ms overhead** - sub-millisecond proxy latency in passthrough mode
- **Zero-trust storage** - redact before hitting disk, not after
- **Local first** - all data stays on your machine
Download the latest release for your platform from GitHub Releases:

```bash
# Linux/macOS: Extract and run
tar -xzf lunaroute-server-*.tar.gz
chmod +x lunaroute-server

# Optional: Add to PATH for global access
sudo mv lunaroute-server /usr/local/bin/

# Start using it immediately!
eval $(lunaroute-server env)
```

Or build from source:

```bash
git clone https://github.com/erans/lunaroute.git
cd lunaroute
cargo build --release --package lunaroute-server

# Binary location: target/release/lunaroute-server

# Start using it!
eval $(./target/release/lunaroute-server env)
```

The fastest way to get started - automatically starts the server and configures your shell:
```bash
# One command does everything!
eval $(lunaroute-server env)

# Now use your AI tools - they're automatically configured
# Both Claude Code and Codex CLI work immediately
```

What happens:

- Server starts in the background on port 8081
- `ANTHROPIC_BASE_URL` set to `http://127.0.0.1:8081`
- `OPENAI_BASE_URL` set to `http://127.0.0.1:8081/v1`
- Web UI available at `http://127.0.0.1:8082`

Custom port:

```bash
eval $(lunaroute-server env --port 8090)
```

Stop the server:

```bash
pkill -f "lunaroute-server serve"
```

If you prefer to manage the server yourself:
```bash
# Terminal 1: Start LunaRoute
lunaroute-server

# Terminal 2: Configure your shell
export ANTHROPIC_BASE_URL=http://localhost:8081
export OPENAI_BASE_URL=http://localhost:8081/v1

# Now use Claude Code or Codex CLI
# View sessions at http://localhost:8082
```

What you get out of the box:

```
OpenAI provider enabled (no API key - will use client auth)
Anthropic provider enabled (no API key - will use client auth)
API dialect: Both (OpenAI + Anthropic)
Dual passthrough mode: OpenAI→OpenAI + Anthropic→Anthropic (no normalization)
  - gpt-* models → OpenAI provider (passthrough)
  - claude-* models → Anthropic provider (passthrough)
Bypass enabled for unknown API paths
Session recording enabled (SQLite + JSONL)
Web UI available at http://localhost:8082
```
Need more control? Use a config file:

```yaml
# Save as config.yaml
host: "127.0.0.1"
port: 8081
api_dialect: "both"  # Already the default!

providers:
  openai:
    enabled: true
    base_url: "https://api.openai.com/v1"  # Or use ChatGPT backend
    # api_key: "sk-..."  # Optional: defaults to OPENAI_API_KEY env var
  anthropic:
    enabled: true
    # api_key: "sk-ant-..."  # Optional: defaults to ANTHROPIC_API_KEY env var

session_recording:
  enabled: true
  sqlite:
    enabled: true
    path: "~/.lunaroute/sessions.db"
  jsonl:
    enabled: true
    directory: "~/.lunaroute/sessions"
  retention:
    max_age_days: 30
    max_size_mb: 1024

ui:
  enabled: true
  host: "127.0.0.1"
  port: 8082
```

```bash
lunaroute-server --config config.yaml
```

LunaRoute accepts both OpenAI and Anthropic formats simultaneously with zero normalization:
- OpenAI format at `/v1/chat/completions` → routed to OpenAI
- Anthropic format at `/v1/messages` → routed to Anthropic
- Zero overhead - ~0.1-0.2ms added latency
- 100% API fidelity - preserves extended thinking and all response fields
- No normalization - direct passthrough to the native API
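The two dialects differ in both path and payload shape. Here is a minimal sketch of what each client format sends through the proxy; the field names follow the public OpenAI and Anthropic APIs, the model names are illustrative placeholders, and the base URL assumes the default port:

```python
base = "http://localhost:8081"

# OpenAI dialect: POST /v1/chat/completions
openai_url = f"{base}/v1/chat/completions"
openai_body = {
    "model": "gpt-4o",  # example model name
    "messages": [{"role": "user", "content": "hello"}],
}

# Anthropic dialect: POST /v1/messages
anthropic_url = f"{base}/v1/messages"
anthropic_body = {
    "model": "claude-3-opus",  # example model name
    "max_tokens": 1024,  # required by the Anthropic Messages API
    "messages": [{"role": "user", "content": "hello"}],
}

print(openai_url)
print(anthropic_url)
```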
Track everything that matters with dual storage.

**SQLite database** - fast queries and analytics:

```sql
SELECT model_used, COUNT(*), SUM(input_tokens), SUM(output_tokens)
FROM sessions
WHERE started_at > datetime('now', '-7 days')
GROUP BY model_used;
```

**JSONL logs** - human-readable, full request/response data:

```bash
# Watch live sessions
tail -f ~/.lunaroute/sessions/$(date +%Y-%m-%d)/session_*.jsonl | jq

# Search for specific content
grep -r "TypeError" ~/.lunaroute/sessions/
```

Get detailed breakdowns on shutdown or via the API:
```
Session Statistics Summary
───────────────────────────────────────────────────────────────
Session: 550e8400-e29b-41d4-a716-446655440000
Requests: 5
Input tokens: 2,450
Output tokens: 5,830
Thinking tokens: 1,200
Total tokens: 9,480

Tool usage:
  Read: 12 calls (avg 45ms)
  Write: 8 calls (avg 120ms)
  Bash: 3 calls (avg 850ms)

Performance:
  Avg response time: 2.3s
  Proxy overhead: 12ms total (0.5%)
  Provider latency: 2.288s (99.5%)

Estimated cost: $0.14 USD
```
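The SQL shown earlier can also be run from a script. Below is a sketch using Python's stdlib `sqlite3`; the table and column names are taken from that example query, and the demo builds a throwaway in-memory table rather than touching the real database at `~/.lunaroute/sessions.db`:

```python
import sqlite3

# Schema assumed from the README's example query; the real database
# lives at ~/.lunaroute/sessions.db.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sessions (
        model_used TEXT,
        input_tokens INTEGER,
        output_tokens INTEGER,
        started_at TEXT
    )
""")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?, datetime('now'))",
    [("gpt-4o", 2450, 5830), ("gpt-4o", 100, 200), ("claude-3-opus", 50, 75)],
)

# Same aggregate as the README's SQL example
rows = conn.execute("""
    SELECT model_used, COUNT(*), SUM(input_tokens), SUM(output_tokens)
    FROM sessions
    WHERE started_at > datetime('now', '-7 days')
    GROUP BY model_used
""").fetchall()
for model, count, in_tok, out_tok in rows:
    print(f"{model}: {count} requests, {in_tok} in / {out_tok} out")
```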
Protect sensitive data automatically before it hits disk:

```yaml
session_recording:
  pii:
    enabled: true
    detect_email: true
    detect_phone: true
    detect_ssn: true
    detect_credit_card: true
    redaction_mode: "tokenize"  # mask, remove, tokenize, or partial
```

```
Before: My email is john.doe@example.com and SSN is 123-45-6789
After:  My email is [EMAIL:a3f8e9d2] and SSN is [SSN:7b2c4f1a]
```
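Under the hood this is pattern detection plus deterministic tokenization. LunaRoute's detectors live in the Rust `lunaroute-pii` crate; the sketch below only illustrates the `tokenize` mode's output format using deliberately simplified patterns - real email/SSN detection is more involved:

```python
import hashlib
import re

# Simplified patterns for illustration only; the real detectors in
# lunaroute-pii are more thorough.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def tokenize(kind: str, value: str) -> str:
    # Deterministic token: the same value always maps to the same tag,
    # so redacted logs stay correlatable without storing the PII itself.
    digest = hashlib.sha256(value.encode()).hexdigest()[:8]
    return f"[{kind}:{digest}]"

def redact(text: str) -> str:
    for kind, pattern in PATTERNS.items():
        text = pattern.sub(lambda m, k=kind: tokenize(k, m.group()), text)
    return text

print(redact("My email is john.doe@example.com and SSN is 123-45-6789"))
```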
24 metric types exposed at `/metrics`:
- Request rates (total, success, failure)
- Latency histograms (P50, P95, P99)
- Token usage (input/output/thinking)
- Tool call statistics
- Streaming performance
Perfect for Grafana dashboards.
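The `/metrics` endpoint serves the standard Prometheus text exposition format, so anything that speaks that format can scrape it. A tiny parser sketch - note the metric names below are hypothetical placeholders, not LunaRoute's actual metric names:

```python
# Hypothetical sample in Prometheus text exposition format;
# LunaRoute's real metric names may differ.
sample = """\
# HELP requests_total Total requests handled
# TYPE requests_total counter
requests_total{provider="openai"} 1042
requests_total{provider="anthropic"} 587
"""

def parse_metrics(text: str) -> dict[str, float]:
    """Parse 'name{labels} value' lines, skipping comments and blanks."""
    out = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, value = line.rsplit(" ", 1)
        out[name] = float(value)
    return out

metrics = parse_metrics(sample)
print(metrics['requests_total{provider="openai"}'])  # 1042.0
```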
Built-in web interface for browsing sessions:

```bash
# Automatically available at http://localhost:8082
lunaroute-server
```

Features:

- Dashboard with filtering and search
- Session details with timeline view
- Raw JSON inspection
- Token usage and performance analytics
The `env` command starts the server in the background and prints shell commands to configure your environment:

```bash
# Basic usage - starts server on default port 8081
eval $(lunaroute-server env)

# Custom port
eval $(lunaroute-server env --port 8090)

# Custom host and port
eval $(lunaroute-server env --host 0.0.0.0 --port 8090)

# Check what it does without executing
lunaroute-server env
# Output:
# export ANTHROPIC_BASE_URL=http://127.0.0.1:8081
# export OPENAI_BASE_URL=http://127.0.0.1:8081/v1
# # LunaRoute server started on http://127.0.0.1:8081
# # Web UI available at http://127.0.0.1:8082
```

What happens:

- Server starts in the background (detached from the terminal)
- Environment variables are set in your current shell
- Server keeps running even if you close the terminal
- Both Claude Code and Codex CLI are instantly configured

Stop the server:

```bash
pkill -f "lunaroute-server serve"
```

Control behavior without config files:
```bash
# API dialect (default: both)
export LUNAROUTE_DIALECT=both  # openai, anthropic, or both

# Provider API keys (optional - can use client headers)
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."

# Session recording
export LUNAROUTE_ENABLE_SESSION_RECORDING=true
export LUNAROUTE_ENABLE_SQLITE_WRITER=true
export LUNAROUTE_SESSIONS_DB_PATH="~/.lunaroute/sessions.db"
export LUNAROUTE_ENABLE_JSONL_WRITER=true
export LUNAROUTE_SESSIONS_DIR="~/.lunaroute/sessions"

# Logging
export LUNAROUTE_LOG_LEVEL=info  # trace, debug, info, warn, error
export LUNAROUTE_LOG_REQUESTS=false

# Server settings
export LUNAROUTE_HOST=127.0.0.1
export LUNAROUTE_PORT=8081

# UI
export LUNAROUTE_UI_ENABLED=true
export LUNAROUTE_UI_PORT=8082
```

Tune HTTP client performance:
```yaml
providers:
  openai:
    http_client:
      timeout_secs: 600            # Request timeout (default: 600)
      connect_timeout_secs: 10     # Connection timeout (default: 10)
      pool_max_idle_per_host: 32   # Pool size (default: 32)
      pool_idle_timeout_secs: 600  # Idle timeout (default: 600)
      tcp_keepalive_secs: 60       # Keepalive (default: 60)
      max_retries: 3               # Retries (default: 3)
```

Or use environment variables:

```bash
export LUNAROUTE_OPENAI_TIMEOUT_SECS=300
export LUNAROUTE_OPENAI_POOL_MAX_IDLE=64
```

See Connection Pool Configuration for details.
LunaRoute can automatically notify users when requests are routed to alternative providers due to rate limits, errors, or circuit breaker events.
Features:
- Automatic user notifications via the LLM response
- Global on/off with per-provider customization
- Works with cross-dialect failover (OpenAI ↔ Claude)
- Template variables for customization
- Idempotent (no duplicate notifications)
Configuration:
```yaml
routing:
  provider_switch_notification:
    enabled: true
    default_message: |
      IMPORTANT: Please inform the user that due to temporary service constraints,
      their request is being handled by an alternative AI service provider.
      Continue with their original request.

providers:
  anthropic-backup:
    type: "anthropic"
    # Custom message when THIS provider is used as the alternative
    switch_notification_message: |
      Using Claude due to ${reason}. Quality remains the same.
```

Template variables:

- `${original_provider}` - provider that failed
- `${new_provider}` - provider being used
- `${reason}` - generic reason (high demand, service issue, maintenance)
- `${model}` - model name
See `examples/configs/provider-switch-notification.yaml` for a complete example.
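The `${...}` syntax matches Python's `string.Template`, which makes the substitution easy to prototype. An illustrative sketch with hypothetical variable values (LunaRoute performs this in Rust):

```python
from string import Template

# Hypothetical values for the template variables documented above.
context = {
    "original_provider": "openai",
    "new_provider": "anthropic-backup",
    "reason": "high demand",
    "model": "gpt-4o",
}

message = Template(
    "Using ${new_provider} instead of ${original_provider} due to ${reason}."
)
print(message.substitute(context))
# Using anthropic-backup instead of openai due to high demand.
```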
**Problem:** Your AI session cost $5 but you don't know why.
**Solution:** Check session stats to see that the AI's output was extremely verbose.

**Problem:** Your AI assistant feels slow.
**Solution:** Session statistics reveal Bash commands take 850ms on average - optimize those!

**Problem:** Team shares one proxy but everyone has different API keys.
**Solution:** LunaRoute uses client-provided auth - no shared secrets needed.

**Problem:** Need to log sessions but can't store PII.
**Solution:** Enable automatic PII redaction - all sensitive data is removed before hitting disk.
- Config Examples - Pre-built configs for common scenarios
- Server README - Complete configuration reference
- Claude Code Guide - Claude Code integration
- Connection Pool Configuration - HTTP client tuning
- PII Detection - PII redaction details
- **Claude Code** - full passthrough support, zero config
- **OpenAI Codex CLI** - automatic auth.json integration
- **OpenCode** - standard OpenAI/Anthropic API compatibility
- **Custom Clients** - any tool using OpenAI or Anthropic APIs
**Proxy (passthrough mode):**

- Added latency: 0.1-0.2ms (P95 < 0.5ms)
- Memory overhead: ~2MB baseline + ~1KB per request
- CPU usage: <1% idle, <5% at 100 RPS
- API fidelity: 100% (zero-copy proxy)

**Session recording:**

- Added latency: 0.5-1ms (async, non-blocking)
- Disk I/O: batched writes every 100ms
- Storage: ~10KB per request (uncompressed), ~1KB (compressed)

**Code quality:**

- Test coverage: 73.35% (2042/2784 lines)
- Unit tests: 544 passing
- Integration tests: 11 test files
- Clippy warnings: 0
LunaRoute is built as a modular Rust workspace:

```
┌──────────────────────────────────────────────────────────┐
│        Claude Code / OpenAI Codex CLI / OpenCode         │
└────────────────────────────┬─────────────────────────────┘
                             │ HTTP/SSE
                             ▼
┌──────────────────────────────────────────────────────────┐
│                     LunaRoute Proxy                      │
│  ┌────────────────────────────────────────────────────┐  │
│  │        Ingress (Anthropic/OpenAI endpoints)        │  │
│  └─────────────────────────┬──────────────────────────┘  │
│  ┌─────────────────────────▼──────────────────────────┐  │
│  │  Dual Passthrough Mode (Zero-copy, 100% fidelity)  │  │
│  └─────────────────────────┬──────────────────────────┘  │
│  ┌─────────────────────────▼──────────────────────────┐  │
│  │ Session Recording (JSONL + SQLite, PII redaction)  │  │
│  └─────────────────────────┬──────────────────────────┘  │
│  ┌─────────────────────────▼──────────────────────────┐  │
│  │  Metrics & Statistics (Prometheus, session stats)  │  │
│  └─────────────────────────┬──────────────────────────┘  │
│  ┌─────────────────────────▼──────────────────────────┐  │
│  │            Egress (Provider connectors)            │  │
│  └────────────────────────────────────────────────────┘  │
└────────────────────────────┬─────────────────────────────┘
                             │ HTTP/SSE
                             ▼
        ┌────────────────────┐   ┌────────────────────┐
        │     OpenAI API     │   │   Anthropic API    │
        │  (api.openai.com)  │   │   (api.anth...)    │
        └────────────────────┘   └────────────────────┘
```
Key crates:

- `lunaroute-core` - types and traits
- `lunaroute-ingress` - HTTP endpoints (OpenAI, Anthropic)
- `lunaroute-egress` - provider connectors with connection pooling
- `lunaroute-session` - recording and search
- `lunaroute-pii` - PII detection/redaction
- `lunaroute-observability` - metrics and health
- `lunaroute-server` - production binary
We welcome contributions! Whether it's:
- Bug reports and fixes
- New PII detectors
- Additional metrics
- Documentation improvements
- Performance optimizations
Please see CONTRIBUTING.md for guidelines.
Licensed under the Apache License, Version 2.0 (LICENSE).
Like the moon guides travelers at night, LunaRoute illuminates your AI interactions. Every request, every token, every decision - visible and trackable.
Built with ❤️ for developers who want visibility, control, and performance.
Give your AI coding assistant the visibility it deserves.
```bash
lunaroute-server
```