Pull request overview
This PR reworks Hadrian’s authentication configuration and runtime logic to use a single global [auth.mode] across gateway and admin endpoints, removing the legacy “gateway vs admin” auth split and moving JWT/SSO validation to per-organization registries.
Changes:
- Introduces unified auth mode config (`none`, `api_key`, `idp`, `iap`) and updates runtime auth plumbing accordingly (removing global OIDC authenticator and global JWT validator).
- Extends SSRF URL validation with options to allow private IP ranges (while always blocking cloud metadata), and threads these options through OIDC/JWKS discovery paths.
- Updates UI OpenAPI/docs/deploy examples and lockfile overrides to align with the new auth model and dependency security updates.
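The SSRF rule in the second bullet (private ranges are opt-in, cloud metadata is always blocked) can be illustrated with a minimal sketch. This is not the PR's implementation: the struct fields, the `is_ip_allowed` function, and the choice to cover only the IPv4 metadata address `169.254.169.254` are assumptions made for illustration, using only the Rust standard library.

```rust
use std::net::{IpAddr, Ipv4Addr};

/// Hypothetical options struct for illustration; the real
/// `UrlValidationOptions` in the PR may have different fields.
#[derive(Debug, Clone, Copy, Default)]
pub struct UrlValidationOptions {
    pub allow_private: bool,
    pub allow_loopback: bool,
}

/// Well-known IPv4 cloud metadata address (assumed to be the one blocked).
const METADATA_V4: Ipv4Addr = Ipv4Addr::new(169, 254, 169, 254);

/// Returns true if a resolved address may be contacted under `opts`.
pub fn is_ip_allowed(ip: IpAddr, opts: UrlValidationOptions) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            // The metadata endpoint is rejected regardless of any option.
            if v4 == METADATA_V4 {
                return false;
            }
            if v4.is_loopback() {
                return opts.allow_loopback;
            }
            if v4.is_private() {
                return opts.allow_private;
            }
            // Other link-local and unspecified addresses are never allowed.
            !(v4.is_link_local() || v4.is_unspecified())
        }
        IpAddr::V6(v6) => {
            // Unwrap IPv4-mapped addresses so they cannot bypass the IPv4 rules.
            if let Some(v4) = v6.to_ipv4_mapped() {
                return is_ip_allowed(IpAddr::V4(v4), opts);
            }
            if v6.is_loopback() {
                return opts.allow_loopback;
            }
            // Unique local addresses (fc00::/7) are treated like private IPv4.
            if (v6.segments()[0] & 0xfe00) == 0xfc00 {
                return opts.allow_private;
            }
            !v6.is_unspecified()
        }
    }
}
```

The metadata check comes first so that no combination of options can reach it. A production validator would also need to cover hostnames, redirects, and DNS rebinding, which this sketch leaves out.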
Reviewed changes
Copilot reviewed 51 out of 52 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| ui/src/api/openapi.json | Updates embedded API documentation/config examples for new [auth.mode] layout. |
| ui/pnpm-lock.yaml | Updates dependency overrides and lock entries (minimatch, serialize-javascript, etc.). |
| ui/package.json | Updates dependency overrides matching lockfile changes. |
| src/wizard.rs | Updates wizard output/config generation to emit [auth.mode] + related sections. |
| src/validation/url.rs | Adds UrlValidationOptions and validate_base_url_opts, expands SSRF rules + tests. |
| src/validation/mod.rs | Re-exports new URL validation API. |
| src/routes/ws.rs | Migrates WS auth to new auth config accessors and shared session validation. |
| src/routes/execution.rs | Updates test AppState initialization for removed fields. |
| src/routes/auth.rs | Removes global OIDC fallback; standardizes session config lookup and shared session store usage. |
| src/routes/api.rs | Updates comments to reference auth.mode. |
| src/routes/admin/ui_config.rs | Derives UI auth methods from AuthMode. |
| src/routes/admin/sso_connections.rs | Reports SSO connection info based on AuthMode (per-org SSO now). |
| src/routes/admin/sessions.rs | Removes global OIDC authenticator fallback for session store lookup. |
| src/routes/admin/session_info.rs | Reports session/auth mode info based on AuthMode. |
| src/routes/admin/org_sso_configs.rs | Uses new SSRF validation options and threads allow_private_urls into validation/registry updates. |
| src/routes/admin/mod.rs | Updates inline config examples to [auth.mode]/[auth.api_key]/[auth.session]. |
| src/routes/admin/me_api_keys.rs | Uses api_key_config() for key generation prefix selection. |
| src/routes/admin/api_keys.rs | Uses api_key_config() for key generation prefix selection. |
| src/openapi.rs | Updates OpenAPI doc snippets for new auth config structure. |
| src/middleware/combined.rs | Refactors header auth flow to key off AuthMode; updates tests. |
| src/middleware/authz.rs | Updates documentation wording around “auth not configured”. |
| src/middleware/admin.rs | Switches admin auth to per-org registry/session store and updates SSRF options threading. |
| src/main.rs | Removes global OIDC/JWT validator from AppState; updates route wiring and startup logging. |
| src/config/server.rs | Adds allow_private_urls server option with docs/default. |
| src/config/mod.rs | Updates validation rules and feature checks for new auth mode + IAP safety checks. |
| src/config/auth.rs | Introduces AuthMode, IapConfig, and unified auth config accessors/validation/tests. |
| src/auth/session_store.rs | Updates config path references in session model docs. |
| src/auth/registry.rs | Loads only OIDC configs in OIDC registry initialization. |
| src/auth/oidc.rs | Threads allow_private through discovery/JWKS lookup. |
| src/auth/gateway_jwt.rs | Threads allow_private through per-org JWT registry building/lookup. |
| src/auth/discovery.rs | Uses validate_base_url_opts and threads allow_private for SSRF validation. |
| docs/content/docs/troubleshooting.mdx | Updates auth config examples to [auth.mode]/[auth.api_key]. |
| docs/content/docs/security/index.mdx | Renames proxy auth to IAP and aligns examples with new mode model. |
| docs/content/docs/features/sso-admin-guide.mdx | Updates IdP/JWT requirements text to auth.mode = "idp". |
| docs/content/docs/features/multi-tenancy.mdx | Updates session config paths in docs. |
| docs/content/docs/configuration/auth.mdx | Major rewrite: documents unified auth modes, new sections, and removes legacy split. |
| docs/content/docs/authentication.mdx | Major rewrite: describes single [auth.mode] approach and updated scenarios. |
| docs/content/docs/api/authentication.mdx | Updates API auth doc to reflect per-org JWT routing via idp mode. |
| deploy/config/hadrian.university.toml | Updates example config to new auth mode + SSRF allow_private/allow_loopback notes. |
| deploy/config/hadrian.traefik.toml | Updates auth config sections to new layout. |
| deploy/config/hadrian.sqlite.toml | Updates auth config sections to new layout. |
| deploy/config/hadrian.sqlite-redis.toml | Updates auth config sections to new layout. |
| deploy/config/hadrian.saml.toml | Updates example config to idp mode and SSRF allow_private for Docker setups. |
| deploy/config/hadrian.redis-cluster.toml | Updates auth config sections to new layout. |
| deploy/config/hadrian.provider-health.toml | Updates auth config sections to new layout. |
| deploy/config/hadrian.production.toml | Updates auth config sections to new layout. |
| deploy/config/hadrian.postgres.toml | Updates auth config sections and IAP example to new layout. |
| deploy/config/hadrian.postgres-ha.toml | Updates auth config sections to new layout. |
| deploy/config/hadrian.observability.toml | Updates auth config sections to new layout. |
| deploy/config/hadrian.keycloak.toml | Updates example config to idp mode + allow_private/allow_loopback + new sections. |
| deploy/config/hadrian.dlq.toml | Updates auth config sections to new layout. |
| Dockerfile | Forces fresh main-crate rebuild by removing cached artifacts in the build stage. |
Files not reviewed (1)
- ui/pnpm-lock.yaml: Language not supported
    /// Whether admin routes should be protected by authentication middleware.
    ///
    /// Returns true for modes that have an admin auth mechanism (Idp uses sessions/bearer
    /// tokens, Iap uses proxy headers). Returns false for `ApiKey` mode, where only gateway
    /// (API) routes require keys and admin routes are unprotected — matching the legacy
    /// behavior where `[auth.gateway]` could be set without `[auth.admin]`.
    pub fn requires_admin_auth(&self) -> bool {
        match self.mode {
            AuthMode::None | AuthMode::ApiKey => false,
            #[cfg(feature = "sso")]
            AuthMode::Idp => true,
            AuthMode::Iap(_) => true,
        }
AuthMode::ApiKey is documented as “API key required everywhere”, but requires_admin_auth() returns false for ApiKey, which causes admin routes to be mounted without admin_auth_middleware and logged as “UNPROTECTED”. Either update the docs/comments to match the intended behavior, or change requires_admin_auth() (and admin auth implementation) so api_key mode actually protects /admin/* as well.
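The second option in this comment (making `api_key` mode protect admin routes) could look roughly like the sketch below. It uses a simplified stand-in enum: the PR's real `AuthMode` carries a payload on `Iap` and gates `Idp` behind the `sso` feature, and the free function here is hypothetical rather than the method on the PR's config type.

```rust
/// Simplified stand-in for the PR's `AuthMode` (illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthMode {
    None,
    ApiKey,
    Idp,
    Iap,
}

/// Every mode except `None` requires authentication on admin routes.
/// The match is exhaustive on purpose: adding a new mode forces an
/// explicit decision instead of silently defaulting to "unprotected".
pub fn requires_admin_auth(mode: AuthMode) -> bool {
    match mode {
        AuthMode::None => false,
        AuthMode::ApiKey | AuthMode::Idp | AuthMode::Iap => true,
    }
}
```

Returning true for `ApiKey` is only half of the change: the admin middleware would also need a way to validate API keys, which this sketch does not show.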
    @@ -2164,7 +2137,7 @@ pub fn build_app(config: &config::GatewayConfig, state: AppState) -> Router {
             app = app.merge(Router::new().nest("/admin", admin_routes));
         } else {
             tracing::warn!(
    -            "Admin routes are UNPROTECTED - configure auth.admin for Zero Trust or OIDC authentication"
    +            "Admin routes are UNPROTECTED - configure auth.mode type = \"idp\" or \"iap\" for authentication"
             );
build_app treats api_key mode as “admin routes unprotected” because it gates protection on config.auth.requires_admin_auth(). This conflicts with the new auth-mode docs/UI behavior that suggest api_key mode should still require credentials to access admin/control-plane endpoints. If api_key is meant to protect admin routes, this should mount protected routes + middleware for ApiKey too (or otherwise enforce auth on /admin/*).
src/middleware/admin.rs
Outdated
    @@ -1073,8 +1072,25 @@ async fn try_oidc_session_auth(
             .parse()
             .map_err(|_| AuthError::InvalidToken)?;

    -    // Get session from the OIDC authenticator's session store
    -    let session = authenticator.get_session(session_id).await?;
    +    // Get session from the registry's shared session store
    +    let session = match registry.session_store().get_session(session_id).await {
    +        Ok(Some(s)) => s,
    +        Ok(None) => return Ok(None),
    +        Err(e) => {
    +            tracing::warn!(
    +                session_id = %session_id,
    +                error = %e,
    +                "Failed to retrieve OIDC session"
    +            );
    +            return Ok(None);
    +        }
    +    };
    +
    +    // Check if session has expired
    +    if session.is_expired() {
    +        let _ = registry.session_store().delete_session(session_id).await;
    +        return Ok(None);
    +    }
try_oidc_session_auth now fetches the session directly and only checks absolute expiration. This bypasses the shared validate_and_refresh_session logic (inactivity timeout, last_activity refresh, etc.), so enhanced sessions may not behave correctly and idle sessions may remain valid longer than intended. Consider switching to validate_and_refresh_session(registry.session_store().as_ref(), session_id, &session_config.enhanced) and mapping its errors as needed.
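To make the gap concrete, here is a minimal sketch of the two checks this comment contrasts: absolute expiry alone versus absolute expiry plus an inactivity timeout with a `last_activity` refresh. The `Session` struct, the `SessionCheck` enum, and `validate_and_touch` are hypothetical names for illustration; the actual `validate_and_refresh_session` signature and behavior in the codebase may differ.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical session record; field names are assumptions.
#[derive(Debug, Clone)]
pub struct Session {
    pub expires_at: SystemTime,
    pub last_activity: SystemTime,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SessionCheck {
    Valid,
    /// Absolute lifetime exceeded.
    Expired,
    /// Inactivity timeout exceeded.
    Idle,
}

/// Checks absolute expiry, then the optional inactivity timeout, and
/// refreshes `last_activity` only when the session is still valid.
pub fn validate_and_touch(
    session: &mut Session,
    now: SystemTime,
    inactivity_timeout: Option<Duration>,
) -> SessionCheck {
    if now >= session.expires_at {
        return SessionCheck::Expired;
    }
    if let Some(limit) = inactivity_timeout {
        // `duration_since` fails if `last_activity` is in the future
        // (clock skew); treat that case as zero idle time.
        let idle = now
            .duration_since(session.last_activity)
            .unwrap_or(Duration::ZERO);
        if idle > limit {
            return SessionCheck::Idle;
        }
    }
    session.last_activity = now;
    SessionCheck::Valid
}
```

With only the `is_expired()` check from the diff, a session that has been idle past its inactivity limit but is still inside its absolute lifetime would be accepted; in this sketch it returns `Idle` instead.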
ui/src/api/openapi.json
Outdated
| "info": { | ||
| "title": "Hadrian Gateway API", | ||
| "description": "**Hadrian Gateway** is an AI Gateway providing a unified OpenAI-compatible API for routing requests to multiple LLM providers.\n\n## Overview\n\nThe gateway provides two main API surfaces:\n\n- **Public API** (`/api/v1/*`) - OpenAI-compatible endpoints for LLM inference. Use these endpoints to create chat completions, text completions, embeddings, and list available models. Requires API key authentication.\n\n- **Admin API** (`/admin/v1/*`) - RESTful management endpoints for multi-tenant configuration. Manage organizations, projects, users, API keys, dynamic providers, usage tracking, and model pricing.\n\n## Authentication\n\nThe gateway supports multiple authentication methods for API access.\n\n### API Key Authentication\n\nAPI keys are the primary authentication method for programmatic access. Keys are created via the Admin API and scoped to organizations, projects, or users.\n\n**Using the Authorization header (recommended):**\n```\nAuthorization: Bearer gw_live_abc123def456...\n```\n\n**Using the X-API-Key header:**\n```\nX-API-Key: gw_live_abc123def456...\n```\n\nBoth headers are supported. 
The `Authorization: Bearer` format is recommended for compatibility with OpenAI client libraries.\n\n**Example request:**\n```bash\ncurl https://gateway.example.com/api/v1/chat/completions \\\n -H \\\"Authorization: Bearer gw_live_abc123def456...\\\" \\\n -H \\\"Content-Type: application/json\\\" \\\n -d '{\\\"model\\\": \\\"openai/gpt-4\\\", \\\"messages\\\": [{\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Hello\\\"}]}'\n```\n\n### JWT Authentication\n\nWhen JWT authentication is enabled, requests can be authenticated using a JWT token from your identity provider.\n\n```\nAuthorization: Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...\n```\n\nThe gateway validates the JWT against the configured JWKS endpoint and extracts the identity from the token claims.\n\n**Example request:**\n```bash\ncurl https://gateway.example.com/api/v1/chat/completions \\\n -H \\\"Authorization: Bearer eyJhbGciOiJSUzI1NiIs...\\\" \\\n -H \\\"Content-Type: application/json\\\" \\\n -d '{\\\"model\\\": \\\"openai/gpt-4\\\", \\\"messages\\\": [{\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Hello\\\"}]}'\n```\n\n### Multi-Auth Mode\n\nWhen configured for multi-auth, the gateway accepts both API keys and JWTs using **format-based detection**:\n\n- **X-API-Key header**: Always validated as an API key\n- **Authorization: Bearer header**: Uses format-based detection:\n - Tokens starting with the configured API key prefix (default: `gw_`) are validated as API keys\n - All other tokens are validated as JWTs\n\n**Important:** Providing both `X-API-Key` and `Authorization` headers simultaneously results in a 400 error (ambiguous credentials). 
Choose one authentication method per request.\n\n**Examples:**\n```bash\n# API key in X-API-Key header\ncurl -H \\\"X-API-Key: gw_live_abc123...\\\" https://gateway.example.com/v1/chat/completions\n\n# API key in Authorization: Bearer header (format-based detection)\ncurl -H \\\"Authorization: Bearer gw_live_abc123...\\\" https://gateway.example.com/v1/chat/completions\n\n# JWT in Authorization: Bearer header\ncurl -H \\\"Authorization: Bearer eyJhbGciOiJSUzI1NiIs...\\\" https://gateway.example.com/v1/chat/completions\n```\n\n### Authentication Errors\n\n| Error Code | HTTP Status | Description | Example Response |\n|------------|-------------|-------------|------------------|\n| `unauthorized` | 401 | No authentication credentials provided | `{\\\"error\\\": {\\\"code\\\": \\\"unauthorized\\\", \\\"message\\\": \\\"Authentication required\\\"}}` |\n| `ambiguous_credentials` | 400 | Both X-API-Key and Authorization headers provided | `{\\\"error\\\": {\\\"code\\\": \\\"ambiguous_credentials\\\", \\\"message\\\": \\\"Ambiguous credentials: provide either X-API-Key or Authorization header, not both\\\"}}` |\n| `invalid_api_key` | 401 | API key is invalid, malformed, or revoked | `{\\\"error\\\": {\\\"code\\\": \\\"invalid_api_key\\\", \\\"message\\\": \\\"Invalid API key\\\"}}` |\n| `not_authenticated` | 401 | JWT validation failed | `{\\\"error\\\": {\\\"code\\\": \\\"not_authenticated\\\", \\\"message\\\": \\\"Token validation failed\\\"}}` |\n| `forbidden` | 403 | Valid credentials but insufficient permissions | `{\\\"error\\\": {\\\"code\\\": \\\"forbidden\\\", \\\"message\\\": \\\"Insufficient permissions\\\"}}` |\n\n### Configuration Examples\n\n**API Key Authentication:**\n```toml\n[auth.gateway]\ntype = \\\"api_key\\\"\nheader_name = \\\"X-API-Key\\\" # Header to read API key from\nkey_prefix = \\\"gw_\\\" # Valid key prefix\ncache_ttl_secs = 60 # Cache key lookups for 60 seconds\n```\n\n**JWT Authentication:**\n```toml\n[auth.gateway]\ntype = 
\\\"jwt\\\"\nissuer = \\\"https://auth.example.com\\\"\naudience = \\\"gateway-api\\\"\njwks_url = \\\"https://auth.example.com/.well-known/jwks.json\\\"\nidentity_claim = \\\"sub\\\" # JWT claim for user identity\n```\n\n**Multi-Auth (both API key and JWT):**\n```toml\n[auth.gateway]\ntype = \\\"multi\\\"\n\n[auth.gateway.api_key]\nheader_name = \\\"X-API-Key\\\"\nkey_prefix = \\\"gw_\\\"\n\n[auth.gateway.jwt]\nissuer = \\\"https://auth.example.com\\\"\naudience = \\\"gateway-api\\\"\njwks_url = \\\"https://auth.example.com/.well-known/jwks.json\\\"\n```\n\n## Pagination\n\nAll Admin API list endpoints use **cursor-based pagination** for stable, performant navigation.\n\n**Query Parameters:**\n- `limit` (optional): Maximum records per page (default: 100, max: 1000)\n- `cursor` (optional): Opaque cursor from previous response's `next_cursor` or `prev_cursor`\n- `direction` (optional): `forward` (default) or `backward`\n\n**Response:**\n```json\n{\n \\\"data\\\": [...],\n \\\"pagination\\\": {\n \\\"limit\\\": 100,\n \\\"has_more\\\": true,\n \\\"next_cursor\\\": \\\"MTczMzU4MDgwMDAwMDphYmMxMjM0...\\\",\n \\\"prev_cursor\\\": null\n }\n}\n```\n\n## Model Routing\n\nModels can be addressed in several ways:\n\n- **Static routing**: `provider-name/model-name` routes to config-defined providers\n- **Dynamic routing**: `:org/{ORG}/{PROVIDER}/{MODEL}` routes to database-backed providers\n- **Default**: When no prefix is specified, routes to the default provider\n\n## Error Codes\n\nAll errors follow a consistent JSON format:\n\n```json\n{\n \\\"error\\\": {\n \\\"code\\\": \\\"error_code\\\",\n \\\"message\\\": \\\"Human-readable error message\\\",\n \\\"details\\\": { ... 
} // Optional additional context\n }\n}\n```\n\n### Authentication & Authorization Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `unauthorized` | 401 | Missing or invalid API key/token |\n| `invalid_api_key` | 401 | API key is invalid, expired, or revoked |\n| `forbidden` | 403 | Valid credentials but insufficient permissions |\n| `not_authenticated` | 401 | Authentication required for this operation |\n\n### Rate Limiting & Budget Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `rate_limit_exceeded` | 429 | Request rate limit exceeded. Check `Retry-After` header. |\n| `budget_exceeded` | 402 | Budget limit exceeded for the configured period. Details include `limit_cents`, `current_spend_cents`, and `period`. |\n| `cache_required` | 503 | Budget enforcement requires cache to be configured |\n\n### Request Validation Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `validation_error` | 400 | Request body validation failed |\n| `bad_request` | 400 | Malformed request |\n| `routing_error` | 400 | Model routing failed (invalid model string or provider not found) |\n| `not_found` | 404 | Requested resource not found |\n| `conflict` | 409 | Resource already exists or conflicts with existing state |\n\n### Provider & Gateway Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `provider_error` | 502 | Upstream LLM provider returned an error |\n| `request_failed` | 502 | Failed to communicate with upstream provider |\n| `circuit_breaker_open` | 503 | Provider circuit breaker is open due to repeated failures |\n| `response_read_error` | 500 | Failed to read provider response |\n| `response_builder` | 500 | Failed to build response from provider data |\n| `internal_error` | 500 | Internal server error |\n\n### Guardrails Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| 
`guardrails_blocked` | 400 | Content blocked by guardrails policy. Response includes `violations` array. |\n| `guardrails_timeout` | 504 | Guardrails evaluation timed out |\n| `guardrails_provider_error` | 502 | Error communicating with guardrails provider |\n| `guardrails_auth_error` | 502 | Authentication failed with guardrails provider |\n| `guardrails_rate_limited` | 429 | Guardrails provider rate limit exceeded |\n| `guardrails_config_error` | 500 | Invalid guardrails configuration |\n| `guardrails_parse_error` | 400 | Failed to parse content for guardrails evaluation |\n\n### Admin API Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `database_required` | 503 | Database not configured (required for admin operations) |\n| `services_required` | 503 | Required services not initialized |\n| `not_configured` | 503 | Required feature or service not configured |\n| `database_error` | 500 | Database operation failed |\n\n## Rate Limiting\n\nThe gateway implements multiple layers of rate limiting to protect against abuse and ensure fair usage.\n\n### Rate Limit Types\n\n| Type | Scope | Default | Description |\n|------|-------|---------|-------------|\n| **Requests per minute** | API Key | 60 | Maximum requests per minute per API key |\n| **Requests per day** | API Key | Unlimited | Optional daily request limit per API key |\n| **Tokens per minute** | API Key | 100,000 | Maximum tokens processed per minute |\n| **Tokens per day** | API Key | Unlimited | Optional daily token limit |\n| **Concurrent requests** | API Key | 10 | Maximum simultaneous in-flight requests |\n| **IP requests per minute** | IP Address | 120 | Rate limit for unauthenticated requests |\n\n### Rate Limit Headers\n\nAll API responses include rate limit information in HTTP headers.\n\n#### Request Rate Limit Headers\n\n| Header | Description | Example |\n|--------|-------------|---------|\n| `X-RateLimit-Limit` | Maximum requests allowed in the current window | 
`60` |\n| `X-RateLimit-Remaining` | Requests remaining in the current window | `45` |\n| `X-RateLimit-Reset` | Seconds until the rate limit window resets | `42` |\n\n#### Token Rate Limit Headers\n\n| Header | Description | Example |\n|--------|-------------|---------|\n| `X-TokenRateLimit-Limit` | Maximum tokens allowed per minute | `100000` |\n| `X-TokenRateLimit-Remaining` | Tokens remaining in the current minute | `85000` |\n| `X-TokenRateLimit-Used` | Tokens used in the current minute | `15000` |\n| `X-TokenRateLimit-Day-Limit` | Maximum tokens allowed per day (if configured) | `1000000` |\n| `X-TokenRateLimit-Day-Remaining` | Tokens remaining today (if configured) | `950000` |\n\n#### Rate Limit Exceeded Response\n\nWhen a rate limit is exceeded, the API returns HTTP 429 with:\n\n```json\n{\n \\\"error\\\": {\n \\\"code\\\": \\\"rate_limit_exceeded\\\",\n \\\"message\\\": \\\"Rate limit exceeded: 60 requests per minute\\\",\n \\\"details\\\": {\n \\\"limit\\\": 60,\n \\\"window\\\": \\\"minute\\\",\n \\\"retry_after_secs\\\": 42\n }\n }\n}\n```\n\nThe `Retry-After` header indicates seconds to wait before retrying:\n\n```\nHTTP/1.1 429 Too Many Requests\nRetry-After: 42\nX-RateLimit-Limit: 60\nX-RateLimit-Remaining: 0\nX-RateLimit-Reset: 42\n```\n\n### IP-Based Rate Limiting\n\nUnauthenticated requests (requests without a valid API key) are rate limited by IP address. This protects public endpoints like `/health` from abuse.\n\n- **Default:** 120 requests per minute per IP\n- **Client IP Detection:** Respects `X-Forwarded-For` and `X-Real-IP` headers when trusted proxies are configured\n- **Configuration:** Can be disabled or adjusted via `limits.rate_limits.ip_rate_limits` in config\n\n### Rate Limit Configuration\n\nRate limits are configured hierarchically:\n\n1. 
**Global defaults** (in `hadrian.toml`):\n```toml\n[limits.rate_limits]\nrequests_per_minute = 60\ntokens_per_minute = 100000\nconcurrent_requests = 10\n\n[limits.rate_limits.ip_rate_limits]\nenabled = true\nrequests_per_minute = 120\n```\n\n2. **Per-API key** limits can override global defaults (when creating API keys via Admin API)\n\n### Best Practices\n\n- **Implement exponential backoff**: When receiving 429 responses, wait the `Retry-After` duration before retrying\n- **Monitor rate limit headers**: Track `X-RateLimit-Remaining` to proactively throttle requests\n- **Use streaming for long responses**: Streaming responses don't hold connections during generation\n- **Batch requests when possible**: Combine multiple small requests into larger batches\n", | ||
| "description": "**Hadrian Gateway** is an AI Gateway providing a unified OpenAI-compatible API for routing requests to multiple LLM providers.\n\n## Overview\n\nThe gateway provides two main API surfaces:\n\n- **Public API** (`/api/v1/*`) - OpenAI-compatible endpoints for LLM inference. Use these endpoints to create chat completions, text completions, embeddings, and list available models. Requires API key authentication.\n\n- **Admin API** (`/admin/v1/*`) - RESTful management endpoints for multi-tenant configuration. Manage organizations, projects, users, API keys, dynamic providers, usage tracking, and model pricing.\n\n## Authentication\n\nThe gateway supports multiple authentication methods for API access.\n\n### API Key Authentication\n\nAPI keys are the primary authentication method for programmatic access. Keys are created via the Admin API and scoped to organizations, projects, or users.\n\n**Using the Authorization header (recommended):**\n```\nAuthorization: Bearer gw_live_abc123def456...\n```\n\n**Using the X-API-Key header:**\n```\nX-API-Key: gw_live_abc123def456...\n```\n\nBoth headers are supported. 
The `Authorization: Bearer` format is recommended for compatibility with OpenAI client libraries.\n\n**Example request:**\n```bash\ncurl https://gateway.example.com/api/v1/chat/completions \\\n -H \\\"Authorization: Bearer gw_live_abc123def456...\\\" \\\n -H \\\"Content-Type: application/json\\\" \\\n -d '{\\\"model\\\": \\\"openai/gpt-4\\\", \\\"messages\\\": [{\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Hello\\\"}]}'\n```\n\n### JWT Authentication\n\nWhen JWT authentication is enabled, requests can be authenticated using a JWT token from your identity provider.\n\n```\nAuthorization: Bearer eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...\n```\n\nThe gateway validates the JWT against the configured JWKS endpoint and extracts the identity from the token claims.\n\n**Example request:**\n```bash\ncurl https://gateway.example.com/api/v1/chat/completions \\\n -H \\\"Authorization: Bearer eyJhbGciOiJSUzI1NiIs...\\\" \\\n -H \\\"Content-Type: application/json\\\" \\\n -d '{\\\"model\\\": \\\"openai/gpt-4\\\", \\\"messages\\\": [{\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Hello\\\"}]}'\n```\n\n### Multi-Auth Mode\n\nWhen configured for multi-auth, the gateway accepts both API keys and JWTs using **format-based detection**:\n\n- **X-API-Key header**: Always validated as an API key\n- **Authorization: Bearer header**: Uses format-based detection:\n - Tokens starting with the configured API key prefix (default: `gw_`) are validated as API keys\n - All other tokens are validated as JWTs\n\n**Important:** Providing both `X-API-Key` and `Authorization` headers simultaneously results in a 400 error (ambiguous credentials). 
Choose one authentication method per request.\n\n**Examples:**\n```bash\n# API key in X-API-Key header\ncurl -H \\\"X-API-Key: gw_live_abc123...\\\" https://gateway.example.com/v1/chat/completions\n\n# API key in Authorization: Bearer header (format-based detection)\ncurl -H \\\"Authorization: Bearer gw_live_abc123...\\\" https://gateway.example.com/v1/chat/completions\n\n# JWT in Authorization: Bearer header\ncurl -H \\\"Authorization: Bearer eyJhbGciOiJSUzI1NiIs...\\\" https://gateway.example.com/v1/chat/completions\n```\n\n### Authentication Errors\n\n| Error Code | HTTP Status | Description | Example Response |\n|------------|-------------|-------------|------------------|\n| `unauthorized` | 401 | No authentication credentials provided | `{\\\"error\\\": {\\\"code\\\": \\\"unauthorized\\\", \\\"message\\\": \\\"Authentication required\\\"}}` |\n| `ambiguous_credentials` | 400 | Both X-API-Key and Authorization headers provided | `{\\\"error\\\": {\\\"code\\\": \\\"ambiguous_credentials\\\", \\\"message\\\": \\\"Ambiguous credentials: provide either X-API-Key or Authorization header, not both\\\"}}` |\n| `invalid_api_key` | 401 | API key is invalid, malformed, or revoked | `{\\\"error\\\": {\\\"code\\\": \\\"invalid_api_key\\\", \\\"message\\\": \\\"Invalid API key\\\"}}` |\n| `not_authenticated` | 401 | JWT validation failed | `{\\\"error\\\": {\\\"code\\\": \\\"not_authenticated\\\", \\\"message\\\": \\\"Token validation failed\\\"}}` |\n| `forbidden` | 403 | Valid credentials but insufficient permissions | `{\\\"error\\\": {\\\"code\\\": \\\"forbidden\\\", \\\"message\\\": \\\"Insufficient permissions\\\"}}` |\n\n### Configuration Examples\n\n**API Key Authentication:**\n```toml\n[auth.mode]\ntype = \\\"api_key\\\"\n\n[auth.api_key]\nheader_name = \\\"X-API-Key\\\" # Header to read API key from\nkey_prefix = \\\"gw_\\\" # Valid key prefix\ncache_ttl_secs = 60 # Cache key lookups for 60 seconds\n```\n\n**IdP Authentication (SSO + API keys + 
JWT):**\n```toml\n[auth.mode]\ntype = \\\"idp\\\"\n\n[auth.api_key]\nheader_name = \\\"X-API-Key\\\"\nkey_prefix = \\\"gw_\\\"\n\n[auth.session]\nsecure = true\n```\n\n**Identity-Aware Proxy (IAP):**\n```toml\n[auth.mode]\ntype = \\\"iap\\\"\nidentity_header = \\\"X-Forwarded-User\\\"\nemail_header = \\\"X-Forwarded-Email\\\"\n```\n\n## Pagination\n\nAll Admin API list endpoints use **cursor-based pagination** for stable, performant navigation.\n\n**Query Parameters:**\n- `limit` (optional): Maximum records per page (default: 100, max: 1000)\n- `cursor` (optional): Opaque cursor from previous response's `next_cursor` or `prev_cursor`\n- `direction` (optional): `forward` (default) or `backward`\n\n**Response:**\n```json\n{\n \\\"data\\\": [...],\n \\\"pagination\\\": {\n \\\"limit\\\": 100,\n \\\"has_more\\\": true,\n \\\"next_cursor\\\": \\\"MTczMzU4MDgwMDAwMDphYmMxMjM0...\\\",\n \\\"prev_cursor\\\": null\n }\n}\n```\n\n## Model Routing\n\nModels can be addressed in several ways:\n\n- **Static routing**: `provider-name/model-name` routes to config-defined providers\n- **Dynamic routing**: `:org/{ORG}/{PROVIDER}/{MODEL}` routes to database-backed providers\n- **Default**: When no prefix is specified, routes to the default provider\n\n## Error Codes\n\nAll errors follow a consistent JSON format:\n\n```json\n{\n \\\"error\\\": {\n \\\"code\\\": \\\"error_code\\\",\n \\\"message\\\": \\\"Human-readable error message\\\",\n \\\"details\\\": { ... 
} // Optional additional context\n }\n}\n```\n\n### Authentication & Authorization Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `unauthorized` | 401 | Missing or invalid API key/token |\n| `invalid_api_key` | 401 | API key is invalid, expired, or revoked |\n| `forbidden` | 403 | Valid credentials but insufficient permissions |\n| `not_authenticated` | 401 | Authentication required for this operation |\n\n### Rate Limiting & Budget Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `rate_limit_exceeded` | 429 | Request rate limit exceeded. Check `Retry-After` header. |\n| `budget_exceeded` | 402 | Budget limit exceeded for the configured period. Details include `limit_cents`, `current_spend_cents`, and `period`. |\n| `cache_required` | 503 | Budget enforcement requires cache to be configured |\n\n### Request Validation Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `validation_error` | 400 | Request body validation failed |\n| `bad_request` | 400 | Malformed request |\n| `routing_error` | 400 | Model routing failed (invalid model string or provider not found) |\n| `not_found` | 404 | Requested resource not found |\n| `conflict` | 409 | Resource already exists or conflicts with existing state |\n\n### Provider & Gateway Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `provider_error` | 502 | Upstream LLM provider returned an error |\n| `request_failed` | 502 | Failed to communicate with upstream provider |\n| `circuit_breaker_open` | 503 | Provider circuit breaker is open due to repeated failures |\n| `response_read_error` | 500 | Failed to read provider response |\n| `response_builder` | 500 | Failed to build response from provider data |\n| `internal_error` | 500 | Internal server error |\n\n### Guardrails Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| 
`guardrails_blocked` | 400 | Content blocked by guardrails policy. Response includes `violations` array. |\n| `guardrails_timeout` | 504 | Guardrails evaluation timed out |\n| `guardrails_provider_error` | 502 | Error communicating with guardrails provider |\n| `guardrails_auth_error` | 502 | Authentication failed with guardrails provider |\n| `guardrails_rate_limited` | 429 | Guardrails provider rate limit exceeded |\n| `guardrails_config_error` | 500 | Invalid guardrails configuration |\n| `guardrails_parse_error` | 400 | Failed to parse content for guardrails evaluation |\n\n### Admin API Errors\n\n| Code | HTTP Status | Description |\n|------|-------------|-------------|\n| `database_required` | 503 | Database not configured (required for admin operations) |\n| `services_required` | 503 | Required services not initialized |\n| `not_configured` | 503 | Required feature or service not configured |\n| `database_error` | 500 | Database operation failed |\n\n## Rate Limiting\n\nThe gateway implements multiple layers of rate limiting to protect against abuse and ensure fair usage.\n\n### Rate Limit Types\n\n| Type | Scope | Default | Description |\n|------|-------|---------|-------------|\n| **Requests per minute** | API Key | 60 | Maximum requests per minute per API key |\n| **Requests per day** | API Key | Unlimited | Optional daily request limit per API key |\n| **Tokens per minute** | API Key | 100,000 | Maximum tokens processed per minute |\n| **Tokens per day** | API Key | Unlimited | Optional daily token limit |\n| **Concurrent requests** | API Key | 10 | Maximum simultaneous in-flight requests |\n| **IP requests per minute** | IP Address | 120 | Rate limit for unauthenticated requests |\n\n### Rate Limit Headers\n\nAll API responses include rate limit information in HTTP headers.\n\n#### Request Rate Limit Headers\n\n| Header | Description | Example |\n|--------|-------------|---------|\n| `X-RateLimit-Limit` | Maximum requests allowed in the current window | 
`60` |\n| `X-RateLimit-Remaining` | Requests remaining in the current window | `45` |\n| `X-RateLimit-Reset` | Seconds until the rate limit window resets | `42` |\n\n#### Token Rate Limit Headers\n\n| Header | Description | Example |\n|--------|-------------|---------|\n| `X-TokenRateLimit-Limit` | Maximum tokens allowed per minute | `100000` |\n| `X-TokenRateLimit-Remaining` | Tokens remaining in the current minute | `85000` |\n| `X-TokenRateLimit-Used` | Tokens used in the current minute | `15000` |\n| `X-TokenRateLimit-Day-Limit` | Maximum tokens allowed per day (if configured) | `1000000` |\n| `X-TokenRateLimit-Day-Remaining` | Tokens remaining today (if configured) | `950000` |\n\n#### Rate Limit Exceeded Response\n\nWhen a rate limit is exceeded, the API returns HTTP 429 with:\n\n```json\n{\n \\\"error\\\": {\n \\\"code\\\": \\\"rate_limit_exceeded\\\",\n \\\"message\\\": \\\"Rate limit exceeded: 60 requests per minute\\\",\n \\\"details\\\": {\n \\\"limit\\\": 60,\n \\\"window\\\": \\\"minute\\\",\n \\\"retry_after_secs\\\": 42\n }\n }\n}\n```\n\nThe `Retry-After` header indicates seconds to wait before retrying:\n\n```\nHTTP/1.1 429 Too Many Requests\nRetry-After: 42\nX-RateLimit-Limit: 60\nX-RateLimit-Remaining: 0\nX-RateLimit-Reset: 42\n```\n\n### IP-Based Rate Limiting\n\nUnauthenticated requests (requests without a valid API key) are rate limited by IP address. This protects public endpoints like `/health` from abuse.\n\n- **Default:** 120 requests per minute per IP\n- **Client IP Detection:** Respects `X-Forwarded-For` and `X-Real-IP` headers when trusted proxies are configured\n- **Configuration:** Can be disabled or adjusted via `limits.rate_limits.ip_rate_limits` in config\n\n### Rate Limit Configuration\n\nRate limits are configured hierarchically:\n\n1. 
**Global defaults** (in `hadrian.toml`):\n```toml\n[limits.rate_limits]\nrequests_per_minute = 60\ntokens_per_minute = 100000\nconcurrent_requests = 10\n\n[limits.rate_limits.ip_rate_limits]\nenabled = true\nrequests_per_minute = 120\n```\n\n2. **Per-API key** limits can override global defaults (when creating API keys via Admin API)\n\n### Best Practices\n\n- **Implement exponential backoff**: When receiving 429 responses, wait the `Retry-After` duration before retrying\n- **Monitor rate limit headers**: Track `X-RateLimit-Remaining` to proactively throttle requests\n- **Use streaming for long responses**: Streaming responses don't hold connections during generation\n- **Batch requests when possible**: Combine multiple small requests into larger batches\n", |
The OpenAPI description still states that the Public API “Requires API key authentication”, but this PR introduces multiple auth modes (none, idp, iap) where API keys may be optional or JWT/session/proxy headers can authenticate. The description should be updated to reflect the new auth.mode behavior so client developers don’t get misleading guidance.
src/middleware/combined.rs
| AuthMode::None => { | ||
| // Optional auth: try API key if header present, don't require it | ||
| let api_key = try_api_key_auth(headers, state).await?; | ||
| let identity = try_identity_auth(headers, connecting_ip, state).await?; | ||
| let kind = match (api_key, identity) { | ||
| (Some(api_key), Some(identity)) => IdentityKind::Both { | ||
| api_key: Box::new(api_key), | ||
| identity, | ||
| }, | ||
| (Some(api_key), None) => IdentityKind::ApiKey(api_key), | ||
| (None, Some(identity)) => IdentityKind::Identity(identity), | ||
| (None, None) => return Err(AuthError::MissingCredentials), | ||
| }; |
In try_authenticate, the AuthMode::None branch is documented as “optional auth (…don’t require it)”, but it returns Err(MissingCredentials) when no credentials are present. This makes it easy for callers to accidentally treat none mode as requiring auth (or to implement brittle “ignore MissingCredentials” logic). Consider changing the return type to Result<Option<AuthenticatedRequest>, AuthError> (return Ok(None) for truly unauthenticated requests), or at least aligning the docs/behavior so AuthMode::None doesn’t signal an error for the no-credentials case.
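A minimal sketch of the suggested `Result<Option<AuthenticatedRequest>, AuthError>` shape follows. The enums, the `validate_key` helper, and the simplified two-argument signature are hypothetical stand-ins for illustration, not the PR's actual types; only the `gw_` key prefix is taken from the config excerpts in this PR.

```rust
// Hypothetical stand-ins for the PR's types; names and fields are assumptions.
#[derive(Debug, PartialEq)]
enum AuthMode {
    None,
    ApiKey,
}

#[derive(Debug, PartialEq)]
enum AuthError {
    MissingCredentials,
    InvalidCredentials,
}

#[derive(Debug, PartialEq)]
struct AuthenticatedRequest {
    subject: String,
}

// Toy validation (assumption): the real gateway would look the key up in its store.
fn validate_key(key: &str) -> Result<AuthenticatedRequest, AuthError> {
    if key.starts_with("gw_") {
        Ok(AuthenticatedRequest { subject: key.to_string() })
    } else {
        Err(AuthError::InvalidCredentials)
    }
}

// Returns Ok(None) for an anonymous request in `none` mode, so the
// no-credentials case is a normal value rather than an error.
fn try_authenticate(
    mode: &AuthMode,
    api_key: Option<&str>,
) -> Result<Option<AuthenticatedRequest>, AuthError> {
    match (mode, api_key) {
        // Credentials that are present are always validated, in every mode.
        (_, Some(key)) => validate_key(key).map(Some),
        // `none` mode without credentials: anonymous, not an error.
        (AuthMode::None, None) => Ok(None),
        // Modes that require auth still report missing credentials.
        (AuthMode::ApiKey, None) => Err(AuthError::MissingCredentials),
    }
}
```

With this shape, callers no longer need to special-case `MissingCredentials` to detect anonymous access: `Ok(None)` means anonymous, and any `Err` is a real rejection.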
Greptile Summary
This PR consolidates authentication from separate gateway and admin configurations into a unified mode system with four clear options:
Key Changes
Testing Recommendations
Confidence Score: 4/5
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
Start[Incoming Request] --> CheckMode{Auth Mode?}
CheckMode -->|None| CheckCreds{Has Credentials?}
CheckCreds -->|No| AllowAnon[Allow Anonymous]
CheckCreds -->|Yes| ValidateOptional[Validate API Key]
ValidateOptional -->|Valid| AuthSuccess[Authenticated]
ValidateOptional -->|Invalid| AuthFail[401 Unauthorized]
CheckMode -->|ApiKey| ValidateKey[Validate API Key Required]
ValidateKey -->|Valid| AuthSuccess
ValidateKey -->|Invalid/Missing| AuthFail
CheckMode -->|Idp| CheckDual{Both X-API-Key<br/>and Authorization?}
CheckDual -->|Yes| Ambiguous[400 Ambiguous Credentials]
CheckDual -->|No| TrySession[Try Session Cookie]
TrySession -->|Valid| AuthSuccess
TrySession -->|None| TryApiKeyOrJWT[Try API Key or JWT]
TryApiKeyOrJWT -->|API Key Valid| AuthSuccess
TryApiKeyOrJWT -->|JWT Valid| AuthSuccess
TryApiKeyOrJWT -->|None/Invalid| AuthFail
CheckMode -->|Iap| CheckProxy{From Trusted<br/>Proxy?}
CheckProxy -->|No| ProxyFail[403 Forbidden]
CheckProxy -->|Yes| TryIapHeaders[Try Identity Headers]
TryIapHeaders -->|Valid| AuthSuccess
TryIapHeaders -->|None| TryIapKey[Try API Key]
TryIapKey -->|Valid| AuthSuccess
TryIapKey -->|None/Invalid| AuthFail
Last reviewed commit: 68cccf9
Pull request overview
Copilot reviewed 55 out of 56 changed files in this pull request and generated 4 comments.
Files not reviewed (1)
- ui/pnpm-lock.yaml: Language not supported
| // Link-local (169.254.0.0/16) — blocked unless allow_private | ||
| // (cloud metadata 169.254.169.254 is always blocked above) | ||
| if v4.is_link_local() { | ||
| return true; | ||
| return !opts.allow_private; | ||
| } | ||
| // Cloud metadata endpoint (169.254.169.254) | ||
| if v4 == Ipv4Addr::new(169, 254, 169, 254) { | ||
| return true; // Always block, even if allow_loopback | ||
| // Private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) | ||
| if v4.is_private() { | ||
| return !opts.allow_private; |
allow_private currently also permits IPv4 link-local addresses (169.254.0.0/16) except for the single metadata IP. Link-local is commonly blocked in SSRF defenses because it can still reach sensitive host/network services; folding it into allow_private makes it easy to over-relax SSRF protections unintentionally. Consider keeping link-local blocked by default even when allow_private is true, or introducing a separate allow_link_local flag with explicit docs and config naming.
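One way the separate-flag option could look is sketched here. The `allow_link_local` field, the `is_blocked_ipv4` function name, and the struct layout are assumptions for illustration; only `allow_private` and the always-blocked metadata address come from the diff. Loopback and IPv6 handling are omitted to keep the sketch short.

```rust
use std::net::Ipv4Addr;

// Hypothetical options struct; only `allow_private` appears in the PR's diff.
#[derive(Debug, Clone, Copy, Default)]
struct UrlValidationOptions {
    allow_private: bool,
    // Assumed extra flag following the review suggestion; not part of the PR.
    allow_link_local: bool,
}

// Cloud metadata endpoint, blocked regardless of options.
const CLOUD_METADATA: Ipv4Addr = Ipv4Addr::new(169, 254, 169, 254);

// Returns true when the address should be rejected.
fn is_blocked_ipv4(v4: Ipv4Addr, opts: UrlValidationOptions) -> bool {
    // Checked first so that no flag can ever unblock it.
    if v4 == CLOUD_METADATA {
        return true;
    }
    // Link-local (169.254.0.0/16) is governed by its own flag,
    // so enabling `allow_private` does not relax it.
    if v4.is_link_local() {
        return !opts.allow_link_local;
    }
    // Private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).
    if v4.is_private() {
        return !opts.allow_private;
    }
    false
}
```

Under this split, an operator who sets `allow_private = true` to reach an internal identity provider still gets link-local blocked unless the second flag is enabled explicitly.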
| let has_credentials = headers | ||
| .contains_key(state.config.auth.api_key_config().header_name.as_str()) | ||
| || headers.contains_key(axum::http::header::AUTHORIZATION); | ||
| let auth_result = if !state.config.auth.is_auth_enabled() && !has_credentials { | ||
| Err(AuthError::MissingCredentials) | ||
| } else { | ||
| try_authenticate(&headers, cookies.as_ref(), connecting_ip, &state).await |
has_credentials checks the configured API key header name, but the later “credentials were provided but invalid” branch only checks literal "X-API-Key" / "Authorization". If a deployment uses a custom header (e.g. "Api-Key"), invalid credentials can be treated as “no credentials” and the request will incorrectly proceed anonymously. Use the configured header name consistently (and/or reuse has_credentials) when deciding whether to reject vs allow anonymous access.
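The consistency fix can be illustrated with a small sketch that routes both decisions through one helper. Headers are modelled as plain `(name, value)` pairs rather than the real header map, and `Outcome` and `on_auth_failure` are hypothetical names introduced only for this example.

```rust
// Returns true when the request carries the configured API key header or an
// Authorization header. HTTP header names are compared case-insensitively.
fn has_credentials(headers: &[(&str, &str)], api_key_header: &str) -> bool {
    headers.iter().any(|(name, _)| {
        name.eq_ignore_ascii_case(api_key_header)
            || name.eq_ignore_ascii_case("authorization")
    })
}

// Hypothetical outcome type for the "auth failed while auth is optional" branch.
#[derive(Debug, PartialEq)]
enum Outcome {
    Anonymous,
    Rejected,
}

// Decides what to do after authentication failed in a mode where auth is
// optional. It reuses `has_credentials` with the configured header name, so a
// custom header such as "Api-Key" is treated the same as "X-API-Key".
fn on_auth_failure(headers: &[(&str, &str)], api_key_header: &str) -> Outcome {
    if has_credentials(headers, api_key_header) {
        // Credentials were supplied but did not validate: reject.
        Outcome::Rejected
    } else {
        // Nothing was supplied: proceed anonymously.
        Outcome::Anonymous
    }
}
```

Because the reject-or-allow decision depends on the same predicate that gated the authentication attempt, an invalid key sent under a custom header name cannot fall through to anonymous access.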
| type = "none" | ||
|
|
This production configuration sets auth.mode.type to "none", which disables all authentication for both API and admin endpoints. If an operator uses this file as-is, the gateway will run in production with completely unauthenticated access, allowing any unauthenticated user to invoke models and access admin functionality. Change auth.mode.type to a secure mode such as "api_key", "idp", or "iap" by default, and reserve "none" strictly for clearly-labeled development-only configs.
| type = "none" | |
| type = "api_key" | |
| # Note: Do NOT use auth.mode.type = "none" in production. Reserve it for clearly-marked | |
| # development-only configs if you need to disable authentication locally. |
| key_prefix = "gw_" | ||
| cache_ttl_secs = 300 | ||
| [auth.mode] | ||
| type = "none" |
This PostgreSQL HA production configuration sets auth.mode.type to "none", fully disabling authentication. Deploying this file as-is would expose all gateway and admin endpoints without any access control, enabling unauthorized access to data and model operations. Update the default to a secure mode like "api_key", "idp", or "iap", and ensure "none" is only used in explicitly development/test configurations.
| type = "none" | |
| type = "api_key" |
Re-work the authentication to combine gateway and control plane endpoints, reduce options to 4:
idiosyncrasies