-
-
Notifications
You must be signed in to change notification settings - Fork 17
implementation(PR-98): model aliases #99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughAdds model-alias support: config field and docs, an AliasResolver to map aliases to actual models and discover endpoints, propagation of endpoint→model maps in request context, and request-body rewriting of the top-level "model" field before proxying to backend endpoints. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Handler as Application/Handler
participant Resolver as AliasResolver
participant Proxy as ProxyService
participant Rewriter as core.RewriteModelForAlias
participant Backend as ModelEndpoint
Client->>Handler: POST /v1/chat/completions (model: alias)
Handler->>Resolver: ResolveEndpoints(alias)
Resolver-->>Handler: map(endpointURL -> actualModel)
Handler->>Proxy: forward request (with alias-map in context)
Proxy->>Rewriter: RewriteModelForAlias(ctx, req, endpoint)
rect rgba(200,230,255,0.5)
Rewriter-->>Proxy: modified request body (model: actualModel)
end
Proxy->>Backend: proxied request with actualModel
Backend-->>Proxy: response
Proxy-->>Client: response
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested labels
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@internal/adapter/proxy/core/model_rewrite_test.go`:
- Around line 247-267: The test TestRewriteModelForAlias_BodyRereadable
overpromises by only reading the request body once after calling
RewriteModelForAlias; either rename the test to indicate it only verifies the
body is readable after rewrite (e.g.,
TestRewriteModelForAlias_BodyReadableAfterRewrite) or actually perform a second
read by resetting the body position (use the bytes.Reader created in
RewriteModelForAlias via io.NopCloser(bytes.NewReader(...)) — obtain a new
bytes.Reader or type-assert to enable Seek or re-wrap the same bytes into a new
io.NopCloser before the second read) so the test truly verifies multiple reads.
Ensure you update the test name or add the second read and assertions
accordingly while referencing TestRewriteModelForAlias_BodyRereadable and
RewriteModelForAlias.
🧹 Nitpick comments (4)
internal/config/types.go (1)
24-35: Consider adding validation forModelAliasesentries.Currently there's no config-level validation — an alias with an empty model list (e.g.,
gpt-oss-120b: []) or an empty alias key would silently pass through. AValidate()method onConfig(or within the existing validation flow) could catch misconfigurations early and provide clear error messages at startup.internal/app/handlers/handler_proxy.go (1)
406-455: Alias resolution routing logic looks correct and well-structured.The fallback to standard routing when alias resolution yields no endpoints (lines 415-429) is a sensible resilience pattern. The method correctly filters candidates against resolved endpoints and stores the rewrite map for downstream use. Comments use appropriate Australian English ("recognises").
One minor note: on line 416, when
erris nil butendpointToModelis empty, the log message includes"error", errwhich will emit"error", <nil>. This is cosmetic but slightly noisy.Optional: omit error from log when nil
- if err != nil || len(endpointToModel) == 0 { - logger.Warn("Model alias resolved to no endpoints, falling back to standard routing", - "alias", aliasName, - "error", err) + if err != nil || len(endpointToModel) == 0 { + logFields := []any{"alias", aliasName} + if err != nil { + logFields = append(logFields, "error", err) + } + logger.Warn("Model alias resolved to no endpoints, falling back to standard routing", logFields...)internal/adapter/proxy/core/model_rewrite.go (1)
56-83: JSON re-serialisation reorders keys and strips formatting.
rewriteModelFieldunmarshals intomap[string]json.RawMessageand re-marshals, which produces compact JSON with keys in an undefined order. This is functionally correct for the model field rewrite, but it does alter the byte-level representation of the entire request body. If any downstream component or backend is sensitive to key ordering or whitespace, this could cause subtle issues.This is likely acceptable given that all consumers are JSON parsers, but worth noting.
internal/adapter/registry/alias.go (1)
13-19: Remove the unusedreverseIndexfield fromAliasResolver.The
reverseIndexfield is populated in the constructor but never read by any method in the type. None of the methods (IsAlias,GetActualModels,ResolveEndpoints) reference it—they all use only thealiasesmap. Removing this field will reduce the struct size and eliminate dead code that serves no purpose.
| // Ensure the http.Request's Body can be read multiple times after rewrite | ||
| func TestRewriteModelForAlias_BodyRereadable(t *testing.T) { | ||
| aliasMap := map[string]string{ | ||
| "http://ollama:11434": "rewritten-model", | ||
| } | ||
| ctx := context.WithValue(context.Background(), constants.ContextModelAliasMapKey, aliasMap) | ||
| body := `{"model": "original-model"}` | ||
| r, _ := http.NewRequestWithContext(ctx, "POST", "/v1/chat/completions", bytes.NewBufferString(body)) | ||
| r.ContentLength = int64(len(body)) | ||
|
|
||
| endpoint := &domain.Endpoint{URLString: "http://ollama:11434"} | ||
| RewriteModelForAlias(ctx, r, endpoint) | ||
|
|
||
| // Read body first time | ||
| body1, err := io.ReadAll(r.Body) | ||
| if err != nil { | ||
| t.Fatalf("first read failed: %v", err) | ||
| } | ||
| if len(body1) == 0 { | ||
| t.Error("first read should return non-empty body") | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test name overpromises — body is only read once.
TestRewriteModelForAlias_BodyRereadable claims to verify the body can be read multiple times, but it only performs a single read. Since RewriteModelForAlias sets the body to io.NopCloser(bytes.NewReader(...)), a bytes.Reader does support Seek, but it's not exposed through the io.ReadCloser interface. Either rename the test to reflect what it actually checks (e.g., _BodyReadableAfterRewrite) or add a second read after seeking.
🤖 Prompt for AI Agents
In `@internal/adapter/proxy/core/model_rewrite_test.go` around lines 247 - 267,
The test TestRewriteModelForAlias_BodyRereadable overpromises by only reading
the request body once after calling RewriteModelForAlias; either rename the test
to indicate it only verifies the body is readable after rewrite (e.g.,
TestRewriteModelForAlias_BodyReadableAfterRewrite) or actually perform a second
read by resetting the body position (use the bytes.Reader created in
RewriteModelForAlias via io.NopCloser(bytes.NewReader(...)) — obtain a new
bytes.Reader or type-assert to enable Seek or re-wrap the same bytes into a new
io.NopCloser before the second read) so the test truly verifies multiple reads.
Ensure you update the test name or add the second read and assertions
accordingly while referencing TestRewriteModelForAlias_BodyRereadable and
RewriteModelForAlias.
- Rename BodyRereadable test to BodyReadableAfterRewrite (test only reads once) - Remove unused reverseIndex field from AliasResolver - Omit nil error from log when alias resolves to no endpoints
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@internal/app/handlers/handler_proxy.go`:
- Around line 406-458: The alias-success path in resolveAliasEndpoints doesn't
set profile.RoutingDecision, so downstream code (executeProxyRequest,
logRequestResult) lacks routing metadata; before returning aliasEndpoints,
synthesize and assign a lightweight domain.RoutingDecision to
profile.RoutingDecision (e.g., populate the selected endpoint/URL, a strategy or
reason like "alias", and any minimal fields used by logging/metrics), reusing
domain.RoutingDecision type and keys (and keep
profile.SetInspectionMeta(endpointToModel) as-is) so observability consumers
receive consistent routing info for alias-routed requests.
- Around line 435-440: The lookup inconsistency can cause silent mismatches:
update all map key uses and comparisons to consistently use the getter rather
than direct field access—replace any direct accesses to endpoint.URLString with
endpoint.GetURLString() (including where endpointToModel is populated and where
it's queried, e.g., the aliasEndpoints loop and the other occurrence around the
earlier reported URLString use) so both map keys and lookups use the same
normalized value provided by GetURLString().
🧹 Nitpick comments (1)
internal/adapter/proxy/core/model_rewrite_test.go (1)
154-199: Consider adding a table case where the new model name contains characters that need JSON escaping.Model names with characters like
"or\would exercise whetherrewriteModelFieldproduces valid JSON after substitution. Not blocking, just a hardening suggestion.
| var aliasEndpoints []*domain.Endpoint | ||
| for _, endpoint := range candidates { | ||
| if _, ok := endpointToModel[endpoint.GetURLString()]; ok { | ||
| aliasEndpoints = append(aliasEndpoints, endpoint) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if GetURLString() simply returns URLString or does something else
ast-grep --pattern $'func ($_ *Endpoint) GetURLString() string {
$$$
}'Repository: thushan/olla
Length of output: 228
🏁 Script executed:
#!/bin/bash
# Find ResolveEndpoints function and how endpointToModel is built
rg -A 30 "ResolveEndpoints" internal/app/handlers/handler_proxy.go | head -80Repository: thushan/olla
Length of output: 1170
🏁 Script executed:
#!/bin/bash
# Find the ResolveEndpoints method implementation
rg -B 5 "func.*ResolveEndpoints.*{" --type go | head -50Repository: thushan/olla
Length of output: 169
🏁 Script executed:
#!/bin/bash
# Find ResolveEndpoints implementation
rg "func.*ResolveEndpoints" --type go -A 20 | head -100Repository: thushan/olla
Length of output: 5997
🏁 Script executed:
#!/bin/bash
# Get full ResolveEndpoints implementation
sed -n '1,500p' internal/adapter/registry/alias.go | grep -A 50 "func.*ResolveEndpoints"Repository: thushan/olla
Length of output: 1209
🏁 Script executed:
#!/bin/bash
# Find GetEndpointsForModel implementation
rg "func.*GetEndpointsForModel" --type go -A 15Repository: thushan/olla
Length of output: 15046
🏁 Script executed:
#!/bin/bash
# Get more context from MemoryModelRegistry
rg -A 30 "func.*MemoryModelRegistry.*GetEndpointsForModel" internal/adapter/registry/memory_registry.goRepository: thushan/olla
Length of output: 890
🏁 Script executed:
#!/bin/bash
# Find RegisterModel implementation
rg -A 40 "func.*MemoryModelRegistry.*RegisterModel" internal/adapter/registry/memory_registry.go | head -60Repository: thushan/olla
Length of output: 2041
🏁 Script executed:
#!/bin/bash
# Get full RegisterModel function
sed -n '/func.*MemoryModelRegistry.*RegisterModel/,/^func/p' internal/adapter/registry/memory_registry.go | head -100Repository: thushan/olla
Length of output: 1859
🏁 Script executed:
#!/bin/bash
# Find calls to RegisterModel that might use endpoint.URLString or GetURLString
rg "RegisterModel.*endpoint\." --type go -B 2 -A 2 | head -50Repository: thushan/olla
Length of output: 919
🏁 Script executed:
#!/bin/bash
# Check line 566 and surrounding context in handler_proxy.go
sed -n '560,575p' internal/app/handlers/handler_proxy.goRepository: thushan/olla
Length of output: 550
🏁 Script executed:
#!/bin/bash
# Find Endpoint struct definition
rg -A 20 "type Endpoint struct" internal/core/domain/endpoint.goRepository: thushan/olla
Length of output: 739
🏁 Script executed:
#!/bin/bash
# Find where URLString is assigned
rg "URLString\s*=" internal/core/domain/endpoint.go -B 2 -A 1Repository: thushan/olla
Length of output: 38
🏁 Script executed:
#!/bin/bash
# Search for Endpoint construction
rg "URLString\s*=" --type go -B 2 -A 1 | grep -A 3 -B 3 "URLString"Repository: thushan/olla
Length of output: 2248
🏁 Script executed:
#!/bin/bash
# Search for Endpoint construction
rg "&domain.Endpoint\{" --type go -A 10 | head -100Repository: thushan/olla
Length of output: 6932
🏁 Script executed:
#!/bin/bash
# Search for modifications to endpoint.URL
rg "endpoint\.URL\s*=" --type go -B 2 -A 1Repository: thushan/olla
Length of output: 459
Standardise URL lookups to prevent silent endpoint mismatches.
The code currently uses both endpoint.GetURLString() and endpoint.URLString inconsistently—some places call the getter method (line 437 here) whilst others access the field directly (line 566). Whilst GetURLString() presently returns the field unchanged, this pattern is fragile. If the getter is later modified to apply normalisation (e.g., trailing slash handling, scheme normalisation), the lookup into endpointToModel would silently miss endpoints. Ensure consistent use of one approach throughout the codebase, preferably via the getter method for maintainability.
🤖 Prompt for AI Agents
In `@internal/app/handlers/handler_proxy.go` around lines 435 - 440, The lookup
inconsistency can cause silent mismatches: update all map key uses and
comparisons to consistently use the getter rather than direct field
access—replace any direct accesses to endpoint.URLString with
endpoint.GetURLString() (including where endpointToModel is populated and where
it's queried, e.g., the aliasEndpoints loop and the other occurrence around the
earlier reported URLString use) so both map keys and lookups use the same
normalized value provided by GetURLString().
- Set RoutingDecision on alias happy path for response headers/metrics - Standardise URL lookups: endpoint.URLString → endpoint.GetURLString()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@docs/content/concepts/model-aliases.md`:
- Around line 193-198: Replace the inaccurate troubleshooting bullet "Endpoint
URL in the alias mapping doesn't match the discovered endpoint" with wording
that reflects alias → model-name mapping (e.g., "The actual model name in the
alias list doesn't exactly match the model name reported by the backend") so
readers understand aliases map names to models rather than endpoint URLs; update
the sentence in the model-aliases.md section where that bullet appears.
- Around line 40-65: The fenced ASCII-art block starting with the line
containing Client request: "model": "my-llama" should include a language
specifier to satisfy markdownlint MD040; update the opening triple-backtick that
begins that diagram to ```text so the block becomes a text-fenced code block
(leave the diagram content unchanged and only modify the opening fence).
🧹 Nitpick comments (1)
internal/app/handlers/handler_proxy.go (1)
362-365: Comment describes "what" rather than "why".The coding guidelines ask comments to explain why rather than what. Consider something like:
// aliases need separate resolution because they map to multiple actual model names across backends.As per coding guidelines,
**/*.go: "Comment on why rather than what in code".✏️ Suggested comment rewording
- // check for model alias before standard routing + // aliases map one name to multiple backend-specific models, so they + // need dedicated resolution rather than the standard single-model path if a.aliasResolver != nil && a.aliasResolver.IsAlias(profile.ModelName) {
- Add language specifier to ASCII-art code block (```text) - Fix inaccurate troubleshooting cause: aliases map model names, not endpoint URLs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@docs/content/concepts/model-aliases.md`:
- Around line 196-197: Update the first bullet text in the list that currently
reads "Actual model name in the alias list doesn't exactly match the model name
reported by the backend" to include the missing article at the start so it reads
"The actual model name in the alias list doesn’t exactly match the model name
reported by the backend"; locate and edit the exact bullet string in
docs/content/concepts/model-aliases.md (the list entry "Actual model name in the
alias list doesn't exactly match the model name reported by the backend") and
replace it with the corrected sentence.
- Around line 32-34: Replace the phrase "the exact same model" with the more
concise "the same model" in the paragraph that reads "Without aliases, a client
request for `llama3.1:8b` would only match the Ollama endpoint — even though the
other backends have the exact same model." Update that sentence so it reads
"...even though the other backends have the same model," preserving the
surrounding wording and inline code `llama3.1:8b`.
| Without aliases, a client request for `llama3.1:8b` would only match the Ollama endpoint — even though the other backends have the exact same model. | ||
|
|
||
| **Model aliases** let you define a single virtual model name that maps to all of these variants, so any backend that has the model can serve the request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tighten wording (“exact same”).
Consider simplifying to “the same model” for concision.
🧰 Tools
🪛 LanguageTool
[style] ~32-~32: ‘exact same’ might be wordy. Consider a shorter alternative.
Context: ...even though the other backends have the exact same model. Model aliases let you defin...
(EN_WORDINESS_PREMIUM_EXACT_SAME)
🤖 Prompt for AI Agents
In `@docs/content/concepts/model-aliases.md` around lines 32 - 34, Replace the
phrase "the exact same model" with the more concise "the same model" in the
paragraph that reads "Without aliases, a client request for `llama3.1:8b` would
only match the Ollama endpoint — even though the other backends have the exact
same model." Update that sentence so it reads "...even though the other backends
have the same model," preserving the surrounding wording and inline code
`llama3.1:8b`.
| - Actual model name in the alias list doesn't exactly match the model name reported by the backend | ||
| - Request body is not JSON |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add missing article for readability.
Suggested: “The actual model name in the alias list doesn’t exactly match the model name reported by the backend.”
🧰 Tools
🪛 LanguageTool
[uncategorized] ~196-~196: Possible missing article found.
Context: ...al model name. Possible Causes: - Actual model name in the alias list doesn't ex...
(AI_HYDRA_LEO_MISSING_THE)
🤖 Prompt for AI Agents
In `@docs/content/concepts/model-aliases.md` around lines 196 - 197, Update the
first bullet text in the list that currently reads "Actual model name in the
alias list doesn't exactly match the model name reported by the backend" to
include the missing article at the start so it reads "The actual model name in
the alias list doesn’t exactly match the model name reported by the backend";
locate and edit the exact bullet string in
docs/content/concepts/model-aliases.md (the list entry "Actual model name in the
alias list doesn't exactly match the model name reported by the backend") and
replace it with the corrected sentence.
|
Will it at some point stop complaining? 😁 |
|
Great work on the PR, haven't had a chance for a full review yet (on holidays till tomorrow!). To make it clean, thinking of putting these in the models.yaml might be better/cleaner, thoughts? I'll review this tomorrow once I have a dev environment, thanks again, I'll make a fix for the CI in main you can rebase on tomorrow too (for |
|
ah, if you run better align on your branch and get it to fix it for you, and push, you'll be fine. Alternatively, there's the |
I'm feeling like the models.yaml is somewhat of an internal mapping config; e.g. I don't use it and only have the config.yaml mapped outside of my docker container. Moving the logic into the models.yaml would mean that everybody that wants to use it needs to copy the existing one from https://github.com/thushan/olla/blob/main/config/models.yaml and adjust it to their specific needs, right? Otherwise they would lose all "default" settings, aren't they?
I'll have a look this evening and get back to you :) |
Feature: Model Aliases
Related Issue
Closes #98
What This Does
Hey @thushan 👋
This PR implements the model alias feature requested in #98. When using multiple backends (Ollama, LM Studio, llamacpp, etc.), the same model often has different names across platforms. Model aliases let users define a single virtual name that maps to all the platform-specific variants, so any backend that has the model can serve the request — regardless of naming differences.
Example Configuration
In
config.yaml:When a request arrives with
"model": "gpt-oss-120b", olla will:"model"field in the request body to match what the selected backend actually expects (e.g.gpt-oss:120bfor Ollama)Self-referencing aliases work correctly
If the alias name is also a real model name on some backend, it still works:
The alias always takes priority during routing. The self-referencing entry ensures the endpoint serving
gpt-oss-120bnatively is included in the candidate pool, and the request body keepsgpt-oss-120bfor that backend (no unnecessary rewrite).What's NOT included
Per the discussion in the issue, the
only_alias: truesetting (to hide non-aliased models from/v1/models) is not part of this PR. Happy to add it as a follow-up if desired.How It Works (Architecture)
The implementation follows olla's hexagonal architecture and touches minimal surface area:
Config.ModelAliasesmap[string][]stringfield in config, parsed frommodel_aliasesin YAMLAliasResolverhandler_proxy.gofilterEndpointsByProfile), passes the rewrite map through contextcore.RewriteModelForAlias()"model"field in the JSON body right before the request is sent to the selected backendThe alias resolution hooks into the existing model routing pipeline without changing the
ModelRegistryinterface or any existing routing strategies.Files Changed
New files:
internal/adapter/registry/alias.go— AliasResolverinternal/adapter/registry/alias_test.go— Testsinternal/adapter/proxy/core/model_rewrite.go— Request body model rewritinginternal/adapter/proxy/core/model_rewrite_test.go— TestsModified files:
internal/config/types.go— AddedModelAliasesfield toConfiginternal/config/config_test.go— Config parsing tests for aliasesinternal/core/constants/context.go— Context key for alias rewrite mapinternal/app/handlers/application.go— Wires up the AliasResolverinternal/app/handlers/handler_proxy.go— Alias-aware endpoint filtering + context passinginternal/adapter/proxy/olla/service.go— Model rewrite call (legacy path)internal/adapter/proxy/olla/service_retry.go— Model rewrite call (retry path)internal/adapter/proxy/sherpa/service_retry.go— Model rewrite callconfig/config.yaml— Documented example (commented out)config.yaml— SameTesting
go test ./internal/...)model_aliasesgo build ./...,go vet ./internal/...)Summary by CodeRabbit
New Features
Documentation
Tests