Adjust/Improve pipeline to be compatible with more models #4

yuqiannemo wants to merge 13 commits into Xtra-Computing:dev from
Conversation
Thanks for working on broader model compatibility here. Supporting more HF model families is a good direction. That said, I don't think this is ready to merge as-is, because the dependency changes would make the base package less portable: moving these packages into the required dependencies forces every install to pull in heavy, platform-specific libraries.
I think the right direction is to make these heavy dependencies optional extras rather than required installs.
On the model-detection side, the prefix list could look like:

```python
openrouter_prefixes = [
    "openrouter/",
    "openrouter:",
    "anthropic/claude-",
    "deepseek/",
    "openai/gpt-3",
    "openai/gpt-4",
    "google/gemini-",
    "x-ai/grok-",
    "cohere/command",
    "perplexity/",
]
```

This way the detection stays self-contained rather than depending on an external lookup.

One smaller process note: the PR description looks incomplete.

Net: I like the goal of the PR, but the dependency strategy needs to change before this can be merged.
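Such a prefix list can drive detection with a simple `startswith` check. A minimal sketch (the function name is illustrative, not taken from the PR):

```python
# Minimal sketch of prefix-based OpenRouter detection.
# The prefixes mirror the list proposed above; the function name
# is a stand-in, not the PR's actual code.
OPENROUTER_PREFIXES = (
    "openrouter/",
    "openrouter:",
    "anthropic/claude-",
    "deepseek/",
    "openai/gpt-3",
    "openai/gpt-4",
    "google/gemini-",
    "x-ai/grok-",
    "cohere/command",
    "perplexity/",
)

def is_openrouter_model(model_name: str) -> bool:
    """Return True if the model should be routed through OpenRouter."""
    # str.startswith accepts a tuple of prefixes, so one call suffices.
    return model_name.startswith(OPENROUTER_PREFIXES)
```

Keeping the prefixes in a module-level tuple means no file I/O or network call is needed at detection time.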
Fixed
JerryLife left a comment:
Thanks for the PR! The optional extras and sensitive info protection are great additions. A few issues to address before this is merge-ready:
Correctness

- **Hardcoded relative paths will break when installed as a package**

  `ModelLoader.py` uses `Path(__file__).parents[3] / "configs" / "openrouter_llm_list.jsonl"`, which assumes a source-tree layout. When installed via `pip install llm-dna`, the `configs/` directory won't exist at that relative path. Consider using `importlib.resources` or including the file as package data.

  Same issue in `cli.py` with `Path(__file__).parents[2] / ".env"`: this path won't resolve correctly in an installed package. The `.env` loading should probably just use `load_dotenv()` (which searches the CWD and parent dirs) without the hardcoded project-root path.
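  For the JSONL lookup, an `importlib.resources`-based loader could look like this sketch (the `llm_dna.configs` package path is an assumption about the eventual layout, not the current tree):

  ```python
  import json
  from importlib import resources

  def load_openrouter_model_list() -> list[dict]:
      # Assumes the JSONL ships as package data in an importable
      # "llm_dna.configs" subpackage -- the package path is hypothetical.
      text = (
          resources.files("llm_dna.configs")
          .joinpath("openrouter_llm_list.jsonl")
          .read_text(encoding="utf-8")
      )
      return [json.loads(line) for line in text.splitlines() if line.strip()]
  ```

  Unlike `Path(__file__).parents[...]`, this resolves correctly for both source checkouts and installed wheels.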
- **Missing `openrouter_llm_list.jsonl`**

  The diff references `configs/openrouter_llm_list.jsonl`, but the file doesn't appear in the diff. Is it committed on the branch? If so, it also needs to be included as package data in `pyproject.toml` to be available at install time.
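  If the file is committed, the package-data declaration could look like this fragment (assuming a setuptools build backend and that `configs/` lives inside the `llm_dna` package; both are assumptions about the layout):

  ```toml
  # Hypothetical pyproject.toml fragment; assumes setuptools and that the
  # JSONL ends up inside the installed package at llm_dna/configs/.
  [tool.setuptools.package-data]
  "llm_dna.configs" = ["openrouter_llm_list.jsonl"]
  ```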
- **Redundant inference in single-model response caching (`api.py`)**

  The new block at L645-672 calls `_generate_responses_for_model()` again even though inference was already performed above for DNA extraction. This effectively doubles the computation cost in single-model mode. The responses from the earlier inference step should be reused instead of regenerated.
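  The fix can be sketched with stand-in functions (every name here is illustrative, not the actual `api.py` internals): run inference once, then feed the same response list to both DNA extraction and the cache.

  ```python
  # Illustrative sketch of reusing one inference pass for both DNA
  # extraction and the response cache; all names are stand-ins.
  CALLS = {"inference": 0}

  def generate_responses_for_model(model: str) -> list[str]:
      CALLS["inference"] += 1          # stand-in for the expensive inference call
      return [f"{model}-response-{i}" for i in range(3)]

  def extract_dna(responses: list[str]) -> str:
      return "|".join(responses)       # stand-in for DNA extraction

  def run_single_model(model: str) -> tuple[str, list[str]]:
      responses = generate_responses_for_model(model)  # inference runs once
      dna = extract_dna(responses)                     # reuse for DNA extraction
      cached_responses = responses                     # reuse for the cache
      return dna, cached_responses
  ```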
Improvements
- **`[full]` extras should reference sub-extras**

  Instead of duplicating all packages:

  ```toml
  full = [
      "llm-dna[apple]",
      "llm-dna[quantization]",
      "llm-dna[model_families]",
  ]
  ```

  This avoids needing to keep version pins in sync across two places.
- **Platform-specific markers**

  `mlx`/`mlx-lm` only work on macOS Apple Silicon, and `mamba-ssm` requires CUDA. Consider adding environment markers:

  ```toml
  apple = [
      "mlx>=0.10.0; sys_platform == 'darwin'",
  ]
  ```
- **No tests for the new model detection logic**

  The OpenRouter model detection via JSONL lookup is a significant behavioral change; a unit test would help prevent regressions.
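  A regression test could be as small as the following pytest-style sketch (the prefix tuple and function are illustrative stand-ins for the PR's JSONL-backed detection, just to show the shape of the test):

  ```python
  # Minimal pytest-style regression test for model detection.
  # Prefixes and function are stand-ins, not the PR's actual code.
  OPENROUTER_PREFIXES = ("openrouter/", "anthropic/claude-", "deepseek/")

  def is_openrouter_model(name: str) -> bool:
      return name.startswith(OPENROUTER_PREFIXES)

  def test_openrouter_detection():
      assert is_openrouter_model("anthropic/claude-3-opus")
      assert is_openrouter_model("deepseek/deepseek-chat")
      assert not is_openrouter_model("meta-llama/Llama-3-8B")
      assert not is_openrouter_model("")
  ```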
Overall the direction is solid. Happy to help if you have questions on any of these points!