
feat(backend): add Nvidia NIM as a first-class backend provider #962

Merged
Henry-811 merged 1 commit into dev/v0.1.58 from feature/nvidia-nim-backend
Mar 2, 2026

Conversation

@AbhimanyuAryan (Collaborator) commented Mar 1, 2026

PR Title Format

Your PR title must follow the format: <type>: <brief description>

Valid types:

  • fix: - Bug fixes
  • feat: - New features
  • breaking: - Breaking changes
  • docs: - Documentation updates
  • refactor: - Code refactoring
  • test: - Test additions/modifications
  • chore: - Maintenance tasks
  • perf: - Performance improvements
  • style: - Code style changes
  • ci: - CI/CD configuration changes

Examples:

  • fix: resolve memory leak in data processing
  • feat: add export to CSV functionality
  • breaking: change API response format
  • docs: update installation guide

Description

Brief description of the changes in this PR

Type of change

  • Bug fix (fix:) - Non-breaking change which fixes an issue
  • New feature (feat:) - Non-breaking change which adds functionality
  • Breaking change (breaking:) - Fix or feature that would cause existing functionality to not work as expected
  • Documentation (docs:) - Documentation updates
  • Code refactoring (refactor:) - Code changes that neither fix a bug nor add a feature
  • Tests (test:) - Adding missing tests or correcting existing tests
  • Chore (chore:) - Maintenance tasks, dependency updates, etc.
  • Performance improvement (perf:) - Code changes that improve performance
  • Code style (style:) - Changes that do not affect the meaning of the code (formatting, missing semi-colons, etc.)
  • CI/CD (ci:) - Changes to CI/CD configuration files and scripts

Checklist

  • I have run pre-commit on my changed files and all checks pass
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Pre-commit status

# Paste the output of running pre-commit on your changed files:
# uv run pre-commit install
# git diff --name-only HEAD~1 | xargs uv run pre-commit run --files # for last commit
# git diff --name-only origin/<base branch>...HEAD | xargs uv run pre-commit run --files # for all commits in PR
# git add <your file> # if any fixes were applied
# git commit -m "chore: apply pre-commit fixes"
# git push origin <branch-name>

How to Test

Add a test method for this PR.

Test CLI Command

Write down the test bash command. If there are prerequisites, please emphasize them.

Expected Results

Description/screenshots of expected results.

Additional context

Add any other context about the PR here.

Summary by CodeRabbit

Release Notes

  • New Features
    • Added Nvidia NIM (NVIDIA Inference Microservices) as a supported backend provider
    • Integrated multiple LLM models from Nvidia NIM including Moonshotai Kimi K2.5, DeepSeek-R1, Llama, and Mistral variants
    • Added NGC_API_KEY environment variable support for Nvidia NIM authentication
    • Enabled auto-detection of Nvidia NIM in configuration workflows

@AbhimanyuAryan AbhimanyuAryan marked this pull request as ready for review March 1, 2026 15:28
Copilot AI review requested due to automatic review settings March 1, 2026 15:28
coderabbitai Bot commented Mar 1, 2026

📝 Walkthrough

Walkthrough

The PR adds support for Nvidia NIM as a new chat completion backend provider across the codebase. Changes include backend capability registration, provider detection logic, CLI configuration, model catalog integration, model matcher updates, and documentation additions.

Changes

Cohort / File(s) Summary
Backend Integration
massgen/backend/capabilities.py, massgen/backend/chat_completions.py
Added Nvidia NIM to backend capabilities registry with supported models and NGC_API_KEY configuration. Updated provider detection to recognize "nvidia.com" URLs as Nvidia NIM provider.
Configuration & CLI
massgen/cli.py, massgen/config_builder.py
Extended CLI backend creation to support Nvidia NIM with NGC_API_KEY handling and base URL configuration. Added "nvidia_nim" to chatcompletion providers list and API key setup UI.
Model Support
massgen/utils/model_catalog.py, massgen/utils/model_matcher.py
Added model fetching capability for Nvidia NIM from OpenAI-compatible endpoint. Added curated model list for Nvidia NIM including kimi-k2.5, deepseek-r1, llama-3.3-70b, and mistral-large-2.
Documentation
massgen/configs/BACKEND_CONFIGURATION.md
Added Nvidia NIM provider base URL and NGC_API_KEY environment variable documentation.
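The provider-detection piece of the walkthrough can be sketched as below. This is an illustrative stand-in assuming a simple substring check; the function name `infer_provider_name` and the generic fallback label are assumptions, not massgen's actual code in chat_completions.py:

```python
def infer_provider_name(base_url: str) -> str:
    """Infer a provider label from an OpenAI-compatible base URL.

    Hypothetical sketch of the detection described in the walkthrough;
    the real logic lives in massgen/backend/chat_completions.py.
    """
    if "nvidia.com" in base_url:
        return "Nvidia NIM"
    return "ChatCompletions"  # assumed generic fallback


print(infer_provider_name("https://integrate.api.nvidia.com/v1"))  # → Nvidia NIM
```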

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 4

❌ Failed checks (3 warnings, 1 inconclusive)

  • Description check — ⚠️ Warning: The PR description consists entirely of the template structure with no actual content filled in. All sections lack concrete information about the changes, testing, or context. Resolution: Fill in the Description, Type of change, Pre-commit status, How to Test (CLI commands and expected results), and Additional context sections with actual information about the Nvidia NIM backend implementation.
  • Documentation Updated — ⚠️ Warning: Nvidia NIM backend implementation exists but documentation is significantly incomplete across multiple files. Resolution: Add nvidia_nim to the backend types table in backends.rst, update yaml_schema.rst, add it to the CLI parser choices, create an example config file, and verify consistency.
  • Capabilities Registry Check — ⚠️ Warning: The nvidia_nim backend is registered but missing pricing entries in the PROVIDER_PRICING dictionary required for cost tracking. Resolution: Add pricing entries for nvidia_nim models to PROVIDER_PRICING in token_manager.py, following existing provider patterns.
  • Config Parameter Sync — ❓ Inconclusive: Required functions get_base_excluded_config_params() do not exist in massgen/backend/base.py or massgen/api_params_handler/_api_params_handler_base.py for registering new YAML parameters. Resolution: Clarify whether these functions exist under different names or should be created to register Nvidia NIM's new YAML parameters (NGC_API_KEY, base_url).
✅ Passed checks (2 passed)
  • Title check — ✅ Passed: The PR title clearly and specifically describes the main change: adding Nvidia NIM as a first-class backend provider. It follows the required 'feat:' format and is concise.
  • Docstring Coverage — ✅ Passed: Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
massgen/backend/capabilities.py (1)

575-588: Add model_release_dates for the new Nvidia NIM model entry.

Line 583 introduces an explicit model list, but this new backend entry does not include model_release_dates. Please add release-date metadata for the listed concrete model(s) to keep registry metadata consistent.

Based on learnings: When adding new models, always update massgen/backend/capabilities.py by adding to the models list (newest first), adding to model_release_dates, and updating supported_capabilities if new features are introduced.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/backend/capabilities.py` around lines 575 - 588, The new
BackendCapabilities entry for "nvidia_nim" is missing model_release_dates
metadata; update the same backend object by adding a model_release_dates mapping
that includes release dates for the concrete models listed in models (e.g.,
"moonshotai/kimi-k2.5" and "custom") so registrations remain consistent; locate
the BackendCapabilities instantiation for backend_type "nvidia_nim" and add a
model_release_dates field (following the format used elsewhere in this file)
with the newest model first and appropriate ISO date strings for each model
name.
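A minimal sketch of the requested fix, using a plain dict since the exact BackendCapabilities constructor is not shown in this thread; the field names mirror the review comment, and the release date is a placeholder assumption, not a verified date:

```python
# Hypothetical shape of the registry entry; the date is a placeholder.
nvidia_nim_entry = {
    "backend_type": "nvidia_nim",
    "models": ["moonshotai/kimi-k2.5", "custom"],
    "default_model": "moonshotai/kimi-k2.5",
    "model_release_dates": {
        "moonshotai/kimi-k2.5": "2026-01-01",  # placeholder ISO date
    },
}

# Consistency rule from the comment: every concrete (non-"custom")
# model should carry a release date.
missing = [
    model for model in nvidia_nim_entry["models"]
    if model != "custom" and model not in nvidia_nim_entry["model_release_dates"]
]
assert not missing
```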
massgen/cli.py (2)

1662-1688: Use Google-style sections in the updated create_backend docstring.

create_backend() was changed, but its docstring is still free-form bullets. Please convert to Google-style (Args, Returns, Raises) for repository consistency.

As per coding guidelines **/*.py: "For new or changed functions, include Google-style docstrings".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cli.py` around lines 1662 - 1688, Update the create_backend()
docstring to Google-style sections: add an "Args" section describing the
parameters (e.g., backend type and params dict), a "Returns" section describing
the created backend instance, and a "Raises" section for possible exceptions
(e.g., missing API key or unsupported backend). Move the existing
supported-backend bullets into a "Raises" or "Args" note as appropriate (or an
"Raises/Notes" subsection) and include the auto-detected chatcompletion
providers under "Notes" or "Raises" as contextual info; keep the descriptive
list of backend types and external dependencies but restructure them into the
Google-style sections to match repo conventions.

1811-1819: Use _api_key_error_message() for the Nvidia NIM branch.

Line 1814 hardcodes a long error string while this function already has _api_key_error_message(). Reusing it improves consistency and keeps linter warnings down.

Suggested patch
             elif base_url and "nvidia.com" in base_url:
                 api_key = os.getenv("NGC_API_KEY")
                 if not api_key:
-                    raise ConfigurationError(
-                        "Nvidia NIM API key not found. Set NGC_API_KEY environment variable.\n"
-                        "You can add it to a .env file in:\n"
-                        "  - Current directory: .env\n"
-                        "  - Global config: ~/.massgen/.env",
-                    )
+                    raise ConfigurationError(
+                        _api_key_error_message("Nvidia NIM", "NGC_API_KEY", config_path),
+                    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cli.py` around lines 1811 - 1819, Replace the hardcoded Nvidia NIM
API key error string in the branch that checks `elif base_url and "nvidia.com"
in base_url` with a call to the existing helper `_api_key_error_message()`;
specifically, when `api_key = os.getenv("NGC_API_KEY")` is falsy, raise
`ConfigurationError(_api_key_error_message())` instead of the long literal so
the `api_key` check in that branch reuses the shared message logic.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 70a8784 and 9d5e340.

📒 Files selected for processing (7)
  • massgen/backend/capabilities.py
  • massgen/backend/chat_completions.py
  • massgen/cli.py
  • massgen/config_builder.py
  • massgen/configs/BACKEND_CONFIGURATION.md
  • massgen/utils/model_catalog.py
  • massgen/utils/model_matcher.py

Comment thread massgen/cli.py
Comment on lines +1915 to +1924
elif backend_type == "nvidia_nim":
    # Nvidia NIM uses OpenAI-compatible Chat Completions API
    api_key = kwargs.get("api_key") or os.getenv("NGC_API_KEY")
    if not api_key:
        raise ConfigurationError(
            _api_key_error_message("Nvidia NIM", "NGC_API_KEY", config_path),
        )
    if "base_url" not in kwargs:
        kwargs["base_url"] = "https://integrate.api.nvidia.com/v1"
    return ChatCompletionsBackend(api_key=api_key, **kwargs)
⚠️ Potential issue | 🟠 Major

nvidia_nim is implemented here but still not selectable via --backend.

Line 1915 adds first-class handling, but CLI parser choices (later in this file) still omit "nvidia_nim", so massgen --backend nvidia_nim ... fails argument validation.

Suggested patch
@@
     config_group.add_argument(
         "--backend",
         type=str,
         choices=[
             "chatcompletion",
             "claude",
             "gemini",
             "grok",
             "openai",
             "azure_openai",
             "copilot",
             "claude_code",
             "zai",
+            "nvidia_nim",
             "lmstudio",
             "vllm",
             "sglang",
         ],
         help="Backend type for quick setup",
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@massgen/cli.py` around lines 1915 - 1924, The CLI currently implements
backend_type "nvidia_nim" in the factory branch (the elif handling that returns
ChatCompletionsBackend) but the CLI argument parser's allowed backend choices
still omits "nvidia_nim", causing argparse validation to reject it; locate where
the backend choices are defined (e.g., the --backend add_argument call or a
BACKEND_CHOICES / SUPPORTED_BACKENDS list used by parser.add_argument) and add
the string "nvidia_nim" to that choices list and any corresponding help text or
documentation strings so the CLI accepts --backend nvidia_nim and maps to the
existing backend_type handling (keep symbols: backend_type,
ChatCompletionsBackend, and the --backend argument).

Contributor

Copilot AI left a comment


Pull request overview

Adds Nvidia NIM as a supported OpenAI-compatible provider across MassGen’s provider registry, CLI/backend factory, model discovery, and documentation, making it selectable/configurable like other first-class backends.

Changes:

  • Register nvidia_nim in backend capabilities (models, env var, base URL) and provider-name inference.
  • Add CLI/config-builder support for selecting nvidia_nim and guiding users to NGC_API_KEY.
  • Enable model listing via the OpenAI-compatible /models endpoint and expand fuzzy model matching + docs.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
massgen/utils/model_matcher.py Adds curated common model IDs and includes nvidia_nim in chatcompletion providers for model fetching/fuzzy matching.
massgen/utils/model_catalog.py Adds nvidia_nim branch to fetch models from Nvidia’s OpenAI-compatible endpoint.
massgen/configs/BACKEND_CONFIGURATION.md Documents Nvidia NIM base URL and NGC_API_KEY.
massgen/config_builder.py Allows nvidia_nim in smart model selection and interactive API key setup.
massgen/cli.py Adds nvidia_nim backend type and auto-detection of NGC_API_KEY for chatcompletion base URLs.
massgen/backend/chat_completions.py Adds provider-name inference for Nvidia NIM based on base URL.
massgen/backend/capabilities.py Registers nvidia_nim in BACKEND_CAPABILITIES with base URL, env var, models, and notes.
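The model-listing change (fetching from the OpenAI-compatible /models endpoint) can be sketched as follows. The function names here are hypothetical and the real massgen/utils/model_catalog.py implementation may differ; only the response shape (a top-level "data" array of objects with "id" fields) follows the OpenAI-compatible convention:

```python
import json
import os
import urllib.request


def parse_model_ids(payload: dict) -> list:
    """Extract model IDs from an OpenAI-compatible /models response."""
    return [model["id"] for model in payload.get("data", [])]


def fetch_nvidia_nim_models(base_url="https://integrate.api.nvidia.com/v1"):
    """Fetch the model list from Nvidia NIM (requires NGC_API_KEY to be set)."""
    request = urllib.request.Request(
        f"{base_url}/models",
        headers={"Authorization": f"Bearer {os.environ['NGC_API_KEY']}"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_model_ids(json.load(response))
```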


Comment on lines +583 to +584
models=["moonshotai/kimi-k2.5", "custom"],
default_model="moonshotai/kimi-k2.5",

Copilot AI Mar 1, 2026


ConfigValidator treats backends with default_model != "custom" as not requiring an explicit model, but the Chat Completions request builder here does not inject a default model. That means configs using type: nvidia_nim without a model will pass validation yet fail at runtime (OpenAI-compatible APIs require model). Consider setting default_model to "custom" (and/or only listing "custom" in models) or adding logic elsewhere to auto-fill model from capabilities when it’s missing.

Suggested change
models=["moonshotai/kimi-k2.5", "custom"],
default_model="moonshotai/kimi-k2.5",
models=["custom"],
default_model="custom",

Collaborator Author


No, the default model should be moonshotai/kimi-k2.5.

Comment on lines 1103 to 1104
elif "nebius.com" in base_url:
return "Nebius AI Studio"

Copilot AI Mar 1, 2026


Nebius provider detection checks only for nebius.com, but the default Nebius base URL used elsewhere is api.studio.nebius.ai. With the current condition, get_provider_name() will not recognize Nebius when using the default .ai endpoint, which can affect logging/routing that relies on the inferred provider name. Update the check to also match nebius.ai (and/or studio.nebius.ai).
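The broadened check could be sketched like this; the helper name is illustrative, not the actual code in chat_completions.py:

```python
def is_nebius_url(base_url: str) -> bool:
    """Match both nebius.com and the default api.studio.nebius.ai endpoint."""
    return "nebius.com" in base_url or "nebius.ai" in base_url
```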

Comment thread massgen/cli.py
Comment on lines +1815 to +1818
"Nvidia NIM API key not found. Set NGC_API_KEY environment variable.\n"
"You can add it to a .env file in:\n"
" - Current directory: .env\n"
" - Global config: ~/.massgen/.env",

Copilot AI Mar 1, 2026


This branch adds a new provider-specific API-key-missing message, but it duplicates logic and diverges from the standard _api_key_error_message(...) helper used elsewhere in create_backend (different listed .env locations, no --setup hint, etc.). Consider reusing _api_key_error_message("Nvidia NIM", "NGC_API_KEY", config_path) here for consistency and easier future maintenance.

Suggested change
"Nvidia NIM API key not found. Set NGC_API_KEY environment variable.\n"
"You can add it to a .env file in:\n"
" - Current directory: .env\n"
" - Global config: ~/.massgen/.env",
_api_key_error_message("Nvidia NIM", "NGC_API_KEY", config_path),

@Henry-811 Henry-811 changed the base branch from main to dev/v0.1.58 March 2, 2026 17:20
@Henry-811 Henry-811 merged commit e43f11f into dev/v0.1.58 Mar 2, 2026
21 of 23 checks passed
@coderabbitai coderabbitai Bot mentioned this pull request Mar 2, 2026
18 tasks
