feat: New rail, huggingface lightweight classifiers by m-misiura · Pull Request #1853 · NVIDIA-NeMo/Guardrails

m-misiura · 2026-05-06T14:26:37Z

Description

Motivation

Currently, the content_safety rail is a good demonstration of a flexible pattern where users can point to any supported engine (i.e. nim, openai, vllm, etc) and configure (generative) transformer model to carry out detections
Based on customer feedback, there is a considerable interest in using lightweight (discriminative) transformer models (specifically token and sequence classifiers) to carry out detections. These models cannot be readily configured via the content_safety rail since:

different output format: classifiers return label/score pairs, not generated text to be parsed
different inference APIs: a BERT model served on KServe uses the v1/v2 predict API, not the OpenAI chat completions API that NeMo's LangChain providers expect.
no backend abstraction: existing HF-based rails (gliner, jailbreak_detection) hardcode their inference backend; users cannot swap between e.g. local HuggingFace Transformers, KServe, vLLM, or HuggingFace Inference Endpoints without changing code.

Currently, the recommended approach to configure such models is via custom actions, which can be cumbersome.

Proposed feature

This PR adds a pluggable hf_classifier rail that lets users run HuggingFace-compatible classification models as input/output guardrails. Classifiers are defined in rails.config.hf_classifier and referenced by name in flows via $classifier=.

It support the following backends:

local — HuggingFace Transformers pipeline (runs in-process, supports text-classification and token-classification tasks)
vllm — vLLM /classify endpoint
kserve — KServe v1 /v1/models/{name}:predict endpoint
fms — FMS guardrails-detectors /api/v1/text/contents endpoint

Test plan

Added additional test files:

test_hf_classifier.py
test_hf_classifier_iorails.py

Created a demo to test against the live server

Checklist

I've read the CONTRIBUTING guidelines.
I've updated the documentation if applicable.
I've added tests if applicable.
@mentions of the person or team responsible for reviewing proposed changes.

Summary by CodeRabbit

New Features
- HuggingFace classifier integration for moderation across user input, bot output, retrievals, and tool I/O.
- Support for multiple classifier backends (Local, vLLM, KServe, FMS) and pre-warming for faster responses.
- Exposed events history cache for improved observability.
- Improved streaming responses with structured error handling.
Chores
- Added optional HF-classifier dependencies (transformers, torch).
Tests
- Comprehensive tests covering HF classifier behavior and integration.

codecov · 2026-05-06T16:13:02Z

Codecov Report

❌ Patch coverage is 92.79476% with 33 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
nemoguardrails/library/hf_classifier/backends.py	90.20%	24 Missing ⚠️
nemoguardrails/guardrails/guardrails.py	76.92%	3 Missing ⚠️
nemoguardrails/server/api.py	94.44%	3 Missing ⚠️
...rdrails/guardrails/actions/hf_classifier_action.py	96.55%	2 Missing ⚠️
nemoguardrails/library/hf_classifier/actions.py	98.03%	1 Missing ⚠️

📢 Thoughts on this report? Let us know!

greptile-apps · 2026-05-06T16:35:53Z

Greptile Summary

This PR introduces a pluggable hf_classifier rail that lets users run HuggingFace-compatible classification models (local Transformers pipeline, vLLM, KServe, FMS) as input/output guardrails without requiring a generative LLM. The implementation adds config models, IORails action classes, Colang flows, backend abstractions, startup prewarming, and comprehensive tests.

New hf_classifier rail: adds LocalHFClassifierConfig / RemoteHFClassifierConfig (discriminated union), four backend classes (LocalBackend, VLLMBackend, KServeBackend, FMSBackend), and Colang flows for input, output, retrieval, and tool I/O checks.
IORails integration: HFClassifierInputAction / HFClassifierOutputAction are registered in the IORails engine registry, and the corresponding flow names are added to IORAILS_INPUT_FLOWS / IORAILS_OUTPUT_FLOWS, enabling fast classifier-only deployments with no LLM dependency.
Server changes: _prewarm_local_hf_classifiers eagerly loads local pipelines at startup (parallel threads, one per model), close_all_backends() is wired to app shutdown, and llm_rails_instances now stores Guardrails (the IORails/LLMRails wrapper) instead of bare LLMRails.

Confidence Score: 4/5

Safe to merge with awareness of the open issues flagged in previous review rounds.

The tool-input classifier always receives an empty string because tool_message is never written to the context by the runtime (flagged in prior review), meaning tool-input classification silently passes every request. That functional gap, combined with a smaller session-leak concern on config hot-reload, keeps the score from the top.

nemoguardrails/library/hf_classifier/actions.py (tool_message context key) and nemoguardrails/library/hf_classifier/backends.py (session lifecycle on hot-reload).

Important Files Changed

Filename	Overview
nemoguardrails/library/hf_classifier/backends.py	New file: pluggable backends (Local, vLLM, KServe, FMS) for HF classifier inference; addressed previous cache-key and event-loop concerns, but aiohttp sessions are only closed at full shutdown (not on hot-reload).
nemoguardrails/library/hf_classifier/actions.py	New file: Colang-callable action functions for input, output, retrieval, tool-input and tool-output classification; `tool_message` context key is never set by the runtime (flagged in previous review threads), causing tool-input classification to always allow.
nemoguardrails/guardrails/actions/hf_classifier_action.py	New file: IORails-path HFClassifierInputAction and HFClassifierOutputAction; threshold/blocked_labels filtering is applied client-side in _parse_classify_response, design looks sound.
nemoguardrails/server/api.py	Pre-warm logic now uses max_workers=len(pending) (parallel per-model threads), fixing the previous single-worker sequential issue; close_all_backends() is called on shutdown, but not on config hot-reload.
nemoguardrails/guardrails/guardrails.py	Adds hf_classifier flows to IORAILS_INPUT/OUTPUT_FLOWS whitelists; exposes events_history_cache property (no-op for IORails); guards against empty-flows configs routing to IORails.
nemoguardrails/rails/llm/config.py	Adds LocalHFClassifierConfig, RemoteHFClassifierConfig, and discriminated-union HFClassifierConfig; validation is thorough (threshold bounds, endpoint scheme, aggregation_strategy guard).
nemoguardrails/guardrails/rail_action.py	Adds _param_prefix class attribute so subclasses can customise the error message when a required parameter is missing; change is minimal and correct.
nemoguardrails/guardrails/rails_manager.py	Registers HFClassifierInputAction and HFClassifierOutputAction in the IORails action registry; straightforward, no issues.
nemoguardrails/library/hf_classifier/flows.co	Colang 2 flows for input, output, retrieval, tool-input and tool-output HF classifier checks; looks correct.
tests/test_hf_classifier.py	Comprehensive backend and action unit tests with cache-clearing fixtures; coverage is extensive.

Sequence Diagram

sequenceDiagram
    participant Client
    participant API as server/api.py
    participant GR as Guardrails
    participant IOR as IORails
    participant RA as HFClassifierInputAction
    participant BE as ClassifierBackend

    Note over API: startup: _prewarm_local_hf_classifiers()
    API->>BE: _get_or_create_pipeline() [thread pool]
    BE-->>API: pipeline cached

    Client->>API: POST /v1/chat/completions
    API->>GR: _get_rails(config_ids)
    GR->>IOR: startup() [if IORails config]
    API->>GR: generate_async(messages)
    GR->>IOR: generate(messages)
    IOR->>RA: "run(flow='hf classifier check input $classifier=toxicity')"
    RA->>BE: get_backend(cfg, name)
    RA->>BE: classify(text)
    BE-->>RA: [ClassificationResult, ...]
    RA-->>IOR: "RailResult(is_safe=True/False)"
    alt "is_safe=False"
        IOR-->>GR: blocked response
    else "is_safe=True"
        IOR-->>GR: pass-through / LLM call
    end
    GR-->>API: response
    API-->>Client: chat completion

    Note over API: shutdown: close_all_backends()
    API->>BE: close() all sessions

Prompt To Fix All With AI

Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
nemoguardrails/library/hf_classifier/backends.py:710-730
**Backend sessions leaked on config hot-reload**

`close_all_backends()` is only called at full app shutdown. When auto-reload fires and removes a stale config from `llm_rails_instances`, its backends stay in the process-global `_backend_instances` dict with open `aiohttp.ClientSession` objects. If the reloaded config changes `endpoint`, `threshold`, or any other field, `get_backend` will create a fresh backend under a new key while the old session's TCP connections remain open indefinitely, leaking file descriptors until the Python GC eventually raises `ResourceWarning: Unclosed client session`.

_{Reviews (8): Last reviewed commit: ":construction: support hf classifiers as..." | Re-trigger Greptile}

coderabbitai · 2026-05-06T16:41:45Z

📝 Walkthrough

Walkthrough

Adds a HuggingFace classifier subsystem integrated into Guardrails/IORails: new configs, pluggable backends (local/vLLM/KServe/FMS), action helpers and IORails actions, flow definitions, server prewarm/streaming updates, rails wiring, and comprehensive tests.

Changes

HF Classifier Guardrails System

Layer / File(s)	Summary
Configuration & Types `nemoguardrails/rails/llm/config.py`, `pyproject.toml`	Adds `HFClassifierConfig` discriminated union with `LocalHFClassifierConfig` and `RemoteHFClassifierConfig`; adds `hf_classifier` field to `RailsConfigData`; declares optional deps `transformers` and `torch` and `hf-classifier` extra.
Data / Cache Utilities `nemoguardrails/library/hf_classifier/__init__.py`, `nemoguardrails/library/hf_classifier/backends.py`	Introduces `ClassificationResult` TypedDict, backend registry, pipeline cache, SSL/timeout helpers, and sentinel for failed loads.
Backend Implementations `nemoguardrails/library/hf_classifier/backends.py`	Implements `ClassifierBackend` interface and concrete backends: `LocalBackend`, `VLLMBackend`, `KServeBackend`, `FMSBackend`; includes HTTP/SSL/session handling and `get_backend` factory.
Library-Level Actions `nemoguardrails/library/hf_classifier/actions.py`	Adds `_classify_and_check` helper and five public actions: `hf_classifier_check_input`, `hf_classifier_check_output`, `hf_classifier_check_retrieval`, `hf_classifier_check_tool_input`, `hf_classifier_check_tool_output` that extract text and apply threshold-based blocking.
IORails Action Classes `nemoguardrails/guardrails/actions/hf_classifier_action.py`	Adds `HFClassifierInputAction` and `HFClassifierOutputAction` that extract classifier from flow params, build prompts, call classifier via task manager, and parse responses into `RailResult`.
Flows `nemoguardrails/library/hf_classifier/flows.co`, `nemoguardrails/library/hf_classifier/flows.v1.co`	Adds flows for input, output, retrieval, tool-input, tool-output checks; includes denial handling (rail exceptions or canned refusal) and a v1 refusal set.
Framework Integration & Wiring `nemoguardrails/guardrails/guardrails.py`, `nemoguardrails/guardrails/rail_action.py`, `nemoguardrails/guardrails/rails_manager.py`	Adds `IORAILS_INPUT_FLOWS`/`IORAILS_OUTPUT_FLOWS` constants; adds `events_history_cache` property on `Guardrails`; makes `_param_prefix` configurable on `RailAction`; registers HF classifier actions in `_ACTION_CLASSES`.
Server Integration & Prewarm `nemoguardrails/server/api.py`	Switches server instances to use `Guardrails`, adds `_prewarm_local_hf_classifiers()` with timeout/pipeline caching, updates chat generation/streaming to use `Guardrails` APIs, adds streaming chunk/error models and `GuardrailsConfigurationError`, and closes hf backends on shutdown.
Tests & Validation `tests/test_hf_classifier.py`, `tests/test_hf_classifier_iorails.py`	Adds comprehensive tests for configs, header/SSL/timeout behavior, backend integrations, prewarm logic, sentinel/cache behavior, action wiring, classifier extraction, threshold blocking, and IORails flow registration.

Sequence Diagrams

sequenceDiagram
    actor User
    participant ServerAPI as Server API
    participant Guardrails
    participant IORails as IORails Flow
    participant HFBackend as HF Classifier Backend
    participant LLM

    User->>ServerAPI: POST /chat/completions (message)
    ServerAPI->>Guardrails: generate_async()
    Guardrails->>IORails: execute input flow
    IORails->>HFBackend: classify(user_input)
    HFBackend-->>IORails: labels & scores
    IORails-->>Guardrails: allowed / blocked
    alt Allowed
        Guardrails->>LLM: forward message
        LLM-->>Guardrails: bot_output
        Guardrails->>IORails: execute output flow
        IORails->>HFBackend: classify(bot_output)
        HFBackend-->>IORails: labels & scores
        IORails-->>Guardrails: allowed / blocked
        Guardrails-->>ServerAPI: bot response (if allowed)
    else Blocked
        Guardrails-->>ServerAPI: refusal response
    end

sequenceDiagram
    participant Startup
    participant APIConfig as API Config
    participant ConfigLoader as RailsConfig Loader
    participant PipelineCache as Local HF Pipeline Cache
    participant Transformers as transformers

    Startup->>APIConfig: lifespan startup
    APIConfig->>ConfigLoader: load configs
    ConfigLoader-->>APIConfig: {hf_classifier: configs}
    loop per local classifier
        APIConfig->>PipelineCache: _get_or_create_pipeline(model_id, task)
        alt cached
            PipelineCache-->>APIConfig: cached pipeline
        else
            PipelineCache->>Transformers: pipeline(model_id, task)
            Transformers-->>PipelineCache: loaded pipeline
            PipelineCache->>PipelineCache: cache and return
        end
    end
    APIConfig-->>Startup: prewarm complete

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 14.11% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Results For Major Changes	⚠️ Warning	Major feature PR lacks documented test results. While 87 unit tests exist in the codebase, the PR description provides no test execution results, coverage metrics, or confirmation of test passage.	Add test execution summary to PR description showing: test pass count, test execution confirmation (e.g., CI logs), and code coverage metrics. Note that test checklist items are marked incomplete.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'feat: New rail, huggingface lightweight classifiers' accurately describes the main change in the PR—adding a new HuggingFace classifier-based rail system with pluggable backends and configuration support.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

Generate code and open pull requests
Plan features and break down work
Investigate incidents and troubleshoot customer tickets together
Automate recurring tasks and respond to alerts with triggers
Summarize progress and report instantly

Built for teams:

Shared memory across your entire org—no repeating context
Per-thread sandboxes to safely plan and execute work
Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

greptile-apps · 2026-05-06T16:51:21Z

+    def __init__(self, config: RemoteHFClassifierConfig) -> None:
+        self._url = config.endpoint.rstrip("/") + "/api/v1/text/contents"
+        self._threshold = config.threshold
+        self._headers = _build_headers(config)
+        self._timeout = _get_timeout(config)
+        self._ssl = _build_ssl_context(config)


FMS backend threshold baked into instance, not reflected in cache key

FMSBackend.__init__ stores self._threshold = config.threshold and embeds it in every request payload. However, get_backend caches the instance by "{name}:{backend}:{model_name}:{endpoint}" — which does not include the threshold. If two deployed configs share the same classifier name, backend="fms", model_name, and endpoint but specify different threshold values, the second config's requests will silently use the first config's threshold. This is a present correctness defect in multi-config server deployments: the FMS server will filter at the wrong sensitivity level without any warning.

Prompt To Fix With AI

This is a comment left during a code review. Path: nemoguardrails/library/hf_classifier/backends.py Line: 363-368 Comment: **FMS backend threshold baked into instance, not reflected in cache key** `FMSBackend.__init__` stores `self._threshold = config.threshold` and embeds it in every request payload. However, `get_backend` caches the instance by `"{name}:{backend}:{model_name}:{endpoint}"` — which does not include the threshold. If two deployed configs share the same classifier `name`, `backend="fms"`, `model_name`, and `endpoint` but specify different `threshold` values, the second config's requests will silently use the first config's threshold. This is a present correctness defect in multi-config server deployments: the FMS server will filter at the wrong sensitivity level without any warning. How can I resolve this? If you propose a fix, please make it concise.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

nemoguardrails/guardrails/rail_action.py (1)
65-65: ⚡ Quick win

Keep parameter parsing aligned with _param_prefix.

Right now _param_prefix only affects the Line 92 error text. Line 113 still parses via _get_flow_model(flow), so the base class remains hard-wired to $model=. That makes this hook easy to misuse: a subclass can override _param_prefix and still silently parse the wrong flow argument.

Also applies to: 90-113
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@nemoguardrails/guardrails/rail_action.py` at line 65, The class currently
hard-codes parsing of the flow model to "$model=" via _get_flow_model(flow)
while allowing subclasses to override _param_prefix; change the parsing to use
the instance _param_prefix everywhere instead of the literal "$model=": update
the logic that extracts the model from flow (the call/implementation of
_get_flow_model and the flow-parsing block around lines 90–113) to read
self._param_prefix when matching and extracting the value, and also update any
error messages that reference the prefix to use self._param_prefix so subclasses
that override _param_prefix get consistent parsing and errors.
tests/test_hf_classifier.py (1)
739-766: 💤 Low value

Verify TimeoutError type in timeout test.

At line 754, the test patches _get_or_create_pipeline to raise TimeoutError. However, the actual timeout in _prewarm_local_hf_classifiers is caught via future.result(timeout=...), which raises concurrent.futures.TimeoutError. The test's side_effect=TimeoutError("timed out") would raise the builtin TimeoutError, not concurrent.futures.TimeoutError.

Since line 165 in api.py catches TimeoutError imported from concurrent.futures, and the builtin TimeoutError is a parent class, this should still work, but the test doesn't accurately simulate the real timeout scenario (where the executor times out waiting for the future, not the function itself raising TimeoutError).

Consider clarifying the test or mocking at the executor level for more accurate simulation.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_hf_classifier.py` around lines 739 - 766, The test currently uses
builtin TimeoutError as side_effect for _get_or_create_pipeline, but
_prewarm_local_hf_classifiers actually observes a
concurrent.futures.TimeoutError raised from future.result; update the test to
accurately simulate that by either setting side_effect to
concurrent.futures.TimeoutError("timed out") (importing concurrent.futures) or
by patching the Future.result method returned by the executor so it raises
concurrent.futures.TimeoutError; keep the assertions and references to
_prewarm_local_hf_classifiers, _get_or_create_pipeline, _PipelineLoadError and
_pipeline_cache_key unchanged.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@nemoguardrails/library/hf_classifier/actions.py`:
- Around line 128-140: The current hf_classifier_check_tool_output assumes
tool_calls are JSON-serializable and calls json.dumps(calls) which can raise on
custom objects; update hf_classifier_check_tool_output to safely convert
tool_calls to text before calling _classify_and_check (e.g., try
json.dumps(calls, default=str) or catch TypeError and fall back to joining
repr()/str() of each call) so non-serializable tool-call objects produce a
stable string for classification instead of crashing; keep references to the
existing parameters and call to _classify_and_check unchanged.

In `@nemoguardrails/library/hf_classifier/backends.py`:
- Around line 431-442: get_backend currently builds cache_key from
name/backend/model_name/endpoint which misses config fields like
task/parameters/threshold and leads to stale cached backends; change cache_key
construction in get_backend to include a deterministic serialization of the
entire HFClassifierConfig (for example use dataclasses.asdict(config) or
config.__dict__ then json.dumps(..., sort_keys=True, default=repr)) and append
that serialized string to cache_key so different configs produce distinct keys;
keep using _backend_instances and _BACKENDS and ensure the serialization is
stable and does not include non-serializable transient state.

---

Nitpick comments:
In `@nemoguardrails/guardrails/rail_action.py`:
- Line 65: The class currently hard-codes parsing of the flow model to "$model="
via _get_flow_model(flow) while allowing subclasses to override _param_prefix;
change the parsing to use the instance _param_prefix everywhere instead of the
literal "$model=": update the logic that extracts the model from flow (the
call/implementation of _get_flow_model and the flow-parsing block around lines
90–113) to read self._param_prefix when matching and extracting the value, and
also update any error messages that reference the prefix to use
self._param_prefix so subclasses that override _param_prefix get consistent
parsing and errors.

In `@tests/test_hf_classifier.py`:
- Around line 739-766: The test currently uses builtin TimeoutError as
side_effect for _get_or_create_pipeline, but _prewarm_local_hf_classifiers
actually observes a concurrent.futures.TimeoutError raised from future.result;
update the test to accurately simulate that by either setting side_effect to
concurrent.futures.TimeoutError("timed out") (importing concurrent.futures) or
by patching the Future.result method returned by the executor so it raises
concurrent.futures.TimeoutError; keep the assertions and references to
_prewarm_local_hf_classifiers, _get_or_create_pipeline, _PipelineLoadError and
_pipeline_cache_key unchanged.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 062a145f-97ca-4108-8dff-e5b00f23bb9b

📥 Commits

Reviewing files that changed from the base of the PR and between c69efe5 and a95a49d.

⛔ Files ignored due to path filters (1)

poetry.lock is excluded by !**/*.lock

📒 Files selected for processing (14)

nemoguardrails/guardrails/actions/hf_classifier_action.py
nemoguardrails/guardrails/guardrails.py
nemoguardrails/guardrails/rail_action.py
nemoguardrails/guardrails/rails_manager.py
nemoguardrails/library/hf_classifier/__init__.py
nemoguardrails/library/hf_classifier/actions.py
nemoguardrails/library/hf_classifier/backends.py
nemoguardrails/library/hf_classifier/flows.co
nemoguardrails/library/hf_classifier/flows.v1.co
nemoguardrails/rails/llm/config.py
nemoguardrails/server/api.py
pyproject.toml
tests/test_hf_classifier.py
tests/test_hf_classifier_iorails.py

coderabbitai · 2026-05-06T16:58:24Z

+async def hf_classifier_check_tool_output(
+    classifier: str,
+    tool_calls: Optional[Any] = None,
+    config: Optional[RailsConfig] = None,
+    context: Optional[dict] = None,
+    **kwargs,
+) -> bool:
+    """Check tool output against a HuggingFace classifier."""
+    if not tool_calls and context:
+        tool_calls = context.get("tool_calls") or []
+    calls = tool_calls or []
+    text = json.dumps(calls) if calls else ""
+    return await _classify_and_check(classifier, text, config)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't assume tool calls are directly JSON-serializable.

Line 139 will fail for common tool-call objects that are not plain dict/list primitives. Since this action accepts Any, a non-serializable tool call currently turns the rail into a runtime error instead of a classification decision.

Suggested fix

`@action`(is_system_action=True) async def hf_classifier_check_tool_output( classifier: str, tool_calls: Optional[Any] = None, config: Optional[RailsConfig] = None, context: Optional[dict] = None, **kwargs, ) -> bool: """Check tool output against a HuggingFace classifier.""" if not tool_calls and context: tool_calls = context.get("tool_calls") or [] calls = tool_calls or [] - text = json.dumps(calls) if calls else "" + text = ( + json.dumps( + calls, + default=lambda value: value.model_dump() if hasattr(value, "model_dump") else str(value), + ) + if calls + else "" + ) return await _classify_and_check(classifier, text, config)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@nemoguardrails/library/hf_classifier/actions.py` around lines 128 - 140, The current hf_classifier_check_tool_output assumes tool_calls are JSON-serializable and calls json.dumps(calls) which can raise on custom objects; update hf_classifier_check_tool_output to safely convert tool_calls to text before calling _classify_and_check (e.g., try json.dumps(calls, default=str) or catch TypeError and fall back to joining repr()/str() of each call) so non-serializable tool-call objects produce a stable string for classification instead of crashing; keep references to the existing parameters and call to _classify_and_check unchanged.

coderabbitai · 2026-05-06T16:58:24Z

+def get_backend(config: HFClassifierConfig, name: str = "") -> ClassifierBackend:
+    """Get or create a cached backend instance from classifier config."""
+    endpoint = getattr(config, "endpoint", "local")
+    cache_key = f"{name}:{config.backend}:{config.model_name}:{endpoint}"
+    cached = _backend_instances.get(cache_key)
+    if cached is not None:
+        return cached
+    cls = _BACKENDS.get(config.backend)
+    if cls is None:
+        raise ValueError(f"Unknown hf_classifier backend: '{config.backend}'. Supported: {', '.join(_BACKENDS)}")
+    _backend_instances[cache_key] = cls(config)
+    return _backend_instances[cache_key]


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Include the full classifier config in the backend cache key.

Line 434 only keys on name/backend/model_name/endpoint, but backend behavior also depends on fields you omit here: e.g. task and parameters for LocalBackend, and threshold for FMSBackend. Two different classifier configs can therefore reuse the same cached instance and run with stale settings.

Suggested fix

+import json + def get_backend(config: HFClassifierConfig, name: str = "") -> ClassifierBackend: """Get or create a cached backend instance from classifier config.""" - endpoint = getattr(config, "endpoint", "local") - cache_key = f"{name}:{config.backend}:{config.model_name}:{endpoint}" + cache_key = json.dumps( + { + "name": name, + "config": config.model_dump(mode="json"), + }, + sort_keys=True, + ) cached = _backend_instances.get(cache_key) if cached is not None: return cached

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@nemoguardrails/library/hf_classifier/backends.py` around lines 431 - 442, get_backend currently builds cache_key from name/backend/model_name/endpoint which misses config fields like task/parameters/threshold and leads to stale cached backends; change cache_key construction in get_backend to include a deterministic serialization of the entire HFClassifierConfig (for example use dataclasses.asdict(config) or config.__dict__ then json.dumps(..., sort_keys=True, default=repr)) and append that serialized string to cache_key so different configs produce distinct keys; keep using _backend_instances and _BACKENDS and ensure the serialization is stable and does not include non-serializable transient state.

greptile-apps · 2026-05-06T17:31:17Z

Want your agent to iterate on Greptile's feedback? Try greploops.

Pouyanpi · 2026-05-26T14:30:36Z

@m-misiura Thanks for the PR, Looks good, I have some high-level comments:

I think this PR should keep the server path on LLMRails and leave the experimental Guardrails/IORails routing out of scope. Ideally we shouldn't (need to) touch the server code in library integrations.

There are two HF classifier state issues that still need attention even if IORails is removed: remote classifier backends cache aiohttp ClientSession objects that can be reused after their original event loop is gone, and local classifier prewarm timeouts can leave the model loading thread running long enough to overwrite the timeout sentinel in _pipelines.

I know aiohttp is one of our core dependencies but we might drop it, any issues with using httpx? I would not make this a blocker if it turns into a larger refactor.

Let's remove the IORails path from this PR, but keep its commit so we can come back to it later if needed. I can do a final review once those changes are in. 👍🏻

m-misiura force-pushed the hf_classifiers branch from f77c1e9 to 15a0c2c Compare May 6, 2026 14:37

Pouyanpi force-pushed the develop branch 2 times, most recently from 6398f99 to 930faa6 Compare May 6, 2026 15:51

m-misiura force-pushed the hf_classifiers branch from d913aa8 to 28bb81a Compare May 6, 2026 15:52

Pouyanpi force-pushed the develop branch from 930faa6 to a6be550 Compare May 6, 2026 15:53

m-misiura force-pushed the hf_classifiers branch 2 times, most recently from 004d830 to 9b58539 Compare May 6, 2026 15:59

Pouyanpi force-pushed the develop branch from a6be550 to c69efe5 Compare May 6, 2026 16:01

m-misiura force-pushed the hf_classifiers branch 2 times, most recently from 863ac8d to 5431798 Compare May 6, 2026 16:04

m-misiura marked this pull request as ready for review May 6, 2026 16:30

m-misiura force-pushed the hf_classifiers branch from 593cb52 to a95a49d Compare May 6, 2026 16:45

greptile-apps Bot reviewed May 6, 2026

View reviewed changes

m-misiura force-pushed the hf_classifiers branch from 2a05f03 to d04e3ad Compare May 6, 2026 16:57

coderabbitai Bot reviewed May 6, 2026

View reviewed changes

m-misiura force-pushed the hf_classifiers branch 2 times, most recently from b53ba6f to beae802 Compare May 6, 2026 17:23

m-misiura force-pushed the hf_classifiers branch 2 times, most recently from bc805f3 to a8e424e Compare May 7, 2026 08:24

🚧 support hf classifiers as rails

e4a644a

m-misiura force-pushed the hf_classifiers branch from 6121f5c to e4a644a Compare May 7, 2026 08:57

Pouyanpi self-requested a review May 26, 2026 14:07

Pouyanpi assigned m-misiura May 26, 2026

Pouyanpi added the enhancement New feature or request label May 26, 2026

Pouyanpi added this to the v0.23.0 milestone May 26, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: New rail, huggingface lightweight classifiers #1853

feat: New rail, huggingface lightweight classifiers #1853
m-misiura wants to merge 1 commit into
NVIDIA-NeMo:developfrom
m-misiura:hf_classifiers

m-misiura commented May 6, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

codecov Bot commented May 6, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented May 6, 2026 •

edited

Loading

Confidence Score: 4/5

Sequence Diagram

Uh oh!

coderabbitai Bot commented May 6, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagrams

Estimated code review effort

❌ Failed checks (2 warnings)

Uh oh!

greptile-apps Bot May 6, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 6, 2026

Uh oh!

coderabbitai Bot May 6, 2026

Uh oh!

greptile-apps Bot commented May 6, 2026

Uh oh!

Pouyanpi commented May 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

m-misiura commented May 6, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Motivation

Proposed feature

Test plan

Checklist

Summary by CodeRabbit

Uh oh!

codecov Bot commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

greptile-apps Bot commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

coderabbitai Bot commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagrams

Estimated code review effort

❌ Failed checks (2 warnings)

Uh oh!

greptile-apps Bot May 6, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 6, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 6, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot commented May 6, 2026

Uh oh!

Pouyanpi commented May 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

m-misiura commented May 6, 2026 •

edited by coderabbitai Bot

Loading

codecov Bot commented May 6, 2026 •

edited

Loading

greptile-apps Bot commented May 6, 2026 •

edited

Loading

coderabbitai Bot commented May 6, 2026 •

edited

Loading