docs(library): add ATR-inspired threat detection example by eeee2345 · Pull Request #1869 · NVIDIA-NeMo/Guardrails

eeee2345 · 2026-05-09T08:22:34Z

This adds an example configuration under examples/configs/atr_threat_detection that demonstrates how to wire the built-in regex_detection input rail to a small set of patterns inspired by Agent Threat Rules. ATR is an open detection standard for AI agent threats published under the MIT license at https://github.com/Agent-Threat-Rule/agent-threat-rules.

The example covers six common attack categories using compact regex patterns: instruction override, system prompt exfiltration, role-play jailbreak, base64-wrapped payload hints, MCP tool override markers, and file:// SSRF references. Each entry in the config carries an ATR rule id comment so users can cross-reference the open ruleset.

The example uses only built-in components. config.yml configures regex_detection.input with the patterns and case_insensitive flag, the input rail invokes the existing regex check input flow, and rails.co defines the refusal message plus an optional flow that exposes the matched rule list for logging. There are no new dependencies, no external services, and no LLM calls in the rail itself.

The README explains what each pattern targets, shows a runnable nemoguardrails chat command, and points to the upstream ATR repository for users who want to load the full ruleset at startup.

This fills a gap in the existing examples folder, which has injection_detection (YARA-based) and regex (used in tests) but no agent-specific threat sample focused on prompt injection, MCP, and skill-style attacks.

Summary by CodeRabbit

New Features
- Added an ATR-inspired threat detection example with configuration for detecting and blocking various attack patterns including instruction overrides, prompt exfiltration, and jailbreak attempts.
Documentation
- Added comprehensive README explaining the example, how to run it, expected behavior, and how to extend it with additional detection patterns.

Adds an example configuration under examples/configs/atr_threat_detection that wires the built-in regex_detection input rail to a small set of ATR-inspired patterns covering instruction override, system prompt exfiltration, role-play jailbreak, base64-wrapped payload hints, MCP tool override markers, and file:// SSRF references. The full open detection set lives at https://github.com/Agent-Threat-Rule/agent-threat-rules under Apache-2.0. The example uses only built-in components (no new dependencies, no LLM calls in the rail itself) and includes a README with a runnable nemoguardrails chat command.

github-actions · 2026-05-09T08:24:44Z

Documentation preview

https://nvidia-nemo.github.io/Guardrails/review/pr-1869

coderabbitai · 2026-05-09T08:25:13Z

📝 Walkthrough

Walkthrough

This pull request introduces a complete ATR-inspired threat detection example for NeMo Guardrails, including configuration files, detection logic, and documentation. The example demonstrates regex-based detection of six threat categories with automatic refusal responses.

Changes

ATR Threat Detection Example

Layer / File(s)	Summary
Configuration Structure and Threat Patterns `examples/configs/atr_threat_detection/config/config.yml`	Empty models list defined; `regex_detection` rail configured with case-insensitive matching and six curated regex patterns targeting instruction override, prompt exfiltration, role-play jailbreak, base64 payloads, MCP/tool override, and `file://` scheme threats; rail input wired to `regex check input` flow.
Detection Response Flow and Bot `examples/configs/atr_threat_detection/config/rails.co`	Bot `refuse to respond` added to return a fixed refusal message; optional `atr report match` flow defined to asynchronously run `DetectRegexMatchAction` against input, capturing detections and aborting with refusal when match is detected, otherwise continuing.
Example Documentation `examples/configs/atr_threat_detection/README.md`	Documentation describes covered threat categories, command to run example (`nemoguardrails chat --config=examples/configs/atr_threat_detection/config`), expected refusal behavior on triggering input, and extension approach via live ATR YAML ruleset ingestion into `regex_detection.input` patterns.

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 6

✅ Passed checks (6 passed)

Check name	Status	Explanation
Title check	✅ Passed	The pull request title 'docs(library): add ATR-inspired threat detection example' directly and clearly summarizes the main change—adding an ATR-inspired threat detection example—which aligns with all file additions in the changeset.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Test Results For Major Changes	✅ Passed	PR adds example configuration and documentation only. These are minor changes. Test results not required for documentation and example additions.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (1)

examples/configs/atr_threat_detection/README.md (1)

37-41: ⚡ Quick win

Document how to enable the optional atr report match flow.

At Line 37 onward, the extension section would be stronger if it explicitly shows adding atr report match under rails.input.flows, so users can actually surface matched detections as described in rails.co.

✍️ Suggested doc patch

 ## Extending
 
 To run against the live ATR YAML ruleset, parse the rule files at startup
 and append the `detection.regex_patterns` field of each rule to the
 `patterns` list under `regex_detection.input`.
+
+To also expose matched detections, add the optional flow from `rails.co`
+to your input flows in `config/config.yml`:
+
+```yaml
+rails:
+  input:
+    flows:
+      - atr report match
+      - regex check input
+```

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/configs/atr_threat_detection/README.md` around lines 37 - 41, Update
the "Extending" docs to show how to enable the optional "atr report match" flow
by adding an example rails.input.flows YAML stanza that includes "atr report
match" (and other flows like "regex check input"); also explicitly state to
parse the live ATR YAML ruleset at startup and append each rule's
detection.regex_patterns into the patterns list under regex_detection.input so
matched detections surface via the rails flow (refer to rails.input.flows, atr
report match, regex_detection.input, detection.regex_patterns, and patterns when
making the change).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@examples/configs/atr_threat_detection/README.md`:
- Around line 37-41: Update the "Extending" docs to show how to enable the
optional "atr report match" flow by adding an example rails.input.flows YAML
stanza that includes "atr report match" (and other flows like "regex check
input"); also explicitly state to parse the live ATR YAML ruleset at startup and
append each rule's detection.regex_patterns into the patterns list under
regex_detection.input so matched detections surface via the rails flow (refer to
rails.input.flows, atr report match, regex_detection.input,
detection.regex_patterns, and patterns when making the change).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 7bdd8e5b-4e1d-4b12-aa70-a31915585737

📥 Commits

Reviewing files that changed from the base of the PR and between a90ef1b and 60fa176.

📒 Files selected for processing (3)

examples/configs/atr_threat_detection/README.md
examples/configs/atr_threat_detection/config/config.yml
examples/configs/atr_threat_detection/config/rails.co

greptile-apps · 2026-05-09T08:25:50Z

Greptile Summary

This PR adds a new atr_threat_detection example under examples/configs/ that wires the built-in regex_detection input rail to six ATR-inspired patterns covering instruction override, prompt exfiltration, role-play jailbreak, base64 payload hints, MCP tool override markers, and file:// SSRF references. The example now configures a real main model (gpt-4o-mini) so nemoguardrails chat runs end-to-end, and the previously present rails.co file has been removed in favour of relying entirely on the library's built-in regex check input subflow.

config/config.yml: Correct YAML structure with case_insensitive: true and six regex patterns; wires regex check input as the sole input rail.
README.md: Explains the attack categories, how to run the demo, and includes an "Extending" section with a custom flow snippet that correctly follows the if $config.enable_rails_exceptions pattern from guardrails_only.

Confidence Score: 5/5

Safe to merge — adds a documentation-only example with no code changes to the core runtime.

The change is confined to two new example files with no impact on the runtime, library, or tests. The YAML config structure matches existing test fixtures. The README extending snippet has minor style nits but no logic that would break a user's implementation.

No files require special attention; both files are example/documentation only.

Important Files Changed

Filename	Overview
examples/configs/atr_threat_detection/README.md	Documentation for the ATR threat detection example; extending snippet uses `define flow` instead of the library-standard `define subflow`, and the audit-logging guidance slightly mischaracterises what `$result["detections"]` returns.
examples/configs/atr_threat_detection/config/config.yml	Correct YAML structure matching existing test fixtures; configures `gpt-4o-mini` as the main model, applies `case_insensitive: true`, and wires `regex check input` via `rails.input.flows`.

Sequence Diagram

sequenceDiagram
    participant User
    participant NeMoGuardrails
    participant RegexCheckInput as regex check input (subflow)
    participant DetectRegexPattern as detect_regex_pattern (action)
    participant LLM as Main LLM (gpt-4o-mini)

    User->>NeMoGuardrails: user message
    NeMoGuardrails->>RegexCheckInput: invoke input rail
    RegexCheckInput->>DetectRegexPattern: "execute(source=input, text=$user_message)"
    DetectRegexPattern-->>RegexCheckInput: "{is_match, text, detections}"
    alt "is_match == true"
        RegexCheckInput-->>NeMoGuardrails: bot refuse to respond + stop
        NeMoGuardrails-->>User: I'm sorry, I can't respond to that.
    else "is_match == false"
        RegexCheckInput-->>NeMoGuardrails: no match, continue
        NeMoGuardrails->>LLM: forward benign message
        LLM-->>NeMoGuardrails: response
        NeMoGuardrails-->>User: LLM response
    end

Prompt To Fix All With AI

Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
examples/configs/atr_threat_detection/README.md:63
Every input rail in the library and guardrails_only examples is defined with `define subflow` (`regex check input`, `dummy input rail`, etc.). Using `define flow` in the extending snippet is inconsistent with this pattern. In Colang 1.0, a bare `define flow` participates in context-based routing, which could interfere with how the engine selects flows. Using `define subflow` makes the definition an explicit subroutine that is only ever invoked directly by the rails system, matching the expected semantics.

```suggestion
define subflow atr report match
```

### Issue 2 of 2
examples/configs/atr_threat_detection/README.md:80-84
`$result["detections"]` returns the matched **regex pattern strings** (e.g., `"\\b(ignore|disregard|forget)\\s+..."`) not ATR rule IDs. A user implementing audit logging who reads "capture the matched rule list" will likely expect short identifiers like `["ATR-PI-001"]` but will instead receive the full raw pattern. The extending note should clarify that `detections` contains the pattern strings so readers know what format to expect when forwarding to a log sink or exception message.

_{Reviews (5): Last reviewed commit: "docs(example): fix Extending snippet to ..." | Re-trigger Greptile}

greptile-apps · 2026-05-09T08:25:53Z

+define flow atr report match
+  """Optional flow: log the matched ATR rule(s) when the input rail fires."""
+  $result = await DetectRegexMatchAction(source="input", text=$user_message)
+  if $result["is_match"]
+    $matched_rules = $result["detections"]
+    bot refuse to respond
+    abort


atr report match flow is never wired as an input rail

The flow is defined but never referenced in config.yml under rails.input.flows, so it will never be executed. The file comment and PR description describe it as exposing matched rules for logging, but there is no path that invokes it. Any user who copies this example will silently get no logging, with no indication the flow is dormant. Additionally, if a developer does add it to the flows list alongside regex check input, DetectRegexMatchAction will run twice on the same message, which is redundant and could cause a double-abort.

Prompt To Fix With AI

This is a comment left during a code review. Path: examples/configs/atr_threat_detection/config/rails.co Line: 11-17 Comment: **`atr report match` flow is never wired as an input rail** The flow is defined but never referenced in `config.yml` under `rails.input.flows`, so it will never be executed. The file comment and PR description describe it as exposing matched rules for logging, but there is no path that invokes it. Any user who copies this example will silently get no logging, with no indication the flow is dormant. Additionally, if a developer does add it to the flows list alongside `regex check input`, `DetectRegexMatchAction` will run twice on the same message, which is redundant and could cause a double-abort. How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-05-09T08:25:54Z

+define flow atr report match
+  """Optional flow: log the matched ATR rule(s) when the input rail fires."""
+  $result = await DetectRegexMatchAction(source="input", text=$user_message)
+  if $result["is_match"]


$matched_rules is assigned but never used

$matched_rules = $result["detections"] stores the list of matched patterns in a local variable, but neither a log statement nor any other reference to that variable follows. The docstring says the flow "log[s] the matched ATR rule(s)" and the PR description says it "exposes the matched rule list for logging," but the actual logging is missing, making the promise in the comment inaccurate.

Prompt To Fix With AI

This is a comment left during a code review. Path: examples/configs/atr_threat_detection/config/rails.co Line: 14 Comment: **`$matched_rules` is assigned but never used** `$matched_rules = $result["detections"]` stores the list of matched patterns in a local variable, but neither a log statement nor any other reference to that variable follows. The docstring says the flow "log[s] the matched ATR rule(s)" and the PR description says it "exposes the matched rule list for logging," but the actual logging is missing, making the promise in the comment inaccurate. How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-05-09T08:25:55Z

+models: []
+


models: [] causes failure for non-matching inputs in interactive chat

The README guides users to run nemoguardrails chat interactively. If a user types any message that does not match a threat pattern, the runtime will attempt to invoke the main LLM to generate a response and fail because no model is configured. The chat example only works end-to-end for inputs that trigger the rail. Either adding a brief note to the README or providing a minimal stub model entry would prevent user confusion.

Prompt To Fix With AI

This is a comment left during a code review. Path: examples/configs/atr_threat_detection/config/config.yml Line: 1-2 Comment: **`models: []` causes failure for non-matching inputs in interactive chat** The README guides users to run `nemoguardrails chat` interactively. If a user types any message that does *not* match a threat pattern, the runtime will attempt to invoke the main LLM to generate a response and fail because no model is configured. The chat example only works end-to-end for inputs that trigger the rail. Either adding a brief note to the README or providing a minimal stub model entry would prevent user confusion. How can I resolve this? If you propose a fix, please make it concise.

codecov · 2026-05-09T08:30:58Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

…itpick Adds a YAML stanza showing how to enable the optional 'atr report match' flow that already ships in rails.co, so users can surface matched rule identifiers instead of only the generic refusal. Order note clarifies why 'atr report match' must come before 'regex check input'.

greptile-apps · 2026-05-10T11:35:40Z

+define bot refuse to respond
+  "I'm sorry, your request matched a threat detection rule and was blocked."


bot refuse to respond redefinition creates non-deterministic output

The library file nemoguardrails/library/regex/flows.v1.co already defines bot refuse to respond as "I'm sorry, I can't respond to that.". Because the default Colang version is 1.0 (no colang_version key in config.yml) and both files are loaded at runtime, Colang 1.0 treats both strings as alternatives for the same utterance. With models: [] there is no LLM to arbitrate, so the runtime will fall back to one of the two strings non-deterministically — likely the library's generic message rather than the threat-specific one defined here. A developer following this example will not reliably see the custom refusal message.

Prompt To Fix With AI

This is a comment left during a code review. Path: examples/configs/atr_threat_detection/config/rails.co Line: 8-9 Comment: **`bot refuse to respond` redefinition creates non-deterministic output** The library file `nemoguardrails/library/regex/flows.v1.co` already defines `bot refuse to respond` as `"I'm sorry, I can't respond to that."`. Because the default Colang version is 1.0 (no `colang_version` key in `config.yml`) and both files are loaded at runtime, Colang 1.0 treats both strings as *alternatives* for the same utterance. With `models: []` there is no LLM to arbitrate, so the runtime will fall back to one of the two strings non-deterministically — likely the library's generic message rather than the threat-specific one defined here. A developer following this example will not reliably see the custom refusal message. How can I resolve this? If you propose a fix, please make it concise.

Address Greptile/coderabbit P1 + P2 findings from @NVIDIA-NeMo bot reviewers on this PR: 1. P1: `bot refuse to respond` redefinition in rails.co collided with the library default in nemoguardrails/library/regex/flows.v1.co. Under Colang 1.0 this made the refusal utterance non-deterministic with `models: []` and no LLM to arbitrate. Fix: delete rails.co entirely. The example now uses the library default refusal message ("I'm sorry, I can't respond to that."). 2. P2: `models: []` caused a runtime error in `nemoguardrails chat` for any benign user message (the runtime needs a main model when the input rail does not abort). Fix: add a main model stub (openai/gpt-4o-mini) so chat runs end-to-end. The input rail still blocks threats before the model is invoked, so the model only sees benign inputs. 3. P2: `atr report match` flow was defined in rails.co but never wired under rails.input.flows -- it was dormant code. Fix: removed (rails.co deleted). The README's Extending section now shows the custom-flow pattern with a non-conflicting bot utterance (`bot refuse atr_threat`) and a `AtrRuleMatchedRailException` event for downstream observers, so the documented pattern is correct. 4. P2: `$matched_rules = $result["detections"]` assigned but never referenced -- comment promised "log the matched ATR rule(s)" but no logging followed. Fix: removed (the dormant flow no longer exists). The Extending section's custom-flow example uses `$matched_rules` only to gate the event emission, and emits an `AtrRuleMatchedRailException` so downstream code can subscribe to it. 5. Documentation correction: README and config.yml both cited ATR as Apache-2.0 -- the actual license is MIT. Corrected both references. Net diff: - config.yml: add main model stub, fix license comment. - rails.co: removed (used library default refusal). - README.md: fix license, update behavior description, replace stale "atr report match" wiring instructions with a correct custom-flow example in the Extending section. Tests pass locally (no test files in this PR; existing pr-tests-matrix green on 3.10-3.13). Signed-off-by: Adam Lin <adam@agentthreatrule.org>

eeee2345 · 2026-05-17T22:57:37Z

@coderabbitai @greptile-apps — pushed 813f41c addressing the four findings on this PR:

P1 — bot refuse to respond redefinition vs library default in nemoguardrails/library/regex/flows.v1.co. The PR now deletes rails.co entirely and uses the library-default refusal message. No more redefinition, no Colang 1.0 alternative-utterance ambiguity, no dependency on a model for arbitration.

P2 — models: [] causing a runtime error on benign inputs during interactive chat. Added a main model stub (openai/gpt-4o-mini) with a comment that the user can replace with their preferred provider. Threats are blocked by the input rail before the model is invoked.

P2 — atr report match flow defined but never wired under rails.input.flows. Dormant code is now removed. The README's Extending section shows the correct pattern (custom flow with a non-conflicting bot refuse atr_threat utterance plus an AtrRuleMatchedRailException event) so anyone who wants the rule-id surface can wire it correctly.

P2 — $matched_rules assigned but never used. The dormant flow is removed, so the variable is no longer orphaned. The Extending-section custom flow uses it to gate the create event so downstream observers (audit logging, metrics) receive the exception.

Documentation correction — README and config.yml both cited ATR as Apache-2.0. The actual license is MIT (LICENSE file + package.json). Fixed both references.

Net diff: 3 files changed, +41 / -36 (rails.co deleted). All pr-tests-matrix checks should remain green; no test code in this PR.

coderabbitai · 2026-05-17T22:58:07Z

Tip

For best results, initiate chat on the files or code changes.

@eeee2345 — excellent cleanup in commit 813f41c. All four findings are addressed cleanly:

P1 — bot refuse to respond redefinition ✅
Deleting rails.co entirely is the right call. The library default in nemoguardrails/library/regex/flows.v1.co takes over with no ambiguity and no model arbitration needed.

P2 — models: [] runtime error ✅
Adding the openai/gpt-4o-mini stub with a replacement comment is the correct minimal fix — the input rail blocks threats before the model is ever reached, and the comment makes the intent clear for adopters.

P2 — atr report match dormant flow ✅
Removing the unwired flow and relocating the pattern to the README's Extending section (with the non-conflicting bot refuse atr_threat utterance + AtrRuleMatchedRailException event) is a much cleaner separation between the out-of-the-box example and the opt-in advanced case.

P2 — $matched_rules orphan ✅
Resolved as a natural consequence of removing the dormant flow.

License correction ✅
The ATR repository at Agent-Threat-Rule/agent-threat-rules is indeed MIT licensed, so fixing both README.md and config.yml from Apache-2.0 → MIT is correct.

The net diff of +41/−36 with rails.co deleted looks well-scoped. No further concerns from my side.

🐇 ✨

greptile-apps · 2026-05-17T23:00:05Z

+
+This example shows how to use the built-in `regex_detection` input rail
+with a small set of patterns inspired by Agent Threat Rules, an open
+detection standard for AI agent threats published under the MIT license:


The README states the ATR project is published under the MIT license, but the PR description says it is Apache-2.0. A user relying on this file to assess license compatibility for their project will get incorrect information.

Suggested change

detection standard for AI agent threats published under the MIT license:

detection standard for AI agent threats published under the Apache-2.0 license:

Prompt To Fix With AI

This is a comment left during a code review. Path: examples/configs/atr_threat_detection/README.md Line: 5 Comment: The README states the ATR project is published under the MIT license, but the PR description says it is Apache-2.0. A user relying on this file to assess license compatibility for their project will get incorrect information. ```suggestion detection standard for AI agent threats published under the Apache-2.0 license: ``` How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-05-17T23:02:52Z

+```colang
+define bot refuse atr_threat
+  "I'm sorry, that request was blocked by an ATR input safety rule."
+
+define flow atr report match
+  $result = execute detect_regex_pattern(source="input", text=$user_message)
+  if $result["is_match"]
+    $matched_rules = $result["detections"]
+    create event AtrRuleMatchedRailException(message="ATR input rail blocked")
+    bot refuse atr_threat
+    stop
+```


Extending snippet silently drops the custom refusal message

When both create event AtrRuleMatchedRailException and bot refuse atr_threat fire in the same flow, the Colang 1.0 runtime's event loop (see llmrails.py lines 946–954) appends the bot script to responses but also sets exception = event. The subsequent branch at line 986 (if exception: new_message = {"role": "exception", ...}) short-circuits and never uses responses, so the caller receives an exception object instead of the custom refusal text.

The established pattern (e.g. guardrails_only/input/config.co) guards on $config.enable_rails_exceptions and uses an if/else to emit either the exception event or the bot utterance — not both unconditionally.

Prompt To Fix With AI

This is a comment left during a code review. Path: examples/configs/atr_threat_detection/README.md Line: 53-64 Comment: **Extending snippet silently drops the custom refusal message** When both `create event AtrRuleMatchedRailException` and `bot refuse atr_threat` fire in the same flow, the Colang 1.0 runtime's event loop (see `llmrails.py` lines 946–954) appends the bot script to `responses` but also sets `exception = event`. The subsequent branch at line 986 (`if exception: new_message = {"role": "exception", ...}`) short-circuits and never uses `responses`, so the caller receives an exception object instead of the custom refusal text. The established pattern (e.g. `guardrails_only/input/config.co`) guards on `$config.enable_rails_exceptions` and uses an `if/else` to emit **either** the exception event **or** the bot utterance — not both unconditionally. How can I resolve this? If you propose a fix, please make it concise.

Address greptile P1 follow-up on the README's Extending section: In Colang 1.0, the rails event loop appends bot utterances to `responses` but the subsequent branch that handles `exception = event` short-circuits and never emits `responses`, so combining `create event ...RailException` and `bot refuse atr_threat` in the same flow silently drops the refusal. The canonical pattern (e.g. examples/configs/guardrails_only/input/ config.co's `dummy input rail`) gates on `$config.enable_rails_exceptions` and uses an `if/else` to emit **either** the exception event **or** the bot utterance. Updated the README Extending snippet to follow that pattern, with an explicit note about the dropped-utterance behavior so future readers do not repeat the mistake. Also added a short paragraph explaining how to capture `$matched_rules` for downstream audit without breaking the canonical single-action branches. PR description: corrected from "Apache-2.0" to "MIT license" so it matches the README (ATR is MIT-licensed per LICENSE and package.json). Signed-off-by: Adam Lin <adam@agentthreatrule.org>

eeee2345 · 2026-05-17T23:18:42Z

@greptile-apps — pushed 29beb3b addressing both follow-up P1 findings:

P1 #1 (license mismatch): The README is correct — ATR is MIT-licensed (LICENSE + package.json in https://github.com/Agent-Threat-Rule/agent-threat-rules confirm). The PR description was the stale half. I updated the PR description from "Apache-2.0" to "MIT license" so the two now match.

P1 #2 (Extending snippet dropping the bot refuse): Good catch on the Colang 1.0 event-loop behavior — combining create event ...RailException and bot refuse in the same flow silently drops the bot utterance from responses because the exception branch short-circuits. The Extending snippet now follows the canonical if $config.enable_rails_exceptions pattern from examples/configs/guardrails_only/input/config.co so the flow emits one path or the other, not both:

define flow atr report match
  $result = execute detect_regex_pattern(source="input", text=$user_message)
  if $result["is_match"]
    if $config.enable_rails_exceptions
      create event AtrRuleMatchedRailException(message="ATR input rail blocked")
    else
      bot refuse atr_threat
    stop

Added an explicit note about the dropped-utterance behavior so future readers do not repeat the mistake, plus a short paragraph on how to capture $matched_rules for downstream audit without breaking the canonical single-action branches.

greptile-apps · 2026-05-17T23:19:17Z

Both fixes are solid. The Colang snippet now correctly mirrors the dummy input rail canonical pattern from guardrails_only/input/config.co — if/else branches with a single stop, no mixed create event + bot refuse in the same path. The explanatory note about why you can't combine them is a good addition for future readers.

One minor nit on the snippet: the define flow keyword should be define subflow to match the canonical pattern (and how the library's own input rails are structured). define flow works but is the top-level variant; define subflow is the idiomatic choice for a reusable rail check that gets called from the input rail entry point.

define subflow atr report match

That said, the logic and the structural fix for the P1 findings are correct. The license alignment (MIT) and the event-loop behavior are both handled properly. This is good to merge once that keyword is addressed (or if you intentionally prefer flow here for a specific wiring reason, a brief comment would clarify).

Tip: You can customize Greptile's behavior for this repo with .greptile/rules.md and .greptile/config.json.

coderabbitai Bot reviewed May 9, 2026

View reviewed changes

greptile-apps Bot reviewed May 9, 2026

View reviewed changes

greptile-apps Bot reviewed May 10, 2026

View reviewed changes

greptile-apps Bot reviewed May 17, 2026

View reviewed changes

		define bot refuse to respond
		"I'm sorry, your request matched a threat detection rule and was blocked."

	detection standard for AI agent threats published under the MIT license:
	detection standard for AI agent threats published under the Apache-2.0 license:

		models: []

Conversation

eeee2345 commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

github-actions Bot commented May 9, 2026

Documentation preview

Uh oh!

coderabbitai Bot commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

codecov Bot commented May 9, 2026

Codecov Report

Uh oh!

greptile-apps Bot May 10, 2026

Choose a reason for hiding this comment

Uh oh!

eeee2345 commented May 17, 2026

Uh oh!

coderabbitai Bot commented May 17, 2026

Uh oh!

greptile-apps Bot May 17, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 17, 2026

Choose a reason for hiding this comment

Uh oh!

eeee2345 commented May 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

greptile-apps Bot commented May 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

eeee2345 commented May 9, 2026 •

edited

Loading

coderabbitai Bot commented May 9, 2026 •

edited

Loading

greptile-apps Bot commented May 9, 2026 •

edited

Loading

eeee2345 commented May 17, 2026 •

edited

Loading