fix(sdk): abort run loop on repeated Stop-hook denials without progress by VascoSch92 · Pull Request #3183 · OpenHands/software-agent-sdk

VascoSch92 · 2026-05-09T20:50:08Z

A human has tested these changes.

Why

https://openhands-ai.slack.com/archives/C06R25BT5B2/p1778340691700839

Summary

Adds max_consecutive_stop_denials (default 5) to LocalConversation. When a Stop hook denies repeatedly with no ActionEvent between denials, the run loop now aborts with a ConversationErrorEvent(code="StopHookLoopDetected") and
ERROR status instead of looping until max_iteration_per_run (default 500) is exhausted.
The streak counter resets whenever the agent emits an ActionEvent between two denials, so legitimate iterative work keeps running.

Surfaced from a trajectory where a misconfigured .openhands/hooks/on_stop.sh (running npm run lint against an env missing dev deps) returned decision: deny on every Stop event. The agent recognized the hook was broken and emitted
text-only "I'm done" turns, but each one re-triggered the Stop hook — 56 cycles before the user manually paused, and nothing tool-actionable for the agent to do. Without this guard the loop would have run to the 500-iteration cap.

Type

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.13-nodejs22-slim`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:7f8ae29-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-7f8ae29-python \
  ghcr.io/openhands/agent-server:7f8ae29-python

All tags pushed for this build

ghcr.io/openhands/agent-server:7f8ae29-golang-amd64
ghcr.io/openhands/agent-server:7f8ae299c32960a1371731c11ace0bb9720d7c29-golang-amd64
ghcr.io/openhands/agent-server:vasco-fix-hooks-golang-amd64
ghcr.io/openhands/agent-server:7f8ae29-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:7f8ae29-golang-arm64
ghcr.io/openhands/agent-server:7f8ae299c32960a1371731c11ace0bb9720d7c29-golang-arm64
ghcr.io/openhands/agent-server:vasco-fix-hooks-golang-arm64
ghcr.io/openhands/agent-server:7f8ae29-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:7f8ae29-java-amd64
ghcr.io/openhands/agent-server:7f8ae299c32960a1371731c11ace0bb9720d7c29-java-amd64
ghcr.io/openhands/agent-server:vasco-fix-hooks-java-amd64
ghcr.io/openhands/agent-server:7f8ae29-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:7f8ae29-java-arm64
ghcr.io/openhands/agent-server:7f8ae299c32960a1371731c11ace0bb9720d7c29-java-arm64
ghcr.io/openhands/agent-server:vasco-fix-hooks-java-arm64
ghcr.io/openhands/agent-server:7f8ae29-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:7f8ae29-python-amd64
ghcr.io/openhands/agent-server:7f8ae299c32960a1371731c11ace0bb9720d7c29-python-amd64
ghcr.io/openhands/agent-server:vasco-fix-hooks-python-amd64
ghcr.io/openhands/agent-server:7f8ae29-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:7f8ae29-python-arm64
ghcr.io/openhands/agent-server:7f8ae299c32960a1371731c11ace0bb9720d7c29-python-arm64
ghcr.io/openhands/agent-server:vasco-fix-hooks-python-arm64
ghcr.io/openhands/agent-server:7f8ae29-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:7f8ae29-golang
ghcr.io/openhands/agent-server:7f8ae299c32960a1371731c11ace0bb9720d7c29-golang
ghcr.io/openhands/agent-server:vasco-fix-hooks-golang
ghcr.io/openhands/agent-server:7f8ae29-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:7f8ae29-java
ghcr.io/openhands/agent-server:7f8ae299c32960a1371731c11ace0bb9720d7c29-java
ghcr.io/openhands/agent-server:vasco-fix-hooks-java
ghcr.io/openhands/agent-server:7f8ae29-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:7f8ae29-python
ghcr.io/openhands/agent-server:7f8ae299c32960a1371731c11ace0bb9720d7c29-python
ghcr.io/openhands/agent-server:vasco-fix-hooks-python
ghcr.io/openhands/agent-server:7f8ae29-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

Each variant tag (e.g., 7f8ae29-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 7f8ae29-python-amd64) are also available if needed

github-actions · 2026-05-09T20:50:35Z

Python API breakage checks — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-05-09T20:50:47Z

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-05-09T20:52:38Z

Coverage Report •

File	Stmts	Miss	Cover	Missing
openhands-sdk/openhands/sdk/conversation/impl
local_conversation.py	463	25	94%	303, 308, 440, 486, 523, 539, 604, 860–861, 864, 1024, 1032, 1034, 1038–1039, 1050, 1052–1054, 1079, 1274, 1278, 1348, 1355–1356
TOTAL	25993	7431	71%

all-hands-bot

Taste Rating: 🟡 Acceptable - Pragmatic fix for a real production issue. The critical off-by-one bugs from previous reviews have been addressed.

[EVALUATION REQUIRED]

This PR changes agent loop termination logic, which falls into the eval-risk category per repository guidelines. Before approval, this needs:

Eval evidence: Run a lightweight benchmark (e.g., subset of SWE-bench) to verify the change doesn't regress agent performance
Human review: A maintainer should confirm eval results from the monitor

The change is defensive and only triggers in misconfigured scenarios (broken Stop hooks), so normal operation should be unaffected. However, any loop logic change warrants verification.

[POSITIVE NOTES]

✅ Critical fixes verified:

Off-by-one error fixed: now uses >= instead of >
Test assertion corrected: expects max_denials not max_denials + 1
Behavior now matches documentation

✅ Code quality:

Concurrency properly handled (state access within with self._state: lock)
Clear error messages with context (count, limit, feedback)
Comprehensive test coverage (abort case + streak reset case)
Clean integration with existing hook system

[RISK ASSESSMENT]

⚠️ Risk Level: 🟡 MEDIUM

Rationale:

➕ Fixes real production issue (infinite loop on misconfigured hooks)
➕ Only affects edge case (repeated hook denials)
➕ Improves user experience (clear error vs. hitting iteration cap)
⚠️ Changes loop termination logic (requires eval validation)
⚠️ New code path in critical run() method

Recommendation: Request lightweight eval run before merging. The code itself is solid, but eval validation is standard practice for loop logic changes per repo policy.

VERDICT:

⏸️ Needs eval validation before approval - Core logic is sound and previous critical issues are resolved. Once eval results confirm no performance regression, this is ready to merge.

all-hands-bot

Taste Rating: 🟡 Acceptable - Pragmatic fix for a real production issue. Previous critical bugs have been addressed.

Previous Critical Issues - RESOLVED ✅
The off-by-one error and test assertion issues from previous reviews have been fixed correctly.

[EVALUATION REQUIRED]

This PR changes agent loop termination logic, which falls into the eval-risk category per repository guidelines. Before approval, this needs:

Eval evidence: Run a lightweight benchmark (e.g., subset of SWE-bench) to verify the change doesn't regress agent performance
Human review: A maintainer should confirm eval results showing no unexpected impact on agent decision-making or task completion rates

The QA report validates the feature works as designed, but eval validation is required for changes affecting agent loop behavior.

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM

Changes agent execution loop termination conditions. While this addresses a real production issue (Stop hook denial loops), any modification to loop control logic requires eval validation to ensure no unintended impact on benchmark performance. The streak reset logic is sound and preserves legitimate iterative work.

VERDICT:
⏸️ Needs eval validation: Code quality is good, logic is correct, but eval evidence is required per repository policy before approval.

KEY INSIGHT:
The fix properly prevents infinite loops while preserving agent's ability to iterate on work, but needs benchmark validation to confirm no regression in agent effectiveness.

all-hands-bot

✅ QA Report: PASS

This PR successfully prevents infinite loops when a stop hook denies repeatedly without progress, while preserving legitimate iterative workflows.

Does this PR achieve its stated goal?

Yes. The PR set out to fix an infinite loop caused by misconfigured stop hooks that deny repeatedly without the agent making progress. I verified that:

The agent now aborts after exactly 5 consecutive stop hook denials (configurable via max_consecutive_stop_denials)
The loop emits a ConversationErrorEvent with code StopHookLoopDetected and sets status to ERROR
The streak counter correctly resets when the agent emits ActionEvents between denials, allowing legitimate iterative work to continue
The parameter is configurable and properly propagated to forked conversations

Phase	Result
Environment Setup	✅ Dependencies installed, SDK v1.21.1 ready
CI Status	⚠️ Most checks passing (sdk-tests, agent-server-tests, workspace-tests, tools-tests, pre-commit); 1 unrelated check failing (unresolved-review-threads); 1 pending (python-amd64 build)
Functional Verification	✅ All 4 scenarios verified end-to-end

Functional Verification

Test 1: Always-denying stop hook aborts after 5 denials

Setup: Created a stop hook script that always returns {"decision": "deny", "reason": "always denying for testing"} and configured a LocalConversation with max_consecutive_stop_denials=5.

Step 1 — Baseline (without fix):
Without this PR, such a misconfigured hook would loop 500 times (the max_iteration_per_run limit) before stopping, as described in the PR motivation from the Slack thread.

Step 2 — Apply PR changes:
Checked out branch vasco/fix-hooks (commit a9e46642).

Step 3 — Run with fix in place:
Ran verification script:

# Created a hook that always denies
# Set max_consecutive_stop_denials=5
conversation.send_message("Hello")
conversation.run()

Output:

Final execution status: ConversationExecutionStatus.ERROR
Number of error events: 1
Error code: StopHookLoopDetected
Error detail: Stop hook denied 5 times without progress (limit 5). Last feedback: always denying for testing
Number of stop hook denials: 5

✅ PASSED: Execution status is ERROR
✅ PASSED: Error event has correct code 'StopHookLoopDetected'
✅ PASSED: Exactly 5 stop hook denials occurred before abort

Interpretation: The agent correctly aborts after exactly 5 consecutive denials, emits the expected error event, and transitions to ERROR status instead of looping indefinitely. This confirms the fix works.

Test 2: Streak resets on progress (ActionEvent emission)

Setup: Created a hook that denies 4 times then allows, with an agent that emits an ActionEvent on the 3rd step (between denials 2 and 3). Set max_consecutive_stop_denials=3 to ensure the streak reset is required for the test to pass.

Expected behavior: Without the streak reset, the agent would abort after 3 consecutive denials. With the reset, the ActionEvent on step 3 should reset the counter, allowing all 4 denials to occur before the hook finally allows the stop.

Output:

Final execution status: ConversationExecutionStatus.FINISHED
Total steps: 5
Number of error events: 0
Number of stop hook denials: 4
Number of action events (progress): 1

✅ PASSED: Did not abort (streak reset worked)
✅ PASSED: No error events (no abort)
✅ PASSED: Agent made progress (1 action event(s))
✅ PASSED: Conversation completed successfully

Interpretation: The agent did NOT abort despite encountering 4 denials with a limit of 3, because the ActionEvent emitted on step 3 reset the consecutive denial counter. This confirms that legitimate iterative work (where the agent makes progress between denials) continues uninterrupted.

Test 3: Configurable limit parameter

Setup: Created a conversation with max_consecutive_stop_denials=2 (custom value) and an always-denying hook.

Output:

Custom limit set to: 2
Number of denials before abort: 2
✅ PASSED: Custom limit (2) was respected

Interpretation: The configurable parameter works correctly — the agent aborted after exactly 2 denials instead of the default 5.

Test 4: Parameter propagated to forked conversations

Setup: Created a conversation with max_consecutive_stop_denials=7, then forked it.

Output:

Original conversation max_consecutive_stop_denials: 7
Forked conversation max_consecutive_stop_denials: 7
✅ PASSED: Parameter propagated to forked conversation

Interpretation: The parameter is correctly propagated when forking conversations (line 383 in the diff confirms this).

Issues Found

None.

fix(sdk): abort run loop on repeated Stop-hook denials without progress

9a8c186

VascoSch92 requested a review from all-hands-bot May 9, 2026 20:50

This comment was marked as outdated.

Sign in to view

refactor(test): imports at the top

3aa9a94

This comment was marked as outdated.

Sign in to view

VascoSch92 and others added 2 commits May 9, 2026 22:59

fix(sdk): correct off-by-one and preserve param order in stop-hook guard

4ea4e85

Merge branch 'main' into vasco/fix-hooks

a9e4664

VascoSch92 requested a review from all-hands-bot May 9, 2026 21:00

all-hands-bot reviewed May 9, 2026

View reviewed changes

VascoSch92 marked this pull request as ready for review May 9, 2026 21:04

all-hands-bot reviewed May 9, 2026

View reviewed changes

Merge branch 'main' into vasco/fix-hooks

7f8ae29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(sdk): abort run loop on repeated Stop-hook denials without progress#3183

fix(sdk): abort run loop on repeated Stop-hook denials without progress#3183
VascoSch92 wants to merge 5 commits intomainfrom
vasco/fix-hooks

VascoSch92 commented May 9, 2026 •

edited by github-actions Bot

Loading

Uh oh!

github-actions Bot commented May 9, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 9, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 9, 2026 •

edited

Loading

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

all-hands-bot left a comment

Uh oh!

all-hands-bot left a comment

Uh oh!

all-hands-bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

VascoSch92 commented May 9, 2026 • edited by github-actions Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why

Summary

Type

Uh oh!

github-actions Bot commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Python API breakage checks — ✅ PASSED

Uh oh!

github-actions Bot commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

REST API breakage checks (OpenAPI) — ✅ PASSED

Uh oh!

github-actions Bot commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

✅ QA Report: PASS

Does this PR achieve its stated goal?

Test 1: Always-denying stop hook aborts after 5 denials

Test 2: Streak resets on progress (ActionEvent emission)

Test 3: Configurable limit parameter

Test 4: Parameter propagated to forked conversations

Issues Found

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

VascoSch92 commented May 9, 2026 •

edited by github-actions Bot

Loading

github-actions Bot commented May 9, 2026 •

edited

Loading

github-actions Bot commented May 9, 2026 •

edited

Loading

github-actions Bot commented May 9, 2026 •

edited

Loading