Skip to content

perf: batch ConversationState autosaves inside context manager#3165

Open
csmith49 wants to merge 3 commits into
mainfrom
fix/3145-state-autosave-debounce
Open

perf: batch ConversationState autosaves inside context manager#3165
csmith49 wants to merge 3 commits into
mainfrom
fix/3145-state-autosave-debounce

Conversation

@csmith49
Copy link
Copy Markdown
Collaborator

@csmith49 csmith49 commented May 8, 2026

Summary

ConversationState.__setattr__ triggered a full JSON serialization + synchronous file write on every public-field mutation. During a single agent step, multiple fields are updated (execution_status, stats, max_iterations, etc.), producing 2–4 redundant writes per step — each serializing the entire state including large nested objects like agent, stats, and workspace.

Fix

Extend the existing with state: context-manager protocol (currently just a FIFO lock) to also batch saves:

Region Behavior
Inside with state: Mutations set _dirty = True; one save on __exit__
Outside with state: Immediate save per mutation (backward compat)

Implementation:

  • __enter__ increments _save_depth
  • __setattr__ checks _save_depth > 0 → sets _dirty instead of saving
  • __exit__ decrements _save_depth; when it reaches 0 and _dirty, flush one save, then release lock
  • State-change callbacks still fire eagerly on each mutation (WebSocket clients see intermediate updates)
  • Nesting-safe via depth counter

I/O reduction per step

Before After
2–4 model_dump_json() + open()/write() per step 1 per with block
3 saves on conversation resume 1 save on resume

Verification

  • All 550 tests pass (477 conversation + 73 event service)
  • All pre-commit hooks pass (ruff, pyright, etc.)
  • Added test_context_manager_batches_saves — proves 3 mutations → 1 save inside with, and immediate save outside

Fixes #3145

This PR was created by an AI agent (OpenHands) on behalf of @csmith49.


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant Architectures Base Image Docs / Tags
java amd64, arm64 eclipse-temurin:17-jdk Link
python amd64, arm64 nikolaik/python-nodejs:python3.13-nodejs22-slim Link
golang amd64, arm64 golang:1.21-bookworm Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:4a3d894-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-4a3d894-python \
  ghcr.io/openhands/agent-server:4a3d894-python

All tags pushed for this build

ghcr.io/openhands/agent-server:4a3d894-golang-amd64
ghcr.io/openhands/agent-server:4a3d894e36e041a84aa0f3bcf22b954db99dd212-golang-amd64
ghcr.io/openhands/agent-server:fix-3145-state-autosave-debounce-golang-amd64
ghcr.io/openhands/agent-server:4a3d894-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:4a3d894-golang-arm64
ghcr.io/openhands/agent-server:4a3d894e36e041a84aa0f3bcf22b954db99dd212-golang-arm64
ghcr.io/openhands/agent-server:fix-3145-state-autosave-debounce-golang-arm64
ghcr.io/openhands/agent-server:4a3d894-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:4a3d894-java-amd64
ghcr.io/openhands/agent-server:4a3d894e36e041a84aa0f3bcf22b954db99dd212-java-amd64
ghcr.io/openhands/agent-server:fix-3145-state-autosave-debounce-java-amd64
ghcr.io/openhands/agent-server:4a3d894-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:4a3d894-java-arm64
ghcr.io/openhands/agent-server:4a3d894e36e041a84aa0f3bcf22b954db99dd212-java-arm64
ghcr.io/openhands/agent-server:fix-3145-state-autosave-debounce-java-arm64
ghcr.io/openhands/agent-server:4a3d894-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:4a3d894-python-amd64
ghcr.io/openhands/agent-server:4a3d894e36e041a84aa0f3bcf22b954db99dd212-python-amd64
ghcr.io/openhands/agent-server:fix-3145-state-autosave-debounce-python-amd64
ghcr.io/openhands/agent-server:4a3d894-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:4a3d894-python-arm64
ghcr.io/openhands/agent-server:4a3d894e36e041a84aa0f3bcf22b954db99dd212-python-arm64
ghcr.io/openhands/agent-server:fix-3145-state-autosave-debounce-python-arm64
ghcr.io/openhands/agent-server:4a3d894-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:4a3d894-golang
ghcr.io/openhands/agent-server:4a3d894e36e041a84aa0f3bcf22b954db99dd212-golang
ghcr.io/openhands/agent-server:fix-3145-state-autosave-debounce-golang
ghcr.io/openhands/agent-server:4a3d894-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:4a3d894-java
ghcr.io/openhands/agent-server:4a3d894e36e041a84aa0f3bcf22b954db99dd212-java
ghcr.io/openhands/agent-server:fix-3145-state-autosave-debounce-java
ghcr.io/openhands/agent-server:4a3d894-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:4a3d894-python
ghcr.io/openhands/agent-server:4a3d894e36e041a84aa0f3bcf22b954db99dd212-python
ghcr.io/openhands/agent-server:fix-3145-state-autosave-debounce-python
ghcr.io/openhands/agent-server:4a3d894-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

  • Each variant tag (e.g., 4a3d894-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 4a3d894-python-amd64) are also available if needed

ConversationState.__setattr__ triggered a full JSON serialization and
synchronous file write on every public-field mutation.  During a single
agent step multiple fields are updated (execution_status, stats, etc.),
producing 2-4 redundant writes per step.

Add save-depth tracking to the existing context-manager protocol:
  - __enter__ increments _save_depth; __setattr__ sets _dirty instead
    of saving immediately while _save_depth > 0.
  - __exit__ decrements _save_depth; when it reaches 0 and _dirty is
    set, one batched save is flushed before releasing the lock.
  - Outside any context-manager block, saves remain immediate for
    backward compatibility.

State-change callbacks still fire eagerly on each mutation so
WebSocket clients see intermediate updates without waiting for the
block to exit.

Fixes #3145

Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

Python API breakage checks — ✅ PASSED

Result:PASSED

Action log

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result:PASSED

Action log

Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Good taste - Elegant I/O batching solution that eliminates redundant saves (2-4 per step → 1 per block) without breaking backward compatibility. The depth-counter pattern handles nesting correctly, callbacks still fire eagerly, and the lock is properly managed. Clean implementation with solid test coverage. LGTM! ✅

[RISK ASSESSMENT]
⚠️ Risk Assessment: 🟢 LOW

Pure I/O optimization with no agent behavior changes. Maintains backward compatibility (mutations outside context managers still save immediately), preserves eager callback firing, and has comprehensive test coverage proving both the optimization and backward compat work correctly.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

Coverage

Coverage Report •
FileStmtsMissCoverMissing
openhands-sdk/openhands/sdk/conversation
   state.py203597%379, 431–433, 574
TOTAL26379759871% 

Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ QA Report: PASS

The PR successfully reduces I/O operations by batching ConversationState autosaves, achieving 67% reduction in save operations (from 3→1 for typical agent steps).

Does this PR achieve its stated goal?

Yes. The PR delivers exactly what it promises: multiple field mutations inside with state: now produce a single batched save at exit instead of 2-4 individual saves. Backward compatibility is preserved (mutations outside context managers still save immediately), callbacks fire eagerly (WebSocket clients see intermediate updates), and nested contexts work correctly via depth tracking.

Phase Result
Environment Setup make build completed successfully
CI Status ✅ All functional checks passing (sdk-tests, cross-tests, windows-tests, agent-server-tests, tools-tests, workspace-tests, pre-commit) — only Docker builds pending
Functional Verification ✅ Save batching verified with before/after comparison, nested contexts tested, callbacks confirmed eager
Functional Verification

Test 1: Save Batching — Before/After Comparison

Step 1 — Establish baseline (main branch without fix):

Checked out main and ran test with 3 field mutations inside with state: block:

============================================================
TEST 1: Multiple mutations INSIDE 'with state:' block
============================================================
Setting execution_status...
  → Save #1 triggered
Setting max_iterations...
  → Save #2 triggered
Setting stuck_detection...
  → Save #3 triggered
Exiting context manager...

✓ Total saves inside 'with' block: 3

This confirms the problem: 3 separate saves (one per field mutation) inside the context manager on main branch.

Step 2 — Apply the PR's changes:

Checked out branch fix/3145-state-autosave-debounce.

Step 3 — Re-run with the fix in place:

Ran the same test on the PR branch:

============================================================
TEST 1: Multiple mutations INSIDE 'with state:' block
============================================================
Setting execution_status...
Setting max_iterations...
Setting stuck_detection...
Exiting context manager...
  → Save #1 triggered

✓ Total saves inside 'with' block: 1

This confirms the fix works: 1 batched save at context manager exit instead of 3 separate saves. 67% reduction in I/O operations.


Test 2: Backward Compatibility

Verified mutations outside with state: still save immediately:

============================================================
TEST 2: Single mutation OUTSIDE 'with state:' block
============================================================
Setting max_iterations...
  → Save #1 triggered

✓ Total saves outside 'with' block: 1

Result: Immediate save behavior preserved for non-context-manager usage.


Test 3: Nested Context Managers

Tested depth tracking with nested with state: blocks:

============================================================
TEST: Nested context managers
============================================================
Entering outer context...
  Setting execution_status in outer context...
  Entering inner context...
    Setting max_iterations in inner context...
    Setting stuck_detection in inner context...
  Exiting inner context...
  After inner context, saves so far: 0
  Setting max_iterations in outer context...
Exiting outer context...
  → Save #1 triggered

✓ Total saves with nested contexts: 1

Result: Depth counter works correctly — all mutations batched until outermost context exits.


Test 4: Eager Callback Execution

Verified state change callbacks fire immediately (not batched):

============================================================
TEST: Callbacks fire eagerly (not batched)
============================================================
Mutations inside 'with state:' block:
  Setting execution_status...
  → Callback #1 fired
  Setting max_iterations...
  → Callback #2 fired
  Setting stuck_detection...
  → Callback #3 fired
  Exiting context manager...
  → Save #1 triggered

✓ Callbacks fired: 3 (expected: 3, one per mutation)
✓ Saves executed: 1 (expected: 1, batched at exit)

Result: Callbacks fire eagerly (important for WebSocket clients to see intermediate updates) while saves are batched (reducing I/O).


Test 5: Existing Test Suite

Ran all state serialization tests:

uv run pytest tests/sdk/conversation/local/test_state_serialization.py -v

Result: All 31 tests pass, including the new test_context_manager_batches_saves.

Issues Found

None.


Verdict: This PR achieves its performance optimization goal without breaking existing functionality. The implementation is clean, well-tested, and preserves backward compatibility. Ready to merge.

@csmith49 csmith49 requested a review from VascoSch92 May 12, 2026 18:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

perf: ConversationState autosave on every field mutation

2 participants