fix: prevent memory leaks from SSE streams, LSP, Bus, and process cleanup #15646

Open
brendandebeasi wants to merge 7 commits into anomalyco:dev from brendandebeasi:fix/sse-memory-leaks

Conversation

brendandebeasi commented Mar 2, 2026

Issue for this PR

Fixes #15645
Relates to #10913, #9140, #11696, #9156, #14091, #15592

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

Fixes multiple memory leaks that cause OpenCode's memory usage to grow unbounded over time, particularly during long-running sessions.

Memory leak fixes (always active)

| Fix | Issues | File | Change |
|---|---|---|---|
| SSE cleanup | #15645 | server.ts, global.ts, routes.ts | done flag + writeSSE error handling + unsub on abort |
| LSP shutdown | #10913, #9140 | client.ts | diagnostics.clear() + reset files Map in shutdown |
| Bus dispose | #10913 | bus/index.ts | subscriptions.clear() after dispose callback |
| ACP stdin | #11696 | acp.ts | .on() → .once() for end/error listeners |
| GitHub unsub | #9156 | github.ts | Capture the Bus.subscribe return value and call it in a finally block |
| Process exit | #10913, #14091 | index.ts | Instance.disposeAll() with 5s timeout before process.exit() |
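
For readers unfamiliar with the "dispose with a timeout before exit" pattern in the Process exit row, here is a minimal sketch. It is not the code from index.ts: `disposeWithTimeout` and the `Outcome` type are hypothetical names, and the cleanup routine is passed in as a plain function so the example stays self-contained.

```typescript
type Outcome = "disposed" | "failed" | "timeout";

// Hypothetical helper: run a cleanup routine, but never wait longer than timeoutMs.
async function disposeWithTimeout(
  disposeAll: () => Promise<void>,
  timeoutMs: number,
): Promise<Outcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  // Resolves with "timeout" if cleanup has not settled in time.
  const timeout = new Promise<Outcome>((resolve) => {
    timer = setTimeout(() => resolve("timeout"), timeoutMs);
  });

  // A failing cleanup must not block the exit path, so map rejection to "failed".
  const cleanup = disposeAll().then(
    (): Outcome => "disposed",
    (): Outcome => "failed",
  );

  try {
    return await Promise.race([cleanup, timeout]);
  } finally {
    // Clear the timer so it cannot keep the event loop alive after cleanup wins.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

Under these assumptions, the exit path would call something like `await disposeWithTimeout(() => Instance.disposeAll(), 5000)` and then `process.exit()` regardless of the outcome.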

Debug tooling (always available, zero overhead)

  • Bus.debug() — subscription count introspection per event type
  • Instance.debug() — cache state inspection
  • State.debug() — state entry counts
  • GET /debug/memory — on-demand runtime memory diagnostics endpoint
  • SIGUSR1 signal — captures heap snapshot + diagnostic report to ~/.local/share/opencode/diagnostics/
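
To illustrate what "subscription count introspection per event type" could look like, here is a small stand-alone sketch. `MiniBus` is a hypothetical stand-in, not the real Bus module, and the event names used with it are made up. It also shows the two related fixes listed in this PR: unsubscribe removes empty listener arrays, and dispose clears the whole map.

```typescript
type Handler = (payload: unknown) => void;

// Hypothetical stand-in for a pub/sub bus; not the actual Bus implementation.
class MiniBus {
  private subscriptions = new Map<string, Handler[]>();

  // Returns an unsubscribe function; callers must keep and call it.
  subscribe(type: string, handler: Handler): () => void {
    const list = this.subscriptions.get(type) ?? [];
    list.push(handler);
    this.subscriptions.set(type, list);

    return () => {
      const current = this.subscriptions.get(type);
      if (!current) return;
      const index = current.indexOf(handler);
      if (index !== -1) current.splice(index, 1);
      // Drop empty arrays so the map does not accumulate dead keys.
      if (current.length === 0) this.subscriptions.delete(type);
    };
  }

  publish(type: string, payload: unknown): void {
    // Copy first so handlers may unsubscribe while being notified.
    const handlers = (this.subscriptions.get(type) ?? []).slice();
    handlers.forEach((handler) => handler(payload));
  }

  // Subscription count per event type, for leak hunting.
  debug(): Record<string, number> {
    const counts: Record<string, number> = {};
    this.subscriptions.forEach((handlers, type) => {
      counts[type] = handlers.length;
    });
    return counts;
  }

  dispose(): void {
    this.subscriptions.clear();
  }
}
```

A count that keeps rising for one event type across reconnects or re-renders is the usual sign that some caller is dropping its unsubscribe handle.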

Memory monitoring (opt-in via OPENCODE_DIAGNOSTICS=1)

The polling monitor and auto-kill are disabled by default. To enable:

OPENCODE_DIAGNOSTICS=1 opencode

When enabled:

  • 30s polling interval checks RSS usage
  • 2GB warning — logs heap stats + writes periodic diagnostic reports (60s cooldown)
  • 4GB kill (configurable via OPENCODE_MEMORY_LIMIT=N in GB) — writes heap snapshot + diagnostic report, then exits
  • Kill threshold formula: max(2GB, min(25% total RAM, 4GB))
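
The kill threshold formula can be worked through as a pure function. This is only a sketch of the formula as written above: `killThresholdBytes` is a hypothetical name, GB is assumed to mean 1024³ bytes, and the interaction with OPENCODE_MEMORY_LIMIT is deliberately left out because the description does not spell it out.

```typescript
// Assumption: "GB" in the PR description means 1024^3 bytes.
const GB = 1024 * 1024 * 1024;

// max(2GB, min(25% total RAM, 4GB)), exactly as stated in the description.
function killThresholdBytes(totalRamBytes: number): number {
  const quarterOfRam = totalRamBytes * 0.25;
  return Math.max(2 * GB, Math.min(quarterOfRam, 4 * GB));
}

// Worked values:
//   4 GB RAM  -> 25% = 1 GB  -> clamped up to 2 GB
//   12 GB RAM -> 25% = 3 GB  -> 3 GB
//   16 GB RAM -> 25% = 4 GB  -> 4 GB
//   64 GB RAM -> 25% = 16 GB -> clamped down to 4 GB
```

Taken at face value, the formula means that on machines with 8 GB of RAM or less the kill threshold coincides with the 2GB warning level.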

How did you verify your code works?

  • All modified files pass LSP diagnostics
  • Type errors are pre-existing (3 TUI errors: Property 'tui' does not exist on type 'Config' — unrelated to this PR)
  • Diagnostics opt-in verified: without OPENCODE_DIAGNOSTICS=1, only SIGUSR1 handler is registered (no polling, no kill)
  • With OPENCODE_DIAGNOSTICS=1, full monitoring loop activates

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

@github-actions
Contributor

github-actions bot commented Mar 2, 2026

The following comment was made by an LLM; it may be inaccurate:

Potential duplicate/related PRs found:

  1. #15435 - "fix: consolidate memory leak fixes from #14650 and #8953"

    • Related to memory leak fixes with potential overlap in SSE/Bus listener cleanup scope
  2. #14650 - "fix: resolve memory leak issues across multiple subsystems"

    • Addresses broader memory leak issues that may include SSE stream handling
  3. #7032 - "fix(core): add dispose functions to prevent subscription memory leaks"

    • Earlier work on subscription memory leak prevention that may be relevant to Bus listener cleanup

These PRs all address memory leak issues in similar areas. Review #15435 and #14650 particularly to ensure the SSE stream cleanup in PR #15646 doesn't duplicate existing fixes or conflict with prior memory leak consolidation efforts.

@brendandebeasi
Author

Reviewed the related PRs flagged above — this PR is complementary, not duplicative:

The core fix in this PR — awaiting writeSSE() calls, catching write errors to detect dead connections, and cleaning up Bus listeners + heartbeat intervals on failure — is not addressed by any of the other open PRs. Those PRs fix leaks in other subsystems; this one fixes the SSE stream handler leak path specifically.
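
The leak path described in this comment can be sketched in isolation. This is an illustration under assumptions, not the handler code from the PR: `startSseSession` is a hypothetical name, `SseStream` is a local stand-in exposing only a `writeSSE` method, and the bus is reduced to a `subscribe` callback that returns an unsubscribe function.

```typescript
type Unsubscribe = () => void;

// Stand-in for the framework's SSE stream; only the method used here.
interface SseStream {
  writeSSE(message: { data: string }): Promise<void>;
}

function startSseSession(
  stream: SseStream,
  subscribe: (handler: (data: string) => void) => Unsubscribe,
  heartbeatMs: number,
) {
  let done = false;
  let unsub: Unsubscribe = () => {};
  let heartbeat: ReturnType<typeof setInterval> | undefined;

  // Idempotent: the done flag guards against double cleanup.
  const cleanup = (): void => {
    if (done) return;
    done = true;
    if (heartbeat !== undefined) clearInterval(heartbeat);
    unsub();
  };

  // Await every write; a rejected write means the client is gone.
  const send = async (data: string): Promise<void> => {
    if (done) return;
    try {
      await stream.writeSSE({ data });
    } catch {
      cleanup();
    }
  };

  unsub = subscribe((data) => {
    void send(data);
  });
  heartbeat = setInterval(() => {
    void send("heartbeat");
  }, heartbeatMs);

  return { cleanup, send, isDone: () => done };
}
```

In a real handler, the stream's abort callback would call the same `cleanup()`, so a disconnect detected by either route releases the listener and the interval exactly once.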

@brendandebeasi changed the title from "fix: prevent SSE stream memory leaks on client disconnect" to "fix: prevent memory leaks from SSE streams, LSP, Bus, and process cleanup" on Mar 3, 2026
@brendandebeasi force-pushed the fix/sse-memory-leaks branch 2 times, most recently from 66587a7 to 5788c8b on March 3, 2026 19:21
@binarydoubling

FYI there are now multiple PRs tackling memory leaks from different angles — #16695 (TUI listener cleanup, Maps/Sets eviction, timer cleanup), #16730 (database bloat, SQLite config), #16346 (plugin subscriber stacking, msgs retention), #14650 (subagent deallocation, event-reducer caps). Lots of file overlap across all of these. Might be worth coordinating merges to avoid conflict hell.

binarydoubling added a commit to binarydoubling/opencode that referenced this pull request Mar 9, 2026
Addresses the remaining memory leaks identified in anomalyco#16697 by
consolidating the best fixes from 23+ open community PRs into
a single coherent changeset.

Fixes consolidated from PRs: anomalyco#16695, anomalyco#16346, anomalyco#14650, anomalyco#15646,
anomalyco#13186, anomalyco#10392, anomalyco#7914, anomalyco#9145, anomalyco#9146, anomalyco#7049, anomalyco#16616, anomalyco#16241

- Plugin subscriber stacking: unsub before re-subscribing in init()
- Subagent deallocation: Session.remove() after task completion
- SSE stream cleanup: centralized cleanup with done guard (3 endpoints)
- Compaction data trimming: clear output/attachments on prune
- Process exit cleanup: Instance.disposeAll() with 5s timeout
- Serve cmd: graceful shutdown instead of blocking forever
- Bash tool: ring buffer with 10MB cap instead of O(n²) concat
- LSP index teardown: clear clients/broken/spawning on dispose
- LSP open-files cap: evict oldest when >1000 tracked files
- Format subscription: store and cleanup unsub handle
- Permission/Question clearSession: reject pending on session delete
- Session.remove() cleanup chain: FileTime, Permission, Question
- ShareNext subscription cleanup: store unsub handles, cleanup on dispose
- OAuth transport: close existing before replacing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
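
The "ring buffer with 10MB cap" item in the commit message above can be pictured with the following sketch. It is not the bash tool's implementation: `TailBuffer` is a hypothetical name, it is a capped queue of chunks rather than a true fixed-array ring buffer, and the cap is counted in string characters rather than bytes to keep the example short.

```typescript
// Keeps only the most recent `cap` characters of appended output.
class TailBuffer {
  private chunks: string[] = [];
  private size = 0;
  private readonly cap: number;

  constructor(cap: number) {
    this.cap = cap;
  }

  append(chunk: string): void {
    // A single oversized chunk replaces everything with its own tail.
    if (chunk.length >= this.cap) {
      this.chunks = [chunk.slice(chunk.length - this.cap)];
      this.size = this.cap;
      return;
    }
    this.chunks.push(chunk);
    this.size += chunk.length;

    // Drop or trim the oldest chunks until we are back under the cap.
    while (this.size > this.cap) {
      const overflow = this.size - this.cap;
      const oldest = this.chunks[0];
      if (oldest.length <= overflow) {
        this.chunks.shift();
        this.size -= oldest.length;
      } else {
        this.chunks[0] = oldest.slice(overflow);
        this.size -= overflow;
      }
    }
  }

  get length(): number {
    return this.size;
  }

  // Join once on read instead of concatenating on every append.
  toString(): string {
    return this.chunks.join("");
  }
}
```

Memory stays bounded by the cap no matter how much a command prints, and the full string is only materialized when it is actually read.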
Brendan DeBeasi and others added 7 commits March 10, 2026 16:24
- Add cleanup() guard with done flag to prevent double-cleanup
- await stream.writeSSE() calls and catch errors to trigger cleanup
- Unsubscribe Bus and GlobalBus listeners on abort or write failure
- Clear heartbeat interval in all exit paths
- Add Bus.debug() subscription count introspection
- Add GET /debug/memory endpoint for runtime memory diagnostics
- Cap SDK event queue to prevent unbounded growth during high event throughput
- Clean up message parts when trimming excess messages in sync store
- Evict previous session data from memory when switching sessions
- Bound LSP diagnostics map with LRU eviction and clear on shutdown
- Reject pending callbacks on session cancel to prevent promise/closure leaks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add onCleanup for all sdk.event.on() listeners in app.tsx, session route,
  and prompt component to prevent listener accumulation on re-render
- Add onCleanup for process SIGUSR2 handler in theme provider
- Add onCleanup for leader timeout in keybind provider
- Clear event queue on SDK context cleanup
- Remove empty subscription arrays in Bus and RPC listener maps
- Clear warning timeout in state disposal after completion
- Clear models refresh interval on process exit
- Add dispose() to ShareNext to clear pending sync timeouts
- Clear PTY session buffer on removal to free memory immediately

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Addresses the remaining memory leaks identified in anomalyco#16697 by
consolidating the best fixes from 23+ open community PRs into
a single coherent changeset.

Fixes consolidated from PRs: anomalyco#16695, anomalyco#16346, anomalyco#14650, anomalyco#15646,

- Plugin subscriber stacking: unsub before re-subscribing in init()
- Subagent deallocation: Session.remove() after task completion
- SSE stream cleanup: centralized cleanup with done guard (3 endpoints)
- Compaction data trimming: clear output/attachments on prune
- Process exit cleanup: Instance.disposeAll() with 5s timeout
- Serve cmd: graceful shutdown instead of blocking forever
- Bash tool: ring buffer with 10MB cap instead of O(n²) concat
- LSP index teardown: clear clients/broken/spawning on dispose
- LSP open-files cap: evict oldest when >1000 tracked files
- Format subscription: store and cleanup unsub handle
- Permission/Question clearSession: reject pending on session delete
- Session.remove() cleanup chain: FileTime, Permission, Question
- ShareNext subscription cleanup: store unsub handles, cleanup on dispose
- OAuth transport: close existing before replacing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port robust process exit detection from PR anomalyco#15757 to fix zombie/stuck
child processes in containers where Bun fails to deliver exit events.

- Add polling watchdog to bash tool and Process.spawn that detects
  process exit via kill(pid, 0) when event-loop events are missed
- Add process registry (active map) with stale/reap exports for
  server-level watchdog to detect and clean up stuck bash processes
- Improve Shell.killTree with alive() helper and proper SIGKILL
  escalation after SIGTERM timeout
- Add session-level watchdog interval in prompt loop to periodically
  reap stale bash processes

Based on the work in anomalyco#15757.

Co-Authored-By: Nacho F. Lizaur <NachoFLizaur@users.noreply.github.com>
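
The polling watchdog described in the commit message above relies on the signal-0 probe: sending signal 0 delivers nothing but fails if the process no longer exists. The sketch below is an illustration only, with hypothetical names (`isAlive`, `watchForExit`); the signal function is injected so the example does not depend on a particular runtime. In Node or Bun one would pass `(pid, sig) => process.kill(pid, sig)`.

```typescript
type SignalFn = (pid: number, signal: number) => void;

// Probe a process with signal 0: nothing is delivered, only existence is checked.
function isAlive(pid: number, sendSignal: SignalFn): boolean {
  try {
    sendSignal(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but we may not signal it, so it is alive.
    // Anything else (typically ESRCH) is treated as "gone".
    return (err as { code?: string }).code === "EPERM";
  }
}

// Poll until the process disappears, then fire onExit once.
// Returns a function that stops the watchdog early.
function watchForExit(
  pid: number,
  sendSignal: SignalFn,
  onExit: () => void,
  intervalMs: number,
): () => void {
  const timer = setInterval(() => {
    if (!isAlive(pid, sendSignal)) {
      clearInterval(timer);
      onExit();
    }
  }, intervalMs);
  return () => clearInterval(timer);
}
```

One caveat of this technique in general: operating systems can reuse PIDs, so a probe may report "alive" for an unrelated process. That is one reason to treat it as a fallback alongside exit events rather than a replacement.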
Complete the port of PR anomalyco#15757 with remaining pieces:

- Add stdio end event redundancy as third fallback for exit detection
  (fires when pipe file descriptors close, independent of exit events)
- Add diagnostic log.info calls at spawn, abort, timeout, and each
  exit detection path for debugging container issues
- Add comprehensive tests: defensive patterns, polling watchdog
  isolation, Shell.killTree, server-level watchdog (stale/reap),
  stdio end events, and Process.spawn defensive patterns
- Skip truncation tests on Windows (matching upstream)

Co-Authored-By: Nacho F. Lizaur <NachoFLizaur@users.noreply.github.com>
On Windows, stdio pipe end events can fire before the exit event
populates proc.exitCode, causing it to be null in the result metadata.
Fall back to 0 (or 1 if signalCode is set) when exitCode is null,
matching the same pattern used in Process.spawn.

Co-Authored-By: Nacho F. Lizaur <NachoFLizaur@users.noreply.github.com>
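
The fallback rule in the last commit message can be written as a small pure function. `resolveExitCode` is a hypothetical name used for illustration; the parameter names mirror the `exitCode` and `signalCode` fields mentioned in the message.

```typescript
// exitCode may still be null if stdio "end" fired before the exit event.
function resolveExitCode(
  exitCode: number | null,
  signalCode: string | null,
): number {
  if (exitCode !== null) return exitCode;
  // No exit code yet: report 1 if a signal ended the process, otherwise 0.
  return signalCode !== null ? 1 : 0;
}
```

For example, `resolveExitCode(null, "SIGTERM")` yields 1, `resolveExitCode(null, null)` yields 0, and a real exit code such as 3 is passed through unchanged.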
Development

Successfully merging this pull request may close these issues.

SSE event streams leak memory on client disconnect

2 participants