fix: prevent memory leaks from SSE streams, LSP, Bus, and process cleanup #15646
brendandebeasi wants to merge 7 commits into anomalyco:dev
The following comment was made by an LLM, it may be inaccurate: Potential duplicate/related PRs found:
These PRs all address memory leak issues in similar areas. Review #15435 and #14650 particularly to ensure the SSE stream cleanup in PR #15646 doesn't duplicate existing fixes or conflict with prior memory leak consolidation efforts.
Reviewed the related PRs flagged above — this PR is complementary, not duplicative:
The core fix in this PR — awaiting |
1f990f0 to 11eef33
66587a7 to 5788c8b
FYI there are now multiple PRs tackling memory leaks from different angles: #16695 (TUI listener cleanup, Maps/Sets eviction, timer cleanup), #16730 (database bloat, SQLite config), #16346 (plugin subscriber stacking, msgs retention), #14650 (subagent deallocation, event-reducer caps). Lots of file overlap across all of these. Might be worth coordinating merges to avoid conflict hell.
Addresses the remaining memory leaks identified in anomalyco#16697 by consolidating the best fixes from 23+ open community PRs into a single coherent changeset.
Fixes consolidated from PRs: anomalyco#16695, anomalyco#16346, anomalyco#14650, anomalyco#15646, anomalyco#13186, anomalyco#10392, anomalyco#7914, anomalyco#9145, anomalyco#9146, anomalyco#7049, anomalyco#16616, anomalyco#16241
- Plugin subscriber stacking: unsub before re-subscribing in init()
- Subagent deallocation: Session.remove() after task completion
- SSE stream cleanup: centralized cleanup with done guard (3 endpoints)
- Compaction data trimming: clear output/attachments on prune
- Process exit cleanup: Instance.disposeAll() with 5s timeout
- Serve cmd: graceful shutdown instead of blocking forever
- Bash tool: ring buffer with 10MB cap instead of O(n²) concat
- LSP index teardown: clear clients/broken/spawning on dispose
- LSP open-files cap: evict oldest when >1000 tracked files
- Format subscription: store and cleanup unsub handle
- Permission/Question clearSession: reject pending on session delete
- Session.remove() cleanup chain: FileTime, Permission, Question
- ShareNext subscription cleanup: store unsub handles, cleanup on dispose
- OAuth transport: close existing before replacing
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
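For readers unfamiliar with the "ring buffer with 10MB cap instead of O(n²) concat" item, here is a minimal sketch of the general idea. `BoundedOutput` is a hypothetical name and is not taken from the changeset. The cap is counted in string characters rather than bytes to keep the example short.

```typescript
// Hypothetical sketch: retains only the most recent `cap` characters of output.
// Chunks are stored in an array and joined once on read, so appending does not
// re-copy everything that was written before.
class BoundedOutput {
  private chunks: string[] = []
  private size = 0
  private readonly cap: number

  constructor(cap: number) {
    this.cap = cap
  }

  append(chunk: string): void {
    // A single oversized chunk is reduced to its tail.
    if (chunk.length > this.cap) chunk = chunk.slice(chunk.length - this.cap)
    this.chunks.push(chunk)
    this.size += chunk.length
    // Drop or trim the oldest chunks until the total fits the cap again.
    while (this.size > this.cap) {
      const excess = this.size - this.cap
      const head = this.chunks[0] as string
      if (head.length <= excess) {
        this.chunks.shift()
        this.size -= head.length
      } else {
        this.chunks[0] = head.slice(excess)
        this.size -= excess
      }
    }
  }

  toString(): string {
    return this.chunks.join("")
  }
}
```

With a cap of 5, appending "abc" and then "defg" leaves "cdefg": the oldest characters are discarded first.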
5788c8b to a275bde
- Add cleanup() guard with done flag to prevent double-cleanup
- await stream.writeSSE() calls and catch errors to trigger cleanup
- Unsubscribe Bus and GlobalBus listeners on abort or write failure
- Clear heartbeat interval in all exit paths
- Add Bus.debug() subscription count introspection
- Add GET /debug/memory endpoint for runtime memory diagnostics
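The cleanup pattern described in this commit can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: `SseWriter`, `attachStream`, and the `subscribe` callback are hypothetical stand-ins for the real stream and Bus APIs.

```typescript
type Unsubscribe = () => void

// Hypothetical minimal shape of an SSE stream: only the write call is modeled.
interface SseWriter {
  writeSSE(message: { data: string }): Promise<void>
}

// Wires a subscription and a heartbeat to a stream and returns an idempotent
// cleanup function that an abort hook could call.
function attachStream(
  stream: SseWriter,
  subscribe: (handler: (data: string) => void) => Unsubscribe,
  heartbeatMs: number,
): () => void {
  let done = false
  let unsub: Unsubscribe = () => {}
  let heartbeat: ReturnType<typeof setInterval> | undefined

  const cleanup = () => {
    if (done) return // guard: a second call is a no-op
    done = true
    if (heartbeat !== undefined) clearInterval(heartbeat)
    unsub()
  }

  const send = async (data: string) => {
    if (done) return
    try {
      // Awaiting surfaces write failures instead of leaving them unobserved.
      await stream.writeSSE({ data })
    } catch {
      cleanup()
    }
  }

  unsub = subscribe((data) => void send(data))
  heartbeat = setInterval(() => void send("heartbeat"), heartbeatMs)
  return cleanup
}
```

Because every exit path (abort, failed write) goes through the same `cleanup`, the listener and the interval are released exactly once.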
- Cap SDK event queue to prevent unbounded growth during high event throughput
- Clean up message parts when trimming excess messages in sync store
- Evict previous session data from memory when switching sessions
- Bound LSP diagnostics map with LRU eviction and clear on shutdown
- Reject pending callbacks on session cancel to prevent promise/closure leaks
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
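As an illustration of bounding a map with LRU eviction, the sketch below relies on the insertion order of the built-in `Map`. `LruMap` and its limit parameter are hypothetical and chosen for the example. The actual diagnostics map in the changeset may be structured differently.

```typescript
// Hypothetical sketch of a size-bounded map with least-recently-used eviction.
// A Map iterates keys in insertion order, so re-inserting a key on access
// moves it to the "most recent" end.
class LruMap<K, V> {
  private readonly entries = new Map<K, V>()
  private readonly limit: number

  constructor(limit: number) {
    this.limit = limit
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined
    const value = this.entries.get(key) as V
    this.entries.delete(key)
    this.entries.set(key, value)
    return value
  }

  set(key: K, value: V): void {
    this.entries.delete(key)
    this.entries.set(key, value)
    if (this.entries.size > this.limit) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next()
      if (!oldest.done) this.entries.delete(oldest.value)
    }
  }

  clear(): void {
    this.entries.clear()
  }

  get size(): number {
    return this.entries.size
  }
}
```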
- Add onCleanup for all sdk.event.on() listeners in app.tsx, session route, and prompt component to prevent listener accumulation on re-render
- Add onCleanup for process SIGUSR2 handler in theme provider
- Add onCleanup for leader timeout in keybind provider
- Clear event queue on SDK context cleanup
- Remove empty subscription arrays in Bus and RPC listener maps
- Clear warning timeout in state disposal after completion
- Clear models refresh interval on process exit
- Add dispose() to ShareNext to clear pending sync timeouts
- Clear PTY session buffer on removal to free memory immediately
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
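The "remove empty subscription arrays" item can be pictured with a small listener map. This is a sketch under assumptions: `subscriptions`, `subscribe`, and `debug` below are illustrative and do not reproduce the real Bus module.

```typescript
type Handler = (payload: unknown) => void

// Hypothetical listener registry keyed by event type.
const subscriptions = new Map<string, Handler[]>()

function subscribe(type: string, handler: Handler): () => void {
  const list = subscriptions.get(type) ?? []
  list.push(handler)
  subscriptions.set(type, list)
  return () => {
    const current = subscriptions.get(type)
    if (!current) return
    const index = current.indexOf(handler)
    if (index !== -1) current.splice(index, 1)
    // Without this step, every event type ever used keeps an empty array alive.
    if (current.length === 0) subscriptions.delete(type)
  }
}

// Subscription counts per event type, in the spirit of a debug helper.
function debug(): Record<string, number> {
  const counts: Record<string, number> = {}
  for (const [type, list] of subscriptions) counts[type] = list.length
  return counts
}
```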
Addresses the remaining memory leaks identified in anomalyco#16697 by consolidating the best fixes from 23+ open community PRs into a single coherent changeset.
Fixes consolidated from PRs: anomalyco#16695, anomalyco#16346, anomalyco#14650, anomalyco#15646,
- Plugin subscriber stacking: unsub before re-subscribing in init()
- Subagent deallocation: Session.remove() after task completion
- SSE stream cleanup: centralized cleanup with done guard (3 endpoints)
- Compaction data trimming: clear output/attachments on prune
- Process exit cleanup: Instance.disposeAll() with 5s timeout
- Serve cmd: graceful shutdown instead of blocking forever
- Bash tool: ring buffer with 10MB cap instead of O(n²) concat
- LSP index teardown: clear clients/broken/spawning on dispose
- LSP open-files cap: evict oldest when >1000 tracked files
- Format subscription: store and cleanup unsub handle
- Permission/Question clearSession: reject pending on session delete
- Session.remove() cleanup chain: FileTime, Permission, Question
- ShareNext subscription cleanup: store unsub handles, cleanup on dispose
- OAuth transport: close existing before replacing
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
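The "dispose with a 5s timeout" item is essentially a race between the cleanup promise and a timer. Below is a minimal sketch. `withTimeout` is a hypothetical helper, not a function from the codebase.

```typescript
// Hypothetical helper: resolves to "done" if `task` settles first, or to
// "timeout" once `ms` milliseconds have passed. The timer is cleared as soon
// as the task settles so it cannot keep the process alive.
function withTimeout(task: Promise<void>, ms: number): Promise<"done" | "timeout"> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve("timeout"), ms)
    task.then(
      () => {
        clearTimeout(timer)
        resolve("done")
      },
      (error) => {
        clearTimeout(timer)
        reject(error)
      },
    )
  })
}
```

A shutdown path could then await something like `withTimeout(disposeAll(), 5000)` before exiting, where `disposeAll` stands in for whatever cleanup the process needs. A hung disposer delays exit by at most the timeout instead of blocking forever.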
Port robust process exit detection from PR anomalyco#15757 to fix zombie/stuck child processes in containers where Bun fails to deliver exit events.
- Add polling watchdog to bash tool and Process.spawn that detects process exit via kill(pid, 0) when event-loop events are missed
- Add process registry (active map) with stale/reap exports for server-level watchdog to detect and clean up stuck bash processes
- Improve Shell.killTree with alive() helper and proper SIGKILL escalation after SIGTERM timeout
- Add session-level watchdog interval in prompt loop to periodically reap stale bash processes
Based on the work in anomalyco#15757.
Co-Authored-By: Nacho F. Lizaur <NachoFLizaur@users.noreply.github.com>
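The polling approach rests on the fact that sending signal 0 checks whether a process exists without delivering a signal. The sketch below is one way it could look. It uses an `alive` helper named after the one mentioned in the commit, but the body, the `watch` function, and the interval default are assumptions. It expects a Node-compatible runtime where `process.kill` is available.

```typescript
// Returns true if a process with this pid currently exists.
// Signal 0 performs the existence and permission check only.
function alive(pid: number): boolean {
  try {
    process.kill(pid, 0)
    return true
  } catch (error) {
    // EPERM means the process exists but may not be signalled by us.
    return (error as { code?: string }).code === "EPERM"
  }
}

// Hypothetical watchdog: polls until the pid disappears, then calls onExit once.
// Returns a function that stops the polling early.
function watch(pid: number, onExit: () => void, intervalMs = 1000): () => void {
  const timer = setInterval(() => {
    if (alive(pid)) return
    clearInterval(timer)
    onExit()
  }, intervalMs)
  return () => clearInterval(timer)
}
```

A poll like this is a fallback rather than a replacement for exit events. Pids can be reused by the operating system, so a long polling interval can in principle observe an unrelated process.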
Complete the port of PR anomalyco#15757 with remaining pieces:
- Add stdio end event redundancy as third fallback for exit detection (fires when pipe file descriptors close, independent of exit events)
- Add diagnostic log.info calls at spawn, abort, timeout, and each exit detection path for debugging container issues
- Add comprehensive tests: defensive patterns, polling watchdog isolation, Shell.killTree, server-level watchdog (stale/reap), stdio end events, and Process.spawn defensive patterns
- Skip truncation tests on Windows (matching upstream)
Co-Authored-By: Nacho F. Lizaur <NachoFLizaur@users.noreply.github.com>
On Windows, stdio pipe end events can fire before the exit event populates proc.exitCode, causing it to be null in the result metadata. Fall back to 0 (or 1 if signalCode is set) when exitCode is null, matching the same pattern used in Process.spawn.
Co-Authored-By: Nacho F. Lizaur <NachoFLizaur@users.noreply.github.com>
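The fallback rule stated in this commit message is small enough to write out. `resolveExitCode` is a hypothetical name used only for this illustration.

```typescript
// Hypothetical helper mirroring the rule in the commit message:
// use the real exit code when present, otherwise 1 if a signal ended the
// process and 0 if neither value has been populated yet.
function resolveExitCode(exitCode: number | null, signalCode: string | null): number {
  if (exitCode !== null) return exitCode
  return signalCode !== null ? 1 : 0
}
```

For example, `resolveExitCode(null, "SIGTERM")` yields 1, while `resolveExitCode(null, null)` yields 0.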
a275bde to 491b449
Issue for this PR
Fixes #15645
Relates to #10913, #9140, #11696, #9156, #14091, #15592
Type of change
What does this PR do?
Fixes multiple memory leaks that cause OpenCode's memory usage to grow unbounded over time, particularly during long-running sessions.
Memory leak fixes (always active)
- `server.ts`, `global.ts`, `routes.ts`: `done` flag + `writeSSE` error handling + unsub on abort
- `client.ts`: `diagnostics.clear()` + reset `files` Map in shutdown
- `bus/index.ts`: `subscriptions.clear()` after dispose callback
- `acp.ts`: `.on()` → `.once()` for end/error listeners
- `github.ts`: `finally` block
- `index.ts`: `Instance.disposeAll()` with 5s timeout before `process.exit()`
Debug tooling (always available, zero overhead)
- `Bus.debug()`: subscription count introspection per event type
- `Instance.debug()`: cache state inspection
- `State.debug()`: state entry counts
- `GET /debug/memory`: on-demand runtime memory diagnostics endpoint
- `SIGUSR1` signal: captures heap snapshot + diagnostic report to `~/.local/share/opencode/diagnostics/`
Memory monitoring (opt-in via `OPENCODE_DIAGNOSTICS=1`)
The polling monitor and auto-kill are disabled by default. To enable them, set `OPENCODE_DIAGNOSTICS=1`.
When enabled:
- `OPENCODE_MEMORY_LIMIT=N` (in GB): writes heap snapshot + diagnostic report, then exits
- `max(2GB, min(25% total RAM, 4GB))`
How did you verify your code works?
- `Property 'tui' does not exist on type 'Config'` (unrelated to this PR)
- Without `OPENCODE_DIAGNOSTICS=1`, only SIGUSR1 handler is registered (no polling, no kill)
- With `OPENCODE_DIAGNOSTICS=1`, full monitoring loop activates
Checklist