fix(#106, #81): stream parse-worker results in chunks to prevent OOM/SIGABRT#121
fix(#106, #81): stream parse-worker results in chunks to prevent OOM/SIGABRT#121san360 wants to merge 15 commits into
Conversation
… repro - Extract emit/assemble logic into parse-chunking.ts (leaf module) used by both worker and parent - Add parse-chunking.test.ts: round-trip parity, orphan preservation, chunk-size invariance (CI-safe) - Add scripts/generate-synthetic-logs.ts: temp-dir-only VS Code log generator (cross-platform) - Add scripts/oom-repro.ts + npm run test:oom-synth: opt-in large-tree OOM regression (forks real worker, redirects HOME to temp)
Make partial parses discoverable and add live worker telemetry while removing the issue-106 OOM diagnostic instrumentation. - Add a parse-warning collector (recordFailedFile/recordSkippedLines) and carry authoritative skipped file/line counts through the worker done payload -> ParseResult -> dataReady, independent of progress ticks (so it works on cache hits too). - Show a compact, dismissible dashboard banner with a 'View details' link that reveals the 'AI Engineer Coach' output channel; wrap content in #content-col so the banner spans the content column. - Add a live telemetry strip (memory/CPU/skipped gauges) sampled by the worker via createTelemetrySampler. - Log per-file parse warnings to the output channel. - Remove issue-106 heap/OOM diagnostics (report-on-fatalerror, stderr capture, logWorkerMemory/peak/heap-ceiling, logLargeFile) while keeping streaming readers, chunked IPC backpressure, and maybeForceGc.
…ading-grid modules Behavior-preserving, test-gated readability pass for issue #106. Splits the largest modules (parser.ts, parser-vscode.ts, app.ts) into focused units and removes rule-of-three duplication. +22 tests, no runtime behavior change.
…, and skipped banner Adds tests for the previously-untested OOM-fix surface and extracts three small testable units: - parser-vscode-files: forEachJsonlLine / forEachJsonlLineAsync streaming readers (incl. multibyte chunk-boundary) - parser-vscode-cli: parseCLIEventsFileAsync + recordFailedFile path (new test file) - parse-worker-stream: extracted shouldSendProgressImmediately + createAckWindow (backpressure) - parser-worker-host: extracted isRetryableWorkerError; fork-injected IPC dispatch/retry tests - skipped-banner: extracted buildSkippedBanner/formatSkippedSummary from app.ts - shared: dataReady skipped-count forwarding +47 tests (1150 total). Behavior-preserving; typecheck/lint/knip/build clean.
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
|
I did an independent local validation of this PR against a real Windows repro with a large number of sessions (multiple harnesses). Environment:
On the current released extension, the dashboard failed during the cold-parse phase. The parser worker exited before the dashboard could load, matching the failure mode reported in #106/#81. In the debug output, the worker reached phase=2 detail=Cold parse, retried with the larger heap limit, and still exited with a V8 heap/OOM-style failure. I then checked out this PR, installed dependencies, built it locally, and ran the updated worker against the same accumulated local assistant history using the default worker heap limit. With this PR, the cold parse completed successfully. That gives me confidence this addresses the practical real-world failure mode behind #106/#81, not only the synthetic repro path. I validated the parser/worker path directly rather than doing a full packaged-extension install smoke test, but based on that validation I support merging this fix. Happy to provide sanitized aggregate details if they would be useful. |
Summary
Fixes the parse-worker out-of-memory / SIGABRT crash that occurs during a cold parse of large session volumes. The worker previously serialized the entire
ParseResultinto a single IPC message, which blew past the child process heap (and the native IPC buffer) on machines with large log histories — producingexit code 3758096392(0xE0000008, JS heap OOM) andSIGABRT.Closes #106
Closes #81
Root Cause
The forked parse worker built a complete in-memory
ParseResultfor every session and sent it to the parent in oneprocess.send()call. On large histories:--max-old-space-size, crashing the child with a JS heap OOM (0xE0000008).SIGABRT.Fix
Phase 1 — Reduce peak memory (
efa8ba9)Phase 2 — Stream results in chunks (
cd78e31)ParseResultto the parent in per-session chunks instead of one giant message.seq - highestAcked >= windowSize, preventing the native IPC buffer from growing without bound.Resilience & UX
heap out of memory,memory limit,SIGABRT,SIGKILL, abnormal exit) and retries once with a capped heap.Refactor (
5e477d7)Extracted focused, independently testable modules:
parse-worker-stream.ts—shouldSendProgressImmediately,createAckWindow(ack-window backpressure logic).parser-worker-host.ts—isRetryableWorkerError, injectableforkfor testability.parser-vscode-request.ts,session-totals,loading-grid,worker-telemetry.ts,skipped-banner.ts.Tests
Added streaming reader, worker-host, ack-window, telemetry, and skipped-banner coverage, including:
€, 3-byte) placed at a read-chunk boundary to verify the streaming JSONL reader'sStringDecodercarries partial sequences correctly.role=status), dismiss, and "View details" callback.Also added a synthetic-log generator (
scripts/generate-synthetic-logs.ts) and a manual OOM repro (npm run test:oom-synth) for reproducing the original crash locally.Validation
tsc --noEmit)node esbuild.mjs)Notes
origin/main, integrating upstream security hardening (prototype-pollution guards inparser-vscode-files.ts, external-URL validation inpanel.ts).CHANGELOG.mdupdated under Unreleased.