Parse worker OOM crash (exit code 3758096392) during cold parse of large session volumes #106

@san360

Bug: Parse Worker OOM Crash During "Sync Sessions"

Summary

Clicking Sync Sessions crashes with "Child process exited with code 3758096392" when the user has a large volume of Copilot session history. The child parse-worker process hits V8's heap limit and is killed by the runtime on both attempts (4 GB and 6 GB --max-old-space-size).

Error Details

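Exit code: 3758096392 (0xE0000008), the exception code that Chromium-based runtimes (including the Electron runtime bundled with VS Code) raise on Windows when they terminate because of an out-of-memory condition, such as V8 reaching its heap limit.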
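The decimal and hexadecimal forms above are the same 32-bit value, and some tools print it as a negative signed integer instead. A small sketch of the conversion, using only built-in number operations:

```typescript
// Node reports the Windows exit status as an unsigned 32-bit number.
const exitCode = 3758096392;

// Hexadecimal form, padded to 8 digits.
const hex = "0x" + exitCode.toString(16).toUpperCase().padStart(8, "0");

// Signed 32-bit view of the same bits, as some tools display it.
const signed = exitCode | 0;

console.log(hex, signed); // 0xE0000008 -536870904
```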

Full Error Log

2026-06-05T09:48:29.342Z pid=60700 [parser] child-progress-phase | rss=737.7MB heap=397.9MB/452.2MB ext=23.0MB | attempt=1 phase=1 detail=Computing directory fingerprints
2026-06-05T09:48:29.455Z pid=60700 [parser] child-progress-phase | rss=742.7MB heap=392.4MB/456.4MB ext=23.0MB | attempt=1 phase=2 detail=Cold parse
2026-06-05T09:48:39.289Z pid=60700 [parser] child-exit | rss=889.0MB heap=500.6MB/559.6MB ext=30.4MB | attempt=1 code=3758096392 signal=
2026-06-05T09:48:39.290Z pid=60700 [parser] child-retry | rss=889.0MB heap=500.6MB/559.6MB ext=30.4MB | reason=Child process exited with code 3758096392 maxOldSpaceMb=6144
2026-06-05T09:48:39.291Z pid=60700 [parser] child-start | rss=889.0MB heap=500.6MB/559.6MB ext=30.4MB | attempt=2 logsDirs=3 worker=...dist\parse-worker.js maxOldSpaceMb=6144
2026-06-05T09:48:39.418Z pid=60700 [parser] child-progress-phase | rss=889.0MB heap=503.7MB/559.6MB ext=30.4MB | attempt=2 phase=1 detail=Computing directory fingerprints
2026-06-05T09:48:39.429Z pid=60700 [parser] child-progress-phase | rss=889.0MB heap=503.8MB/559.6MB ext=30.4MB | attempt=2 phase=2 detail=Cold parse
2026-06-05T09:48:48.939Z pid=60700 [parser] child-exit | rss=934.2MB heap=526.5MB/580.5MB ext=37.2MB | attempt=2 code=3758096392 signal=
2026-06-05T09:48:48.940Z pid=60700 [panel] loadData-error | rss=934.2MB heap=526.5MB/580.5MB ext=37.2MB | Error: Child process exited with code 3758096392
    at ...dist\extension.js:28035:29
    at finish (dist\extension.js:28032:9)
    at fail (dist\extension.js:28035:9)
    at ChildProcess.<anonymous> (dist\extension.js:28086:11)
    at ChildProcess.emit (node:events:509:28)
    at ChildProcess._handle.onexit (node:internal/child_process:295:12)

User-Facing Error

The webview panel shows:

Error
Child process exited with code 3758096392

Root Cause Analysis

The crash occurs in the cold parse path (parseAllLogsViaWorker → parse-worker.ts). Both the in-memory cache and the disk cache miss because the directories are stale, so the worker must parse all session data from scratch.

Why It OOMs

| Factor | Detail |
| --- | --- |
| All sessions held in memory simultaneously | processWorkspaces() accumulates every parsed Session object (with full messageText, responseText, code blocks, tool calls) into a single growing array. Nothing is released until the parse completes. |
| IPC serialization doubles memory | After parsing, the worker calls process.send({ type: 'result', payload }) which internally runs JSON.stringify() on the entire result set. This creates a second copy of all data as a string, roughly doubling peak heap usage. |
| stripSessionsForMemory called too late | Memory-efficient stripping (removes responseText, truncates messageText) only runs AFTER the full parse is complete and the result is about to be sent. The peak memory has already been reached. |
| prefetchBatch competes for heap | Up to 600 files (each up to 20 MB) are pre-read into prefetchCache and held in memory alongside the growing sessions array. |
| Retry doesn't reduce work | The retry logic (RETRY_WORKER_MAX_OLD_SPACE_MB = 6144) simply bumps the heap limit but processes the same full dataset with the same all-at-once strategy. |

Key Code Locations

  • src/core/parser.ts → parseAllLogsViaWorker() (line ~630): forks child, defines retry logic
  • src/core/parser.ts → processWorkspaces() (line ~303): batched workspace parsing accumulating all sessions
  • src/core/parse-worker.ts — Worker entry point; parses everything then sends one massive IPC message
  • src/core/cache.ts → stripSessionsForMemory() (line ~394): strips response text but only after parse completes
  • src/core/parser-vscode.ts → parseSessionFile(): reads and parses individual session JSON files

Constants

const WORKER_MAX_OLD_SPACE_MB = 4096;       // attempt 1
const RETRY_WORKER_MAX_OLD_SPACE_MB = 6144; // attempt 2
const BATCH_SIZE = 32;                       // workspace processing batch
const MAX_PREFETCH_FILES = 600;
const MAX_PREFETCH_FILE_SIZE = 20 * 1024 * 1024; // 20 MB per file
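As a rough worked example, the prefetch limits above can be compared with the two heap limits. This is an upper bound only: it assumes every prefetched file is at the size cap, and whether the bytes count against V8's old space depends on how prefetchCache stores them (strings live on the heap, Buffers are external memory), which the issue does not state.

```typescript
// Values copied from the constants listed above.
const WORKER_MAX_OLD_SPACE_MB = 4096;
const RETRY_WORKER_MAX_OLD_SPACE_MB = 6144;
const MAX_PREFETCH_FILES = 600;
const MAX_PREFETCH_FILE_SIZE = 20 * 1024 * 1024;

const BYTES_PER_MB = 1024 * 1024;

// Worst case: every prefetched file is exactly at the per-file cap.
const worstCasePrefetchMb =
  (MAX_PREFETCH_FILES * MAX_PREFETCH_FILE_SIZE) / BYTES_PER_MB;

const exceedsFirstAttempt = worstCasePrefetchMb > WORKER_MAX_OLD_SPACE_MB;
const exceedsRetry = worstCasePrefetchMb > RETRY_WORKER_MAX_OLD_SPACE_MB;

console.log({ worstCasePrefetchMb, exceedsFirstAttempt, exceedsRetry });
```

Under these assumptions the theoretical prefetch ceiling (12000 MB) is larger than either heap limit on its own, before any parsed sessions are counted.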

Reproduction

  • User with 3+ log directories and a heavy Copilot usage history (many workspaces, many sessions per workspace)
  • No valid cache (first run, or cache invalidated by changed directories)
  • Click "Sync Sessions" button in the panel

Proposed Fix Strategies

Strategy 1: Stream results via chunked IPC (High Impact)

Instead of accumulating all sessions and sending one massive { type: 'result' } message, send results incrementally per-workspace:

{ type: 'chunk', sessions: [...], workspaceId: '...' }
{ type: 'chunk', sessions: [...], workspaceId: '...' }
{ type: 'done', dirMetas: {...} }

This eliminates the 2x peak from JSON.stringify of the entire payload and allows the parent to start processing sooner.
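A minimal sketch of the message flow, assuming a simplified Session shape and hypothetical helper names (sendInChunks, ChunkCollector) that do not exist in the repo. The send callback stands in for process.send so the sketch runs without a forked child.

```typescript
// Hypothetical, simplified session shape for illustration only.
interface Session {
  id: string;
  messageText: string;
}

type WorkerMessage =
  | { type: "chunk"; workspaceId: string; sessions: Session[] }
  | { type: "done"; dirMetas: Record<string, unknown> };

// Worker side: one message per workspace, so only one workspace's
// sessions are serialized at a time, followed by a final "done".
function sendInChunks(
  workspaces: Iterable<[string, Session[]]>,
  dirMetas: Record<string, unknown>,
  send: (msg: WorkerMessage) => void,
): void {
  for (const [workspaceId, sessions] of workspaces) {
    send({ type: "chunk", workspaceId, sessions });
  }
  send({ type: "done", dirMetas });
}

// Parent side: append chunks in arrival order until "done" arrives.
class ChunkCollector {
  readonly sessions: Session[] = [];
  dirMetas: Record<string, unknown> | undefined;

  // Returns true once the final "done" message has been handled.
  handle(msg: WorkerMessage): boolean {
    if (msg.type === "chunk") {
      for (const s of msg.sessions) this.sessions.push(s);
      return false;
    }
    this.dirMetas = msg.dirMetas;
    return true;
  }
}
```

In the real worker, send would wrap process.send, which is only defined when the process was started with an IPC channel. The worker would also need to drop its reference to each workspace's sessions after sending, otherwise the accumulated array still grows.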

Strategy 2: Strip sessions eagerly during parse

Call stripSessionsForMemory() (or a per-session equivalent) immediately after each session is parsed and the full-text disk cache entry is written. This reduces the live heap size of accumulated sessions significantly.
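One way this could look, sketched with a hypothetical per-session helper. The Session shape, the MAX_MESSAGE_CHARS value, and the persistFullText callback are all assumptions for illustration; the real stripSessionsForMemory() may strip different fields.

```typescript
// Hypothetical, simplified session shape for illustration only.
interface Session {
  id: string;
  messageText: string;
  responseText?: string;
}

// Assumed truncation limit; the real value is not given in the issue.
const MAX_MESSAGE_CHARS = 200;

// Per-session equivalent of the strip step: drop responseText and
// truncate messageText, returning a new, smaller object.
function stripSession(session: Session): Session {
  const stripped: Session = {
    ...session,
    messageText: session.messageText.slice(0, MAX_MESSAGE_CHARS),
  };
  delete stripped.responseText;
  return stripped;
}

// Persist the full text first, then keep only the stripped copy alive.
function accumulateStripped(
  parsed: Iterable<Session>,
  persistFullText: (session: Session) => void,
): Session[] {
  const kept: Session[] = [];
  for (const session of parsed) {
    persistFullText(session);
    kept.push(stripSession(session));
  }
  return kept;
}
```

One caveat worth checking in a real implementation: V8 may represent a slice as a view onto the original string, which would keep the full messageText alive. Forcing a copy of the truncated text might be needed for the saving to materialize.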

Strategy 3: Per-workspace incremental cache

Persist cache fragments per workspace directory instead of one monolithic cache file. On stale cache, only re-parse the changed workspaces. This makes cold parse situations rare for established users.
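A sketch of how stale workspaces might be selected, assuming each workspace directory has a string fingerprint (the log mentions a "Computing directory fingerprints" phase, but the fingerprint format and the findStaleWorkspaces helper are hypothetical).

```typescript
// Maps a workspace directory to its fingerprint (format assumed).
type Fingerprints = Record<string, string>;

interface CachePlan {
  stale: string[];   // new or changed: must be re-parsed
  removed: string[]; // cached but gone: fragment can be deleted
}

// Compare cached fingerprints with freshly computed ones so that only
// new or changed workspace directories are re-parsed.
function findStaleWorkspaces(cached: Fingerprints, current: Fingerprints): CachePlan {
  const stale = Object.keys(current).filter((dir) => cached[dir] !== current[dir]);
  const removed = Object.keys(cached).filter((dir) => !(dir in current));
  return { stale, removed };
}

const plan = findStaleWorkspaces(
  { "ws-a": "f1", "ws-b": "f2", "ws-c": "f3" },
  { "ws-a": "f1", "ws-b": "changed", "ws-d": "f4" },
);
console.log(plan); // { stale: [ 'ws-b', 'ws-d' ], removed: [ 'ws-c' ] }
```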

Strategy 4: Reduce prefetch pressure during cold parse

During a cold parse (no cache hit), reduce MAX_PREFETCH_FILES or disable prefetch entirely since the memory budget is already under pressure from the growing sessions array.
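A possible shape for this, sketched as a byte-budgeted selection. The selectPrefetch helper, the FileInfo type, and the 64 MB cold-parse budget are assumptions for illustration, not values from the repo.

```typescript
// Values copied from the constants listed in this issue.
const MAX_PREFETCH_FILES = 600;
const MAX_PREFETCH_FILE_SIZE = 20 * 1024 * 1024;

interface FileInfo {
  path: string;
  size: number; // bytes
}

// Choose which files to prefetch. During a cold parse, an extra total
// byte budget applies on top of the existing count and size caps.
function selectPrefetch(
  files: FileInfo[],
  coldParse: boolean,
  coldBudgetBytes: number = 64 * 1024 * 1024, // assumed budget
): FileInfo[] {
  const selected: FileInfo[] = [];
  let totalBytes = 0;
  for (const file of files) {
    if (selected.length >= MAX_PREFETCH_FILES) break;
    if (file.size > MAX_PREFETCH_FILE_SIZE) continue;
    if (coldParse && totalBytes + file.size > coldBudgetBytes) continue;
    selected.push(file);
    totalBytes += file.size;
  }
  return selected;
}
```

Passing a coldBudgetBytes of 0 disables prefetch entirely during a cold parse, which covers the second option mentioned above.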

Strategy 5: Write result to temp file instead of IPC

Have the child write the serialized result to a temp file (streaming JSON or msgpack), then signal the parent to read it. This avoids Node.js IPC buffering, and because the output is written incrementally, the child never has to hold one complete serialized string in memory and can let already-written objects be garbage collected.
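A minimal sketch using newline-delimited JSON, one of several possible streaming formats. The function names, the file name, and the Session shape are hypothetical. Synchronous fs calls keep the example short; a real implementation would likely use streams on both sides.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Hypothetical, simplified session shape for illustration only.
interface Session {
  id: string;
  messageText: string;
}

// Child side: write one JSON document per line so that only a single
// session is serialized at any moment. Returns the temp file path,
// which the child would then send to the parent in a small message.
function writeSessionsToTempFile(sessions: Iterable<Session>): string {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "parse-result-"));
  const file = path.join(dir, "sessions.ndjson");
  const fd = fs.openSync(file, "w");
  try {
    for (const session of sessions) {
      // JSON.stringify escapes newlines, so each record stays on one line.
      fs.writeSync(fd, JSON.stringify(session) + "\n");
    }
  } finally {
    fs.closeSync(fd);
  }
  return file;
}

// Parent side: read the records back and remove the temp directory.
// Reading the whole file at once is a simplification for this sketch.
function readSessionsFromTempFile(file: string): Session[] {
  const sessions: Session[] = [];
  for (const line of fs.readFileSync(file, "utf8").split("\n")) {
    if (line.length > 0) sessions.push(JSON.parse(line) as Session);
  }
  fs.rmSync(path.dirname(file), { recursive: true, force: true });
  return sessions;
}
```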

Breaking Change Assessment

Fixing this is NOT a breaking change. All proposed strategies are internal implementation changes:

  • No public API surface is affected
  • No configuration schema changes required
  • No data format changes, except under Strategy 3 as noted below (the cache format is already versioned with CACHE_VERSION)
  • The user-facing behavior improves (sync completes instead of crashing)

However, if Strategy 3 (per-workspace cache) is adopted, the cache format will change:

  • Old monolithic cache file should be migrated or invalidated gracefully
  • Bump CACHE_VERSION to trigger re-parse on first run after update
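Graceful invalidation could be as simple as treating any unreadable or mismatched cache as a miss. A sketch, assuming the cache is a JSON document with a version field; the CacheFile shape, the version value, and the loadCacheOrInvalidate helper are hypothetical.

```typescript
// Hypothetical current version; the real CACHE_VERSION value is not
// given in this issue.
const CACHE_VERSION = 2;

// Assumed on-disk shape for illustration only.
interface CacheFile {
  version: number;
  sessions: unknown[];
}

// Returns the cache when it matches the current version, otherwise
// undefined so the caller falls back to a re-parse instead of failing.
function loadCacheOrInvalidate(raw: string | undefined): CacheFile | undefined {
  if (raw === undefined) return undefined; // no cache file on disk
  try {
    const parsed = JSON.parse(raw) as Partial<CacheFile>;
    if (parsed.version !== CACHE_VERSION || !Array.isArray(parsed.sessions)) {
      return undefined; // old or unexpected format
    }
    return { version: CACHE_VERSION, sessions: parsed.sessions };
  } catch {
    return undefined; // corrupt or non-object content
  }
}
```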

Testing Requirements

Platforms to Test

| Platform | Priority | Reason |
| --- | --- | --- |
| Windows (x64) | P0 | Exit code 0xE0000008 is Windows-specific; this is the reported platform |
| Windows (ARM64) | P1 | Same V8 behavior expected, different memory characteristics |
| macOS (Apple Silicon) | P1 | Different memory allocator; OOM signal will be different (SIGABRT/SIGKILL) |
| macOS (Intel) | P2 | Legacy but still used |
| Linux (x64) | P1 | Common CI/remote dev environment; OOM killer behavior differs |

Test Scenarios

  1. Large session volume cold parse — 3+ log directories, 100+ workspaces, invalidated cache → verify parse completes without OOM
  2. Memory regression — Measure peak RSS/heap before and after fix with the same dataset
  3. Incremental parse after fix — Ensure partial cache invalidation correctly re-parses only stale workspaces
  4. IPC integrity — If chunked IPC is adopted, verify all sessions arrive correctly and ordering is preserved
  5. Graceful degradation — If memory is still tight, verify the error message is user-friendly and suggests actions (e.g., "clear old sessions")
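For the graceful-degradation scenario, the raw exit status could be mapped to a friendlier message before it reaches the panel. A sketch, with a hypothetical describeWorkerExit helper and made-up message wording. Treating SIGABRT and SIGKILL as likely out-of-memory is a heuristic based on the platform table above; those signals can have other causes.

```typescript
// Exit code reported in this issue (3758096392).
const WINDOWS_OOM_EXIT_CODE = 0xe0000008;

// Translate a child exit into a message suitable for the webview panel.
// code and signal follow the shape of Node's ChildProcess 'exit' event.
function describeWorkerExit(code: number | null, signal: string | null): string {
  const likelyOom =
    code === WINDOWS_OOM_EXIT_CODE || signal === "SIGABRT" || signal === "SIGKILL";
  if (likelyOom) {
    return "Sync ran out of memory while parsing session history. " +
      "Try clearing old sessions and syncing again.";
  }
  if (code === 0) return "Sync completed.";
  return `Sync failed (exit code ${code ?? "none"}, signal ${signal ?? "none"}).`;
}

console.log(describeWorkerExit(3758096392, null));
```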

Benchmark Script

The repo already has scripts/benchmark-memory.ts, which should be extended to cover the cold-parse path with a realistic dataset size.
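The contents of scripts/benchmark-memory.ts are not shown here, so the following is only a sketch of how peak usage could be sampled around a parse run with process.memoryUsage(). The PeakMemoryTracker class and measurePeak helper are hypothetical names.

```typescript
const BYTES_PER_MB = 1024 * 1024;

interface PeakMemory {
  rssMb: number;
  heapUsedMb: number;
}

// Records the highest RSS and heap usage seen across all samples.
class PeakMemoryTracker {
  private peak: PeakMemory = { rssMb: 0, heapUsedMb: 0 };

  sample(): void {
    const { rss, heapUsed } = process.memoryUsage();
    this.peak.rssMb = Math.max(this.peak.rssMb, rss / BYTES_PER_MB);
    this.peak.heapUsedMb = Math.max(this.peak.heapUsedMb, heapUsed / BYTES_PER_MB);
  }

  get result(): PeakMemory {
    return { ...this.peak };
  }
}

// Runs the work, letting it call sample() at interesting points
// (for example after each workspace), and samples before and after.
function measurePeak<T>(work: (sample: () => void) => T): { value: T; peak: PeakMemory } {
  const tracker = new PeakMemoryTracker();
  tracker.sample();
  const value = work(() => tracker.sample());
  tracker.sample();
  return { value, peak: tracker.result };
}
```

Sampling only sees memory at the moments sample() is called, so short-lived spikes such as a single large JSON.stringify can be missed unless a sample is taken right after them.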

Environment

  • OS: Windows 11 (x64)
  • VS Code Extension: ai-engineer-coach.ai-engineer-coach-0.1.0
  • Node.js: (VS Code embedded runtime)
  • Log directories: 3
  • Peak RSS at crash: 934.2 MB (as logged by pid 60700, the process that emits every log line above)
  • Peak heap at crash: 526.5 MB / 580.5 MB (used/committed, same process)
