fix(orchestrator): serialize auto-compact self-dispatch via GroupQueue.enqueueTask (LIA-367)#985

Merged: sliamh11 merged 2 commits into main from fix-auto-compact-concurrency on Jul 3, 2026

sliamh11 (Owner) commented Jul 3, 2026

Summary

  • The /compact self-dispatch in message-orchestrator.ts fired runAgent(group, '/compact', ...) un-awaited, mid-callback, guarded only by a closure-scoped Set<string>. This bypassed GroupQueue's one-container-per-group serialization entirely: a second container could spawn and race the primary turn's container on the same host IPC directory (neither call site sets ipcRunKey, so both resolve to the identical unscoped path).
  • This PR reroutes the dispatch through the existing GroupQueue.enqueueTask primitive, the same pattern odysseus-server.ts and task-scheduler.ts already use to serialize a self-dispatch against the group's normal turn. The compact dispatch now fires only after the primary runAgent call has fully resolved. Because GroupQueue.state.active is still true at that point, enqueueTask always queues the task rather than running it concurrently, and drainGroup picks it up under full serialization once the primary container's state has been cleanly reset.
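
The ordering guarantee described in the summary can be illustrated with a minimal sketch. This is not the repository's GroupQueue: MiniGroupQueue, processTurn, drain, fakeRunAgent, and demo are hypothetical names, and only the enqueueTask name and the "queue while the group is active" behavior are taken from the description above.

```typescript
// Hypothetical, simplified stand-in for GroupQueue. Only the name enqueueTask
// and the "queue while the group is active" behavior come from the PR text.
type Task = () => Promise<void>;

class MiniGroupQueue {
  private active = new Set<string>();
  private pending = new Map<string, Task[]>();

  // Process one turn for a group, then drain whatever was queued meanwhile.
  async processTurn(group: string, turn: Task): Promise<void> {
    this.active.add(group);
    try {
      await turn();
    } finally {
      await this.drain(group);
    }
  }

  // While the group is active the task is queued; otherwise it runs as its own turn.
  enqueueTask(group: string, task: Task): void {
    if (this.active.has(group)) {
      const tasks = this.pending.get(group) ?? [];
      tasks.push(task);
      this.pending.set(group, tasks);
      return;
    }
    void this.processTurn(group, task);
  }

  // Run queued tasks one at a time, then mark the group idle.
  private async drain(group: string): Promise<void> {
    const tasks = this.pending.get(group) ?? [];
    try {
      while (tasks.length > 0) {
        const next = tasks.shift()!;
        await next();
      }
    } finally {
      this.active.delete(group);
    }
  }
}

// Stand-in for runAgent: it invokes the callback mid-run, before it resolves.
async function fakeRunAgent(log: string[], onResult: () => void): Promise<void> {
  log.push("primary:start");
  onResult();
  await Promise.resolve();
  log.push("primary:end");
}

async function demo(): Promise<string[]> {
  const log: string[] = [];
  const queue = new MiniGroupQueue();
  const state = { compactRequested: false };

  await queue.processTurn("group-a", async () => {
    // The callback only records the request; it does not start a second run.
    await fakeRunAgent(log, () => {
      state.compactRequested = true;
    });
    // The primary run has resolved, but the group is still marked active,
    // so the compact task is queued instead of running concurrently.
    if (state.compactRequested) {
      queue.enqueueTask("group-a", async () => {
        log.push("compact");
      });
    }
  });
  return log;
}

demo().then((log) => console.log(log.join(" -> ")));
```

In this sketch the compact task always runs after the primary run has finished, because the queue drains pending tasks only once the turn function has settled.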

Important context: this specific race is currently unreachable in production

Traced the full data path for result.contextStats/result.compactionEvent (the fields that gate the dispatch): ContainerRuntime.runTurn's onOutput only forwards output_text/activity/session/error/turn_complete; RuntimeEvent has no variant carrying contextStats/compactionEvent at all; runAgent's eventSink only synthesizes {status,result}-shaped callback objects. So result.contextStats?.autoCompact is always undefined today — confirmed independently by an existing code comment at src/config.ts:184-191 ("the runtime eventSink does not forward contextStats... tracked follow-up") and by git log -S contextStats showing zero commits ever wiring that forwarding path since the auto-compact feature was introduced (a08d0e71, LIA-94).

This means DEUS's own proactive auto-compact trigger has been silently inert since introduction — the SDK's own internal auto-compaction (a separate mechanism) still works, so nothing is on fire. This PR ships as cheap, precedented preventive hardening for a landmine that would activate the moment the forwarding gap closes (a follow-up ticket will track that separately) rather than as a fix for an actively-reproducing bug — flagging so reviewers don't expect an observable behavior change or an E2E repro.
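
A rough sketch of why the gate never opens, assuming simplified shapes: the five event kinds and the contextStats/compactionEvent field names come from the trace above, while the payload fields, CallbackResult, toCallbackResult, and shouldDispatchCompact are hypothetical and only illustrate the gap.

```typescript
// Assumption: payload fields are illustrative. The point is that no variant
// of the event union carries contextStats or compactionEvent.
type RuntimeEvent =
  | { type: "output_text"; text: string }
  | { type: "activity" }
  | { type: "session"; sessionId: string }
  | { type: "error"; message: string }
  | { type: "turn_complete" };

// Hypothetical {status, result}-shaped callback object.
interface CallbackResult {
  status: "success" | "error";
  result: string | null;
  contextStats?: { autoCompact?: unknown };
  compactionEvent?: unknown;
}

// Hypothetical event sink: it can only build results from what events carry,
// so contextStats and compactionEvent are never populated.
function toCallbackResult(event: RuntimeEvent): CallbackResult | null {
  switch (event.type) {
    case "output_text":
      return { status: "success", result: event.text };
    case "error":
      return { status: "error", result: event.message };
    case "turn_complete":
      return { status: "success", result: null };
    default:
      return null; // activity and session events produce no result object
  }
}

// Hypothetical gate mirroring the result.contextStats?.autoCompact check.
function shouldDispatchCompact(result: CallbackResult): boolean {
  return result.contextStats?.autoCompact !== undefined;
}

const sampleEvents: RuntimeEvent[] = [
  { type: "output_text", text: "hello" },
  { type: "activity" },
  { type: "session", sessionId: "s-1" },
  { type: "error", message: "boom" },
  { type: "turn_complete" },
];

// True if any event would open the gate; with this union it cannot.
const anyDispatch = sampleEvents.some((event) => {
  const result = toCallbackResult(event);
  return result !== null && shouldDispatchCompact(result);
});
console.log("compact dispatch reachable:", anyDispatch);
```

Closing the forwarding gap would amount to adding a variant (or field) that carries these values, at which point the gate in this sketch could start returning true.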

Test plan

  • npx tsc --noEmit && npx tsc — clean
  • npx vitest run — 109 files, 1942 tests, 0 failures
  • New tests: message-orchestrator.test.ts (regression-pin: real event pipeline never calls queue.enqueueTask, documenting current unreachability), group-queue.test.ts (pending-task dedup branch, previously only covered for running tasks)
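
The two dedup branches mentioned in the last bullet can be pictured with a small model. DedupTracker and its methods are hypothetical names, not the group-queue.ts implementation; only the idea that a task id is rejected both while running and while pending comes from the test plan.

```typescript
// Hypothetical model of task-id dedup with a running branch and a pending branch.
class DedupTracker {
  private running = new Set<string>();
  private pending: string[] = [];

  // Returns true if the task was accepted, false if it is a duplicate.
  offer(taskId: string): boolean {
    if (this.running.has(taskId)) return false; // running-task branch
    if (this.pending.includes(taskId)) return false; // pending-task branch
    this.pending.push(taskId);
    return true;
  }

  // Move the oldest pending task to running and return its id, if any.
  startNext(): string | undefined {
    const taskId = this.pending.shift();
    if (taskId !== undefined) this.running.add(taskId);
    return taskId;
  }

  // Mark a running task as finished so the same id can be offered again.
  finish(taskId: string): void {
    this.running.delete(taskId);
  }
}

const tracker = new DedupTracker();
const firstOffer = tracker.offer("auto-compact"); // accepted
const duplicateWhilePending = tracker.offer("auto-compact"); // rejected: pending
const started = tracker.startNext();
const duplicateWhileRunning = tracker.offer("auto-compact"); // rejected: running
tracker.finish("auto-compact");
const offerAfterFinish = tracker.offer("auto-compact"); // accepted again
console.log({ firstOffer, duplicateWhilePending, started, duplicateWhileRunning, offerAfterFinish });
```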

Note on merge order

This PR and #984 (silent agent-run failure notice, separate PR) both touch message-orchestrator.test.ts's makeQueue() test helper (different new keys) and the top of createMessageOrchestrator. Whichever merges second will need a small rebase to reconcile both.

🤖 Generated with Claude Code

sliamh11 and others added 2 commits July 3, 2026 21:07
…e.enqueueTask (LIA-367)

The /compact self-dispatch fired un-awaited mid-callback, bypassing
GroupQueue's one-container-per-group serialization and racing the primary
turn's container on the same unscoped IPC directory. Defer the dispatch
until the primary turn returns and route it through the existing
enqueueTask primitive (already used identically in odysseus-server.ts and
task-scheduler.ts), which always queues while the group is still active.

Note: contextStats/compactionEvent never reach this callback today (the
RuntimeEvent/eventSink plumbing doesn't forward them, see config.ts:184-191),
so this is preventive hardening for a currently-dead code path, not a fix
for an active race — a follow-up ticket will track closing that forwarding
gap so DEUS's own auto-compact trigger can actually fire.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
github-actions (bot) added the core label Jul 3, 2026
@sliamh11 sliamh11 merged commit feb4683 into main Jul 3, 2026
15 checks passed