
Refactor context handling, improve event types, and enhance docs#45

Merged
fuseraft merged 24 commits into main from cleanup on Jun 13, 2026

Conversation

@fuseraft

Owner

Summary

Broad cleanup and hardening touching three areas:

  1. completed the event system: 55 new EventTypes constants wired across orchestrators, plus checkpointing, model invocation, and cancellation events;
  2. improved context quality with a lost-in-the-middle mitigation (Context Manifest, task reminder sandwich, enriched tombstones);
  3. fixed a batch of correctness bugs in SubAgentPlugin, FileSystemPlugin, context pipeline, and terminal output rendering.

Type of change

  • Bug fix
  • New feature
  • Plugin
  • Config example
  • Documentation
  • Refactor / cleanup
  • Other:

Related issues

No related issue

Changes

  • Checkpointing, model, and cancellation events: wired checkpoint_*, model_call/response/error/timeout, and cancellation_observed emit sites in RunCommand, SessionRunner, AgentFactory, and GraphOrchestrator
  • Lost-in-the-Middle mitigation: ApplyWithManifest appends a [Context Manifest] at the recency end; ContextAssembler repeats a brief task reminder past 2,000 chars; tombstones now include the evicted tool label and a 300-char content preview
  • Context pipeline fixes: session context moved to the recency boundary; fixed the TrimToWindow drain bug (an off-by-one that allowed both messages in a pair to be removed); Pipeline Knowledge is no longer re-appended every turn
  • FileSystemPlugin state invalidation: delete_file, delete_directory, move_file, copy_file, write_file, and patch_file now correctly evict stale entries from per-turn sets, session cache, version store, and summary cache; adds FileVersionStore.RemoveAsync
  • SubAgentPlugin correctness: omits null token fields on the streaming path, gives Locate its own 2-minute timeout cap, bounds the Diagnose transcript with TakeLast(40), filters tool priority lists dynamically, wraps EmitAsync calls in catch blocks in their own try/catch, emits sub_agent_end on silent catches, and adds a stable workspaceRoot param
  • Display noise: budget warnings no longer wrap inside spinner context; tool-failure previews capped to first line at 60 chars; compaction internals downgraded to Debug
  • Viz: per-turn token bar chart added to ctx_viz.html; events.jsonl loaded alongside ctx_snapshots.jsonl with validation_fail/tool_blocked annotations
  • Docs: AGENTS.md, docs/context-management.md, and docs/sessions.md updated; playwright-mcp config example added
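
The lost-in-the-middle mitigation described above can be illustrated with a minimal sketch. This is not the project's C# implementation: the function names `make_tombstone` and `assemble`, the tombstone wording, and the `[Task Reminder]` label are hypothetical. Only the 2,000-char threshold, the tool label, and the 300-char preview come from this PR description.

```python
TASK_REMINDER_THRESHOLD = 2_000  # chars; figure taken from the PR description
TOMBSTONE_PREVIEW_CHARS = 300    # preview length taken from the PR description


def make_tombstone(tool_label: str, content: str) -> str:
    """Stand in for an evicted tool result with its label and a short preview."""
    preview = content[:TOMBSTONE_PREVIEW_CHARS]
    # The exact tombstone wording is an assumption made for illustration.
    return f"[evicted: {tool_label}] preview: {preview}"


def assemble(task: str, history: list[str]) -> str:
    """Place the task first and repeat it last once the context grows long."""
    body = "\n".join([task, *history])
    if len(body) > TASK_REMINDER_THRESHOLD:
        # Sandwich: the objective now sits at both the primacy and recency ends.
        body += f"\n[Task Reminder] {task}"
    return body
```

Short contexts come back unchanged, so the reminder only costs tokens when the objective is at risk of being buried mid-history.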

Testing

  • Added or updated unit tests
  • All tests pass locally (./build.sh --target=Test)
  • Tested manually end-to-end
  • Documentation updated where relevant
  • config/examples/ updated if config schema changed

Invariants

  • Execution order unchanged (Selection → Validation → Failure handling → Termination → Iteration cap)
  • Any new file/shell/git tool is wrapped by ChangeTracker
  • Any new validator is deterministic, side-effect free, and idempotent
  • Physical history is not stripped or reordered outside of compaction

Notes for reviewers

Scott Stauffer added 24 commits June 10, 2026 10:03
- ApplyWithManifest appends a [Context Manifest] at the recency end of the
  prompt, listing active and superseded tool results so the model knows
  what it can still see without re-reading full history
- ContextAssembler repeats a brief task reminder at the end of any
  assembled context exceeding 2,000 chars, sandwiching the objective
  at both the primacy and recency positions
- Tombstones now include the evicted tool label and a 300-char content
  preview so the model can judge whether to re-read without issuing a
  blind full-file read
- ApplyWithManifest previously called Apply then rebuilt callLabels with a
  second scan of the original context; ApplyCore does one pass and returns
  the map + evicted flag, eliminating the redundant work
- extract task-reminder thresholds into named constants in ContextAssembler
  to make the intent readable without inline comments
- add tests for the all-evicted manifest path and the disabled-budget
  same-reference fast path
- Task Reminder and Context Manifest are non-obvious behavioral features
  that affect how agents perceive their own context; worth calling out so
  contributors know to preserve the primacy+recency sandwich invariant
- Tombstone format changed (now includes label + preview) — example in
  the doc prevents stale assumptions about the old single-line format
- Add ContextAssembler and ToolResultWindowTrimmer to Where-to-look table
- Tombstone example was the old one-liner; now shows the enriched format
  with tool label and content preview that ships in this branch
- Context Manifest (appended when evictions occur) was not documented
- Task Reminder step was missing from both pipeline overview and diagram
- Session context was injected at position 1, burying it mid-history where
  models attend least; moved to recency boundary so it is read last
- TrimToWindow could drain the final message pair: the assistant removal
  only checked start < list.Count, not start + 1 < list.Count, allowing
  both messages to be removed in a single loop iteration
- Pipeline Knowledge was re-appended every turn regardless of whether
  identical content already existed in history, compounding context growth
- Roslyn on Windows sees two resolution paths for ContextAddCommand when
  both fuseraft.Cli.Commands and fuseraft.Cli.Commands.Context are imported,
  triggering an ambiguous reference error; type aliases make each name
  resolve to exactly one type on all platforms
- Bar chart at top of ctx_viz.html shows input and output tokens per
  turn with toggle buttons to show/hide each dataset independently
- Events.jsonl is now loaded alongside ctx_snapshots.jsonl; validation_fail
  and tool_blocked events render as annotated vertical lines on both charts
- Context assembly details (chars, tool count, assembly time) from events.jsonl
  surface in bar and line chart tooltips when present
- Only four event types are loaded (turn_end, validation_fail, tool_blocked,
  context_assembly) to keep embedded HTML size reasonable
- Inline string literals scattered across 18 files created silent
  breakage risk: a typo in any emitter or consumer would compile fine
  but produce events that no hook, filter, or viz would ever match
- const string fields work as switch patterns (unlike static readonly),
  so EventLogViewer's switch expressions adopt them without ceremony
- Eliminates 47+ raw string literals across Cli, Infrastructure, and
  Orchestration layers — typos caused silent data gaps in events.jsonl
  with no compile-time signal
- Extends EventTypes with all missing constants so every event string
  has a single authoritative definition
- Aligns naming with context_budget_warn so the distinction is obvious:
  budget = token accumulation, window = message-count cap
- Both events signaled the same condition from different layers; a
  source field in the payload now distinguishes graph_orchestrator,
  keyword_strategy, and state_machine_strategy without a separate type
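
As a rough illustration of the centralised-constants idea in the bullets above, the sketch below keeps every event string in one class and tags each event with a `source` field. It is a Python analogy, not the project's `EventTypes` class: the helper names `make_event` and `count_by_type` are hypothetical, and attaching `source` to every event is an assumption made for brevity. The four event strings and the three source values are the ones named in this PR.

```python
from collections import Counter


class EventTypes:
    """One authoritative definition per event string."""
    TURN_END = "turn_end"
    VALIDATION_FAIL = "validation_fail"
    TOOL_BLOCKED = "tool_blocked"
    CONTEXT_ASSEMBLY = "context_assembly"


def make_event(event_type: str, source: str, **payload) -> dict:
    """Build an event record; `source` says which layer emitted it."""
    return {"type": event_type, "source": source, **payload}


def count_by_type(events: list[dict]) -> Counter:
    """Consumer side: tally events by their type string."""
    return Counter(e["type"] for e in events)


events = [
    make_event(EventTypes.TURN_END, "graph_orchestrator", turn=1),
    make_event(EventTypes.TOOL_BLOCKED, "keyword_strategy", tool="shell"),
    make_event(EventTypes.TURN_END, "state_machine_strategy", turn=2),
]
counts = count_by_type(events)
```

A misspelled attribute such as `EventTypes.TURN_EDN` fails loudly with an `AttributeError`, whereas a misspelled inline literal would simply never match any consumer.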
Defines agent lifecycle (AgentStart/End/Error/Timeout), tool completion
(ToolResult/Error/Timeout), retry (RetryScheduled/Attempt/Exhausted),
termination (TerminationSatisfied/MaxTurnsExceeded), parallel branches
(ParallelBranchStart/End/Error), HITL outcomes (HitlApproved/Rejected/
Resolved), model invocation, selection strategy, knowledge retrieval,
artifact lifecycle, and checkpointing/replay constants.

Wires emit calls in AgentOrchestrator, GraphOrchestrator, ChangeTracker
(CapturingMiddleware), and SessionRunner. Adds four SessionRunner tests
covering CancellationRequested, MaxTurnsExceeded, and HitlResolved
(agent-blocked and validator-stuck paths) using hook-based
synchronization for fire-and-forget emits.
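
The hook-based synchronization mentioned above could look roughly like the following asyncio sketch. It assumes a toy `Emitter` whose `emit` schedules delivery and returns immediately; the class, the `wait_for_event` helper, and the `run_demo` wrapper are all hypothetical and are not the project's test code. The point is that a test awaits a hook firing instead of sleeping for an arbitrary interval.

```python
import asyncio


class Emitter:
    """Toy fire-and-forget emitter (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.hooks = []
        self._tasks = set()

    def emit(self, event_type: str) -> None:
        """Schedule delivery and return without waiting for it."""
        task = asyncio.get_running_loop().create_task(self._deliver(event_type))
        self._tasks.add(task)  # keep a reference so the task is not collected
        task.add_done_callback(self._tasks.discard)

    async def _deliver(self, event_type: str) -> None:
        await asyncio.sleep(0)  # yield once, so delivery is truly deferred
        for hook in self.hooks:
            hook(event_type)


async def wait_for_event(emitter: Emitter, wanted: str, trigger, timeout: float = 1.0) -> str:
    """Register a hook, run the trigger, then await the hook firing."""
    seen = asyncio.get_running_loop().create_future()

    def hook(event_type: str) -> None:
        if event_type == wanted and not seen.done():
            seen.set_result(event_type)

    emitter.hooks.append(hook)
    trigger()
    return await asyncio.wait_for(seen, timeout)


def run_demo() -> str:
    async def main() -> str:
        emitter = Emitter()
        return await wait_for_event(
            emitter,
            "cancellation_requested",
            lambda: emitter.emit("cancellation_requested"),
        )

    return asyncio.run(main())
```

The timeout turns a missing emit into a test failure rather than a hang.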

- Checkpoint lifecycle (created/loaded/resume_started/resume_completed,
  event_replay_start/complete) needed emission sites; wired in RunCommand
  and SessionRunner where the emitter is already in scope
- Model invocation events (model_call/response/error/timeout) wired in
  AgentFactory middleware chain alongside existing inner_call_context;
  model_timeout also added to GraphOrchestrator streaming catch sites
  so SSE idle timeouts surface at the model level, not just turn/agent
- cancellation_observed distinguishes clean inter-turn stops from
  mid-turn OperationCanceledException (cancellation_requested); wired
  at the proactive IsCancellationRequested check in SessionRunner
- JsonSessionStore gains OnCorruptionDetected callback for
  event_corruption_detected; avoids circular dependency by using a
  Func delegate instead of a direct EventEmitter reference
- EventEmitter docblock updated to reference EventTypes instead of
  listing a stale hand-picked subset
- docs/sessions.md gains an orchestration event types reference table
  covering all wired event groups
- Eliminates typo-induced silent failures by centralising the canonical
  strings in one place; a rename now only touches OrchestratorTypes.cs
- Covers ValidateConfigCommand, WorkflowDiagramGenerator,
  OrchestratorBuilder, StrategyFactory, and KeywordSelectionStrategy
- Eliminates typo-induced silent failures; a mode rename now only
  touches CompactionModes.cs
- CompactionConfig.Mode default left as a bare string to avoid
  introducing a Core.Models → Orchestration layer dependency
- Mirrors the OrchestratorTypes/CompactionModes pattern to eliminate
  scattered inline literals that could silently diverge
- Capitalizes "orchestrator" → "Orchestrator" for consistency with all
  other reserved names (System, Human, Assistant, Verifier, Unknown)
- Eliminates scattered inline string literals for validator names, matching the pattern established by AgentNames and CompactionModes
- Replaces ToLowerInvariant switch in GraphOrchestrator with OrdinalIgnoreCase if/else chain for consistency with StrategyFactory
- Replaces raw "assistant"/"user" string literals with MessageRole constants to eliminate scatter across orchestrators and CLI
- Moves CompactionReason out of CompactionCoordinator into its own file, consistent with CompactionModes and similar constants classes
- delete_file, delete_directory, move_file, and copy_file left stale
  entries in per-turn sets, session cache, version store, and summary
  cache; subsequent reads or version checks on affected paths could
  return incorrect hints or conflict errors
- write_file and patch_file did not invalidate the summary cache, so
  get_file_summary could return an outdated cached summary indefinitely
  after edits
- add FileVersionStore.RemoveAsync so deleted/moved paths are evicted
  from the version store rather than leaving a stale version number
- Streaming sub_agent_end no longer emits null token fields — ChatResponseUpdate
  has no Usage property, so token fields are omitted on the streaming path
- Locate borrowed explore's 8-minute timeout; now has its own 2-minute cap via
  a new timeoutMinutes param on RunLoopAsync
- Diagnose transcript was unbounded; TakeLast(40) prevents context explosion on
  long sessions
- Tool priority lists in prompts were hardcoded, so the model could be
  instructed to call tools that weren't registered; now filtered dynamically
- EmitAsync calls in catch blocks could swallow the return value if the emitter
  threw; each is now wrapped in its own try/catch
- Silent catch in DiagnoseAsync and CriticReviewAsync now emits sub_agent_end
  with outcome=error so failures are visible in the event log
- workspaceRoot constructor param added so callers can pass a stable path
  instead of relying on Directory.GetCurrentDirectory()
- Budget warning messages were single long strings that wrapped inside the
  Spectre Status context, causing the spinner's \r\x1b[2K to clobber the
  second visual line and bleed residual tool-call text into warnings
- WRN tool-failure previews showed 120 chars of shell output with newlines
  replaced by spaces, wrapping to three terminal lines; now shows only the
  first line of the result, capped at 60 chars
- Compaction internals logged at Info were redundant with the user-facing
  AnsiConsole messages the coordinator already prints; downgraded to Debug
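
The tool-failure preview rule above (first line only, capped at 60 chars) can be sketched as follows. The function name `failure_preview` is hypothetical, and whether the real code appends an ellipsis or trims whitespace is not stated in this PR, so the sketch does a plain cut after stripping the surrounding whitespace.

```python
def failure_preview(result: str, limit: int = 60) -> str:
    """Return only the first line of a tool result, capped at `limit` chars."""
    lines = result.strip().splitlines()
    first = lines[0] if lines else ""  # empty output yields an empty preview
    return first[:limit]
```

Taking the first line before truncating keeps the preview on a single terminal row even when the shell output is multi-line.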
@fuseraft fuseraft merged commit abca9d7 into main Jun 13, 2026
3 checks passed