Skip to content

feat(notifications): broadcast hardening + informativeness (combines #39 + #40)#41

Merged
artyomsv merged 13 commits into
masterfrom
feat/notifications-with-broadcast-hardening
Jun 8, 2026
Merged

feat(notifications): broadcast hardening + informativeness (combines #39 + #40)#41
artyomsv merged 13 commits into
masterfrom
feat/notifications-with-broadcast-hardening

Conversation

@artyomsv

@artyomsv artyomsv commented Jun 8, 2026

Copy link
Copy Markdown
Owner

Summary

Combines PR #39 (notifications informativeness) and PR #40 (broadcast hardening) into a single PR with clean linear history. Broadcast wedge defense lands first as the foundation; notification UX work builds on top.

Closes #39, closes #40.

What's in here

Phase A — broadcast hardening (foundation)

  • Per-conn 64-slot buffered send goroutine; Broadcast is non-blocking
  • Marshal once, share frame across N conns with slices.Clone defensive copy
  • CAS-guarded overflow: one log, one async close per slow conn, zero broadcast stall
  • 4 KiB per-event Message cap + 1 KiB per-Data-value cap (raised from 128 B to fit multi-line data.excerpt)
  • Reserved _quil_truncated namespace flag with compile-time invariant guard
  • 30 s SetWriteDeadline belt-and-suspenders
  • Server.ConnCount() test accessor
  • logger.Warn / logger.Error for new code (project leveled logger)

Notifications informativeness

  • Events carry excerpts (Message + data.excerpt); sidebar renders excerpt under title
  • Per-pane mute (Alt+M, persisted in workspace.json)
  • Active-pane idle suppression (TUI-side filter)
  • (PaneID, Title) aggregation with ×N badge; AddEvent updates in place + moves to front
  • \r-reset cleanup in lastNLines (handles OSC 0 window-title leak)
  • Prompt-only excerpt suppression (recognises %, $, , hostname patterns)
  • ANSI boundary scan also seeks ESC byte (handles claude-code TUI's no-newline regions)
  • Default notification.max_events raised 50 → 200
  • Dead [[notification_handlers]] emits one-shot deprecation warning on plugin load

MCP surface

  • New dismiss_notifications tool — agents can ack events
  • watch_notifications gains since_timestamp — closes the race between agent action and watcher registration
  • All events carry data.excerpt with full multi-line content (fits within 1 KiB Data cap)

Commit list (8 commits, in foundation-first order)

  1. 54125eb fix(ipc): non-blocking broadcast + 4 KiB per-event size cap
  2. 8b177af fix(ipc): address review findings on PR fix(ipc): non-blocking broadcast + 4 KiB per-event size cap #40
  3. af85f36 feat(notifications): excerpts, mute, MCP dismiss + since_timestamp
  4. 8711c14 fix(notifications): ANSI boundary, prompt-only suppress, aggregation
  5. 01d2d70 fix(notifications): seek ESC byte in ANSI trim, log idle decisions
  6. 5340412 fix(notifications): CR-reset cleanup + OSC hostname leak as prompt
  7. 2e38d1a fix(daemon): raise Data value cap to 1 KiB for multi-line excerpts

Test plan

  • go vet ./... clean
  • go test ./... — all 17 packages pass
  • go test -race ./... — all packages pass under race detector
  • Comprehensive code review (security-officer, code-reviewer, rules-compliance, qa) on the broadcast hardening portion; all Critical/High/Medium findings addressed in commit 8b177af
  • 14 new tests in scope (broadcast resilience, conn internal, size cap boundary tests, excerpt rendering, mute, aggregation, find-since, active-pane suppression)
  • Manual smoke (when reviewed): launch dev TUI, exercise Alt+M mute, verify sidebar excerpts render under titles, verify ×N badges aggregate, kick off slow MCP client and confirm daemon stays responsive

What's left for PR #41

  • Phase B + C + D from the hook-events plan (~/.claude/plans/velvety-bubbling-twilight.md): the JSONL spool watcher, Claude/OpenCode hook script extensions registering the opinionated event tier, config knobs. Independent of this PR.

artyomsv added 7 commits June 8, 2026 14:12
Closes the wedge incident class. One stuck IPC client (slow TUI render,
crashed MCP bridge, kernel socket buffer congestion) can no longer
stall the daemon's broadcast fan-out for every other client.

internal/ipc/server.go — replace the synchronous Broadcast with a
per-Conn 64-slot buffered send queue and a dedicated send goroutine
per conn. Send is non-blocking: if the queue overflows the offending
conn is scheduled for async close and ErrSendOverflow surfaces in the
broadcast log, but other conns keep receiving. Broadcast also marshals
the wire frame once and shares the bytes across the per-conn queues
instead of re-serializing per client. Close is sync.Once-guarded so
the read and write halves can race to it safely.

internal/daemon/event.go — enforce a 4 KiB cap on PaneEvent.Message
and 128-byte cap per Data value at the IPC boundary (toPaneEventPayload).
Truncation keeps the tail (most recent / terminal-visible content) and
sets Data["truncated"]="1" so consumers can flag the event. The earlier
opencode-splash wedge produced a > 1 KiB box-drawing excerpt that
flooded the broadcast loop; this is the size-side defense.

Tests:
- TestBroadcast_SlowConnDoesNotBlockFastConn — 200 broadcasts with one
  conn refusing to read; assert the Broadcast loop returns within 1 s
  and the fast conn drains.
- TestBroadcast_ContinuesAfterSlowConnDisconnects — after the slow conn
  is torn down, new broadcasts still reach the surviving conn.
- TestToPaneEventPayload_* — truncation marker placement, tail
  preservation, under-cap pass-through, nil-Data no-allocation path.

All green under -race.
Apply all Critical / High / Medium / Low items surfaced by the code
review.

Source changes:

- internal/ipc/server.go
  * (C1) Broadcast now clones the marshaled frame via slices.Clone
    before fan-out. Today no callsite mutates it, but a future
    sync.Pool of bytes.Buffer would silently corrupt frames still
    being read by per-conn sendLoops; the clone makes that contract
    bullet-proof.
  * (L4 + M3) Overflow path is CAS-guarded — the warning log fires
    exactly once per slow conn and only one Close goroutine spawns,
    instead of N redundant log lines and N redundant goroutines while
    the conn is being reaped.
  * (H1 belt-and-suspenders) sendLoop sets a 30 s SetWriteDeadline per
    frame so a kernel-buffer wedge that doesn't error has a
    deterministic cleanup ceiling.
  * (M4) log.Printf migrated to logger.Warn / logger.Error so the
    project's leveled logger controls visibility.
  * (M2) Document why Send and sendFrame both check closed/overflow —
    outer check is the marshal-cost fast path, inner check is the
    race-safe gate next to the channel send.
  * (H2) Close() doc comment clarifies pending frames are intentionally
    discarded.
  * (L6) Guard comment on the conns snapshot copy in Broadcast warns
    against future "optimizations" that would reintroduce the wedge.
  * (M6) Server.Stop comment clarifies Daemon shutdown does not rely
    on a final IPC broadcast (durability lives in the on-disk
    workspace.json path).
  * New: Server.ConnCount() for test-friendly connect/disconnect sync.

- internal/daemon/event.go
  * (H4) Reserved Data key renamed to "_quil_truncated" — the new
    _quil_ namespace prevents accidental collision with caller-supplied
    keys (including a literal "truncated" key that emitters may use).
  * (H3) Compile-time invariant block asserts both caps strictly exceed
    len(truncationMarker); a future contributor lowering either cap
    below the marker length will fail to build instead of panic at
    runtime.
  * (Rules #1) const block reflowed; gofmt-clean.
  * (L1) Marker length explained (… is 3-byte UTF-8 + "[truncated]" =
    14 bytes, not 12).
  * (L2) Truncation strategy doc comment names the tail-keep rationale
    and points future emitters at TruncationStrategy if they need head.
  * (L3) TODO comment on sync.Pool trade-off (defer until profiling
    demands it; clone-then-pool is the correct sequence).

Test changes:

- internal/ipc/conn_internal_test.go (NEW, package ipc white-box)
  * TestSendLoop_ExitsOnWriteError — uses net.Pipe to force a write
    error mid-frame; asserts no goroutine leak (QA #1 finding).
  * TestConn_CloseIdempotent — 16 concurrent Close calls, none panic,
    closed flag set exactly once.
  * TestConn_SendFrameAfterCloseShortCircuits — sendFrame and Send
    after Close return ErrSendOverflow without touching the socket.

- internal/ipc/broadcast_resilience_test.go
  * (M5) Replaced time.Sleep-after-connect with waitForConnCount
    polling on the new ConnCount() accessor. No more CI flake risk
    from racing the connect-registration window.
  * (Rules #4) Added t.Parallel() to both tests.
  * (L7) TestBroadcast_MarshalErrorLogsAndReturns covers the
    previously-untested marshal-error path.

- internal/daemon/event_sizecap_test.go
  * Updated existing 4 tests for the renamed _quil_truncated key.
  * Added t.Parallel() to all 7 tests.
  * (QA #2) TestToPaneEventPayload_ExactCapBoundary verifies the > cap
    condition rejects an off-by-one refactor to >=.
  * (L8) TestToPaneEventPayload_BothOverCapSetsFlagOnce — when both
    Message and Data value exceed cap, the flag is set exactly once.
  * NEW TestToPaneEventPayload_ReservedKeyDoesNotClobberCaller proves
    the _quil_ namespace fix: a caller's literal "truncated" key
    survives untouched alongside the daemon's _quil_truncated flag.

CLAUDE.md: `internal/ipc/` bullet expanded with the resilience design
(sendCh, sendLoop, CAS overflow, ConnCount, size caps) per
documentation-maintenance rule.

NewClient audit (cmd/quil/main.go, mcp.go, version_gate.go): all
callsites either defer Close or explicitly close. SetWriteDeadline in
sendLoop is the runtime safety net for future regressions.

All green under -race.
Make the notification sidebar genuinely informative and close gaps in
the MCP API. Notification events now carry the last few stripped output
lines that triggered them, so the sidebar can render context (4-line
cards: separator + name/time + title + excerpt preview) and MCP agents
no longer need a follow-up read_pane_output per event. Per-pane mute
(Alt+M, persisted in workspace.json) drops events at the source for
chatty processes like `npm test --watch`. Active-pane output_idle
events are suppressed TUI-side as redundant. The MCP surface gains
dismiss_notifications (ack from the agent side) and watch_notifications
since_timestamp (closes the race between kicking off a task and
registering the watcher). Defaults bumped to 200 max events. Dead
notification_handlers code path now emits a deprecation warning at
plugin load instead of failing silently; docs corrected.
Address three signal-quality issues spotted while testing the sidebar:

1. ANSI sequence fragments leaked into excerpts. The trailing 4 KiB
   slice of the ring buffer could begin mid-CSI — the `\x1b[` ended up
   in the discarded prefix but parameters like `2;30;30;30m` or `;18H`
   survived into the window and ansi.Strip rendered them as plain text.
   New `trimToNewlineSafe` helper advances the slice start to the next
   newline before stripping, applied in both `paneOutputExcerpt` and
   `analyzeIdleTitle`.

2. Output_idle events whose excerpt collapses to a single shell prompt
   rune (%, $, >, ❯, #, ➜, λ, ») are now suppressed when the default
   "Output idle" title fired. Shells idling at a prompt are not a state
   change worth notifying — the user can see the prompt by looking at
   the pane. Plugin idle handlers (Claude "Needs your approval" etc.)
   still fire because their non-default title means the regex saw
   something meaningful.

3. eventQueue.Push aggregates repeat (PaneID, Title) events: the new
   event reuses the prior event's ID, bumps Data["count"], and replaces
   it at the front of the queue. The TUI sidebar's AddEvent updates the
   matching card in place and bubbles it to position 0. The title line
   now renders a ×N badge when count > 1. Replaces N near-duplicate
   cards saying the same thing with one bumping card.
Field test surfaced two issues my first round of follow-ups missed:

1. ANSI fragments STILL leaked in claude-code panes (`2;30;30;30m|
   Thinking: ...`, `0mConfig/Build files:- ...`). Root cause: the
   newline-only seek in trimToNewlineSafe fell through on TUIs that
   emit one screen paint with no newline inside the trailing 4 KiB
   window. The bounded scan now also accepts an ESC byte (0x1b) as a
   clean boundary — ansi.Strip handles a full sequence starting at
   ESC, even if the next newline is far away. Scan is bounded to 512
   bytes so it stays cheap.

2. Output-idle events with `%` excerpts kept firing — both panes had
   their cards collapsed to `×2` even though my isPromptOnlyExcerpt
   covers the bare `%` case. The screenshot shows what the sidebar
   renders (first non-empty line), not what the daemon's idle checker
   sees (the full 5-line excerpt). Add a debug-level log line at every
   idle decision (pane id, type, title, suppress verdict, excerpt
   prefix) so we can see why the check returns false on real-world
   shells. With the dev build's ldflags-forced debug level, this
   surfaces in the next test run's quild.log.
Field diagnosis with debug logs confirmed the real shape of the bug.
The terminal-pane idle excerpts logged in quild.log look like:

  "%   [padding]   \r \r\rArtjoms_Stukans@EPCHZURW03..."

That's: prompt rune, padding spaces, CR-return-to-start, more CRs, then
OSC 0 window-title content that leaks through ansi.Strip (the strip
state machine appears to bail on an embedded CR inside the OSC payload).
The sidebar then renders just "%" because that's the first non-empty
line — but the *actual* terminal-visible content is the hostname leak,
not the prompt that was immediately overwritten by the CRs.

Two coordinated changes:

1. lastNLines now applies per-line CR-reset semantics — splitting a line
   at the last `\r` and keeping only the trailing segment. This matches
   what the terminal would actually display, so excerpts no longer
   capture text the user can never see (the pre-CR prompt) and miss
   the text they DO see (the post-CR window title).

2. isPromptOnlyExcerpt is now suffix-based and recognises hostname-like
   patterns. A line is "prompt-like" when, after trimming, it ends with
   a recognised prompt rune (covers `%`, `user@host % `, `~/repo $ `,
   etc) OR matches a user@host pattern (the OSC 0 leak signature). Long
   lines (> 200 chars) are presumed to be command output regardless of
   trailing characters — real prompts are short.

The two changes interact: after CR-reset the excerpt's visible content
is the hostname leak; the new prompt classifier recognises that pattern
and suppresses correctly. `ls` output still survives suppression because
its line endings don't match prompt patterns. Verified against the
field-logged excerpts.
The combined PR's data.excerpt field (set by withExcerpt for MCP
consumers) is typically a multi-line stripped excerpt of 200-500 bytes.
The original 128-byte Data value cap from the broadcast-hardening PR
was tuned for short metadata (exit_code, tool name) and would silently
truncate the excerpt to a 1-2 line preview.

Raise the per-Data-value cap to 1 KiB. The Message field still has the
4 KiB cap for the full excerpt, but agents reading data.excerpt now
also get the realistic full content. Per-event ceiling becomes 4 KiB +
N × 1 KiB; realistic events have 2-4 keys so total stays well under
10 KiB.

Updated TestToPaneEventPayload_DataValueOverCapTruncates to use 4 KiB
input (guaranteed to exceed any reasonable future tuning); the exact-
boundary test already pins the cap edge precisely.
artyomsv added 6 commits June 8, 2026 15:38
High (3):

- H1 isPromptLikeLine over-suppression. "build complete: 100%" used to
  classify as prompt-like because the suffix rune match accepted any
  trailing prompt char. Now the rune must EITHER be the bare line OR be
  preceded by whitespace — so "% " matches, "user@host %" matches, but
  "100%" and "x$" do not. Two new test rows pin the regression.

- H2 idle-checker cooldown leak. checkIdlePanes was setting
  LastIdleEventAt = now BEFORE the prompt-only suppression decision, so
  a suppressed event still consumed the 30 s cooldown — a single
  spurious suppression caused 30 s of silence even if real activity
  followed. On suppression we now roll back LastIdleEventAt to the zero
  value; IdleNotified stays true so the next tick doesn't re-evaluate
  the same idle state (flushPaneOutput resets it on the next PTY byte).

- H3 sidebar cursor drift on aggregation. NotificationCenter.AddEvent
  on a same-ID match moves the card to position 0; the cursor stayed
  at its old index, silently jumping the user's selection onto a
  different card. Now we snapshot the cursor's event ID BEFORE the
  move-to-front and re-locate the cursor onto the same logical event
  afterward. Fresh prepends still leave the cursor at index 0 (legacy
  "cursor 0 = newest" semantics) — only the aggregation path chases.

Medium (4):

- CLAUDE.md L86: MCP tool count was "17"; now "18" including
  dismiss_notifications. mcp_tools.go implementation count was "15";
  now "18". The MCP tool list reflects all three notification additions
  (excerpt-carrying get_notifications, since_timestamp on watch_*,
  dismiss_notifications).

- docs/configuration.md: full [notification] section added documenting
  sidebar_width and max_events. Bindable actions table gains
  mute_pane, notification_toggle, notification_focus, go_back,
  notes_toggle rows — all of which were missing.

- daemon.go idle-decision debug log was emitting up to 200 bytes of
  raw excerpt content. Per observability-and-logging.md "Never log
  user-provided content" rule, even debug must not echo PTY bytes
  (panes routinely contain API keys, passwords from REPLs, .env on
  cat). Replaced with structural metadata: excerpt_bytes +
  excerpt_lines. Sufficient to diagnose suppression decisions without
  the secret-leak risk.

- internal/tui/dialog.go shortcuts list: added Alt+M mute, Alt+N
  notification toggle, F3 sidebar focus, Alt+Backspace pane history,
  Alt+E notes — all previously undiscoverable from F1.

- plugin/registry.go [[notification_handlers]] deprecation warning was
  per-load — a MsgReloadPlugins call would re-fire the warning for
  every stale plugin every time. Now one-shot per plugin name per
  daemon lifetime via a sync.Map sentinel. Daemon restart re-enables.

Low (6):

- Replaced developer's real OS username (Artjoms_Stukans) and
  corporate-shaped hostname (EPCHZURW03) in 3 test fixtures with
  synthetic user_name@host01 — the regex behavior is identical.

- Added t.Parallel() to all 7 net-new test files in this PR (44 test
  functions). The IPC tests were already parallel; daemon and tui
  tests were the inconsistency.

Coverage gaps (3):

- TestEventQueue_Push_CorruptCountResetsToBaseline pins the
  strconv.Atoi error branch in the aggregation count parser (corrupt
  Data["count"] resets to baseline of 1 instead of panicking).

- TestIsPromptLikeLine_EmptyString_DirectCall directly invokes
  isPromptLikeLine("") (previously only reached transitively through
  isPromptOnlyExcerpt's blank-line filter).

- TestLoadPluginTOML_NotificationHandlersDeprecated drops a legacy
  TOML fixture with [[notification_handlers]] in a t.TempDir() and
  asserts the load succeeds + NotificationHandlers slice still
  populates (back-compat preserved even after the deprecation).

- TestNotificationCenter_AddEvent_CursorFollowsLogicalEventOnAggregation
  is the H3 invariant pin: a cursor on a non-aggregated event must
  follow that event's new index after move-to-front of a different
  event.

All green under go test ./... and go test -race ./...
Daemon-side plumbing for the upcoming Claude / OpenCode hook-driven
notifications. No user-facing behavior change yet — the Claude .sh /
opencode .js producers don't emit to the spool until Phase C — but the
ingest pipeline is fully testable in isolation.

New package internal/hookevents/:

  types.go — Payload wire schema (v=1, ts_ms, seq, pane_id, src,
    hook_event, session_id, transcript_path, cwd, title, sev, data).
    Severity constants (info/warning/error), source constants
    (claude/opencode), hook-side caps (200 B title, 128 B data value,
    2 KiB total). Payload.Validate enforces required fields + bounded
    enums; sentinel errors via errors.Is.

  spool.go — Spool reads $QUIL_HOME/events/<paneID>.jsonl files. Init
    truncates stale files on daemon start (no replay). Tick polls every
    file, remembers per-file byte offset, skips trailing partial lines
    (the documented append-vs-read race), drops malformed / oversize
    lines with a warn log. Cleanup unlinks a pane's spool file on
    DestroyPane.

  ingest.go — Ingester sits between Spool/IPC and emit. Per-pane
    sliding-window rate limiter (100 events / 2s; on trip emits one
    "internal.event_storm" diagnostic and drops further events from
    that pane for 10 s). Per-(paneID, hook_event) 50 ms debounce
    coalescer; last payload wins with data["coalesced"] = burst count.
    now() is overridable so tests don't depend on wall clock.

Daemon wiring (internal/daemon/daemon.go):

  - New Daemon.hookSpool + Daemon.hookIngester fields. Init in Start()
    after EnsureScripts. Watcher goroutine launched alongside
    idleChecker, polls 200 ms, same shutdown discipline (FlushAll
    drains pending coalesce buffers before exit).

  - emitHookEvent translates Payload → PaneEvent, sets
    Pane.HookHealthy + Pane.LastHookEventAt under PluginMu, routes
    through the existing emitEvent (mute filter, aggregation,
    broadcast).

  - checkIdlePanes' shouldFire condition gains
    !(pane.HookHealthy && now-LastHookEventAt < 30 s) so the legacy
    idle excerpt steps aside for healthy hook panes. The 30 s grace
    period lets legacy idle reactivate as a fallback during long quiet
    turns — and panes whose hook plugin fails to load never trip
    HookHealthy at all, so they keep the legacy notification surface.

  - Both DestroyPane code paths (handleDestroyPane and
    handleDestroyPaneReq) call Spool.Cleanup before
    session.DestroyPane so the watcher's next tick can't race with the
    destroy.

internal/daemon/session.go — Pane gains LastHookEventAt time.Time +
HookHealthy bool, guarded by the existing PluginMu.

internal/config/config.go — new EventsDir() helper returning
$QUIL_HOME/events; mirrors ClaudeHookDir / SessionsDir.

Tests (15 new in hookevents/):

  - Payload validation: happy path + 6 error rows + empty severity
    tolerance.
  - Spool: appended lines read, offset survives, partial trailing line
    skipped and consumed after completion, malformed lines dropped,
    Init truncates, Cleanup unlinks, non-jsonl files ignored, oversize
    line dropped.
  - Ingester: burst coalesce to 1 with last-wins + count metadata,
    different events / panes stay distinct, FlushAll drains, rate limit
    trips at 100/2s with storm diagnostic emitted, limiter recovers
    after penalty (uses overridable clock).

All green under go test ./... and -race.
Claude (internal/claudehook/):

  - BuildSettingsJSON now registers Quil's hook command under 12 events
    (SessionStart + SessionEnd + UserPromptSubmit + Notification +
    PermissionRequest + Stop + PreCompact + PostCompact + SubagentStart +
    SubagentStop + TaskCreated + TaskCompleted) instead of SessionStart
    alone. The script branches on hook_event_name from stdin.

  - scripts/quil-session-hook.sh and .ps1 rewritten as multi-event
    routers. SessionStart keeps the original session-id-file behaviour
    (used by Quil's restore path); the other 11 events compose a v1
    hookevents.Payload and append one JSONL line to
    $QUIL_HOME/events/<paneID>.jsonl.

    Title/severity mapping per the plan:
      SessionEnd        → "Session ended"            info
      UserPromptSubmit  → "Working on: <preview>"    info
      Notification      → pass-through               warning
      PermissionRequest → "Needs approval: <tool>"   warning
      Stop              → "Reply ready"              warning
      PreCompact        → "Compacting context"       info
      PostCompact       → "Compaction complete"      info
      SubagentStart     → "Spawned: <agent_type>"    info
      SubagentStop      → "<agent_type> done"        info
      TaskCreated       → "Task: <content>"          info
      TaskCompleted     → "✓ <content>"              info

    Wire caps enforced at the hook side (title ≤ 200 chars, data values
    ≤ 128 chars, total ≤ 2 KiB) so the daemon's outer cap is purely a
    backstop. Title and previews are truncated with "…" suffix.

  - New TestBuildSettingsJSON_RegistersAllForwardedEvents pins the
    12-event registration so a future refactor that drops an entry from
    forwardedHookEvents fails fast.

OpenCode (internal/opencodehook/scripts/quil-session-tracker.js):

  - Plugin keeps the original session-id rotation tracking (recording
    to opencode-<paneID>.id for the resume path).

  - Generic `event` handler additionally forwards a filtered tier to
    the spool:
      session.idle           → "Reply ready"               warning
      session.error          → "Session error: <type>"     error
      session.compacted      → "Compaction complete"       info
      session.status (retry) → "Retrying API (attempt N)"  warning
      file.edited            → "Edited N files"            info
                              (batched 1s window, single line per burst)

  - Typed permission.ask hook → "Needs approval: <tool>" warning.

  - Typed experimental.session.compacting → "Compacting context" info.

  - Per-pane in-process token bucket (20 events/s sustained, 50 burst).
    Drops with a single warn-log when exhausted; recovers as the bucket
    refills. Independent of the daemon-side limiter (which lives in the
    Ingester) so two layers of defense protect the IPC fan-out.

  - Wire caps enforced before JSON.stringify: title ≤ 200 bytes UTF-8,
    each data value ≤ 128 bytes UTF-8, total line ≤ 2 KiB; oversize
    lines are dropped with a hook.log breadcrumb.

  - Per-pane seq counter so the daemon's coalescer has a tiebreaker
    when two events land in the same millisecond.

No daemon changes — the Phase B infrastructure consumes whatever the
hooks produce.

All green under go test ./...
User-facing surface for the hook events tier per the plan's Phase D
section. No new code paths in the daemon — just plumbing the config
value to the hook scripts so users can opt into verbose mode or turn
hook events off entirely without editing the embedded scripts.

internal/config/config.go:

  - New HookNotificationsConfig{Claude, OpenCode string} nested under
    NotificationConfig.Hooks (TOML key `notification.hooks`). Default
    values "default" set in Default() so a fresh install gets the v1
    tier from Phase C automatically.

internal/daemon/daemon.go:

  - claudeHookSpawnPrep and opencodeSpawnPrep gain a hookMode parameter
    threaded from d.cfg.Notification.Hooks.Claude /
    d.cfg.Notification.Hooks.OpenCode. Both pass QUIL_HOOK_MODE=<mode>
    in the PTY env (alongside the existing QUIL_PANE_ID / QUIL_HOME).
    Empty mode resolves to "default" so a user with no [notification]
    block still gets the v1 tier.

  - Tests updated: TestClaudeHookSpawnPrep / TestOpencodeSpawnPrep now
    assert the QUIL_HOOK_MODE env var is present and equal to
    "default" (the tests' fixed input mode).

docs/configuration.md:

  - Full [notification.hooks] subsection documenting `claude` and
    `opencode` knobs with the three tier values (`default` / `verbose`
    / `off`) and what each forwards. Hook events JSONL spool path
    mentioned so users can verify the daemon is consuming them.

  - Default config block adds the [notification.hooks] entries.

.claude/CLAUDE.md:

  - New `internal/hookevents/` bullet explaining the wire schema,
    Spool / Ingester / rate limiter / coalescer responsibilities, the
    HookHealthy fallback, and the JSONL spool lifecycle.

  - `internal/claudehook/` and `internal/opencodehook/` bullets updated
    to mention the multi-event extension, the QUIL_HOOK_MODE env, and
    the per-pane rate budget on the OpenCode side.

Phases B + C + D complete. The full hook-driven notification surface
is in place — Claude PermissionRequest and Stop, OpenCode permission.ask
and session.idle, et al — and runs through the daemon's coalesce + rate
limit + aggregation pipeline before the IPC fan-out.

All green under go test ./... and -race.
All Critical / High / Medium / Low items from the final review, plus
the test-coverage gaps QA flagged. Every fix has a one-liner in the
review notes for traceability.

## Critical (1)

C1 Ingester pending timer leak past pane destroy
  internal/hookevents/ingest.go — new Ingester.Cancel(paneID) stops
  every AfterFunc timer keyed by paneID and removes the per-pane rate
  bookkeeping. Wired into both handleDestroyPane / handleDestroyPaneReq
  paths in daemon.go so a destroyed pane cannot fire one final stale
  event ~50 ms later via the coalescer's natural window close.

## High (4)

H1 Path traversal via Spool.Cleanup(paneID)
  daemon.go — both destroy IPC handlers now validate payload.PaneID
  with isValidHexID before any FS-touching call. Defense-in-depth:
  spool.go safePaneID rejects /, \, .., NUL etc and Cleanup uses
  filepath.Clean + HasPrefix to ensure the resolved path stays under
  the spool dir. New tests TestSafePaneID and
  TestSpool_Cleanup_RejectsTraversalPaneID cover the guard.

H1 (CR) FlushAll shutdown race
  ingest.go — Ingester.closed flag set by FlushAll; subsequent Submit
  is a no-op. FlushAll now Stops every pending timer before draining
  so AfterFunc cannot fire after FlushAll returns. New tests
  TestIngester_Submit_AfterFlushAll_IsNoOp and
  TestIngester_FlushAll_StopsPendingTimers pin the invariants.

H2 QUIL_HOOK_MODE was a dead knob
  All three hook script producers (sh, ps1, js) now read $QUIL_HOOK_MODE
  and short-circuit the spool emission when mode == "off". Daemon-side
  backstop in emitHookEvent drops events when the resolved
  d.cfg.Notification.Hooks.<Source> is "off" — covers the case where
  the script-side gate didn't fire (older script on disk, env stripped).
  Storm diagnostics always pass.

H3 First hook event being storm flips HookHealthy=true
  daemon.go — emitHookEvent only sets pane.HookHealthy / LastHookEventAt
  when p.HookEvent != hookevents.EventStorm. The rate limiter's own
  self-emitted storm payload would otherwise mark a pane "hook-healthy"
  exactly when real events have stopped flowing, silencing the legacy
  idle excerpt fallback during the 10 s penalty window.

## Medium (10)

Sec M1 Cross-pane spoofing via payload.PaneID
  spool.go parseAndValidate now drops payloads whose pane_id field does
  not match the filename-derived paneID. A compromised plugin in pane A
  can no longer write events attributed to pane B.

Sec M2 Unbounded spool growth
  spool.go new rotation: when readPaneFile observes size == offset
  (fully drained) AND size >= 16 MiB, truncate the file and reset the
  offset. Rotation runs only on idle ticks so it never collides with
  in-flight reads.

CR M2 io.ReadAll OOM on multi-MB single line
  spool.go switches from io.ReadAll + bytes.Split to bufio.Reader.
  ReadBytes('\n') with per-line size enforcement. A 100 MB single line
  from a misbehaving producer no longer allocates into memory; it's
  detected at MaxTotalBytes and dropped after advancing past it.

CR M4 PowerShell JsonEscape backslash + C0 control bytes
  quil-session-hook.ps1 — backslash replacement now produces exactly
  two chars (the prior pattern produced four backslashes per input
  backslash, breaking every Windows path). Added a [regex]::Replace
  pass that converts [\x00-\x1f] to \uXXXX so ESC and other ANSI
  control bytes from Claude tool output don't break the JSON line.

CR M5 OpenCode JS clock-jump-backward
  quil-session-tracker.js — consumeToken clamps elapsed to ≥ 0 and
  resets the refill anchor on backward jumps. Without this, an NTP
  correction would freeze the rate limiter until the clock caught up.

CR M6 OpenCode JS UTF-8 mid-codepoint truncation
  quil-session-tracker.js — truncate walks backward over UTF-8
  continuation bytes (0x80-0xBF) so the cut lands on a codepoint
  boundary. The trailing "…" replaces clean truncated content instead
  of polluting it with U+FFFD.

CR M7 Unbounded timestamp on hook payload
  daemon.go emitHookEvent — clamps p.TsMs to within ±1 hour of the
  daemon's clock; events with skewed timestamps fall back to time.Now.

Rules #2 + #3 + #5 Observability + error wrapping
  spool.go — parsePayloads warns are sampled (1 in N per pane) to
  prevent log flooding from a misbehaving producer. Unmarshal errors
  log only the byte size, never err.Error() which could fragment user
  prompt content into quild.log. Spool.Init returns wrapped errors
  with package context.

Rules #4 docs/features.md updates
  features.md Notification center section now describes hook-driven
  events, the tier knob (default / verbose / off), the flow diagram,
  and the Alt+M mute keybinding.

## Low + polish

L1 forwardedHookEvents duplicate detection
  claudehook_test.go — TestForwardedHookEvents_NoDuplicates catches a
  silently-deduped slice entry at build time.

L2 FlushAll doc comment
  ingest.go — doc comment now correctly notes it's called from
  hookEventsWatcher, not Daemon.Stop.

L3 emitHookEvent Data map double-allocation
  daemon.go — single allocation with len(p.Data)+2 capacity.

L5 Dead awk truncate() function
  quil-session-hook.sh removed; replaced inline truncation guidance.
  json_escape simplified to a portable tr+sed pipeline (no GNU sed
  `:a;N;$!ba` slurp pattern; works on busybox / macOS sed).

Sec L2 PowerShell C0 control bytes
  Same fix as CR M4 above.

EventStorm constant
  ingest.go exports the literal "internal.event_storm" string so both
  producer (stormPayload) and consumer (emitHookEvent) reference the
  same constant.

## Test additions (7 new test functions)

- TestSafePaneID (table-driven, 9 cases)
- TestSpool_Cleanup_RejectsTraversalPaneID
- TestIngester_Cancel_DropsPendingForPane
- TestIngester_Cancel_DoesNotAffectOtherPanes
- TestIngester_Cancel_Idempotent
- TestIngester_Submit_AfterFlushAll_IsNoOp
- TestIngester_FlushAll_StopsPendingTimers
- TestForwardedHookEvents_NoDuplicates

All green under go test ./... and go test -race ./...
Three log lines so the watcher's state is visible without modifying
production behavior:

  - INFO at watcher start with the spool dir, so a misconfigured
    QUIL_HOME / EventsDir mismatch is obvious from line one of the
    log.
  - DEBUG per non-empty Tick with the payload count, so a watcher
    that runs but never finds events is distinguishable from one
    that runs and finds them but the ingester drops them.
  - DEBUG per successful emit with pane/src/hook_event/title, so a
    full-pipeline trace is one grep away.

Investigating a reported case where Claude hook events landed in
the spool ($QUIL_HOME/events/<paneID>.jsonl) but Pane.HookHealthy
never flipped to true. The new daemon will surface where the
pipeline stalls — directory mismatch, parse drop, or downstream
emit issue.

No behavior change for non-hook code paths.
@artyomsv artyomsv merged commit d04b003 into master Jun 8, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant