🤖 feat: coalesce consecutive file_read/file_edit tool calls in transcript#3379
🤖 feat: coalesce consecutive file_read/file_edit tool calls in transcript#3379ammar-agent wants to merge 9 commits into
Conversation
…ript Introduce a rule-based coalescing pass that hides bursts of consecutive file_read or file_edit_* tool calls behind a single 'Read files a, b' or 'Wrote files a, b' summary row. The summary replaces the head call's slot so the 1 -> N transition is a content swap rather than a row insertion, eliminating layout flash when the second member streams in. Clicking the summary expands the group and shows each individual tool call exactly as before. Layered on top of the existing bash_output coalescing pattern: - computeToolCoalesceInfos returns aligned head/member metadata once per message snapshot (O(n)). - CoalescedToolCall is a presentational summary row sized to match a collapsed FileReadToolCall so the swap is dimensionally stable. - ChatPane owns the expansion-state Set, keyed by the head's message id, and resets it on workspace switch (mirrors expandedBashGroups). Why only file_read and file_edit for now: those are the two highest- volume read/write tools, and they already share extractToolFilePath as a path source. Future kinds can plug in by extending ToolCoalesceKind.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2f5d6b30f6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Bursts often touch the same file several times (e.g. multi-hunk edits). Showing 'a.ts, a.ts, b.ts' adds noise without information, so dedupe in the renderer while preserving first-occurrence order. The underlying filePaths array (and totalCount) still reflect every coalesced tool call, so future surfaces that need the raw history still have it.
|
@codex review Pushed a follow-up that dedupes repeated file paths in the coalesced summary so a burst that touches |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 201bf0e254
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Addresses Codex P1: hiding a partial tool row eats the InterruptedBarrier the transcript renders for interruption signals. When the stream stops mid-burst we want every member of the burst to keep its uncoalesced row so the user still sees "interrupted" exactly where it happened. Implemented as a single pass over each candidate group's members: if any has isPartial=true, the whole group falls back to normal rendering.
|
@codex review Addressed the P1 by refusing to coalesce a group that contains an interrupted ( |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c50c781d3c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…t partial Addresses Codex P2: pending/executing/failed/interrupted/redacted rows all carry status or error UI the summary row deliberately omits. Hiding any such row in a burst would mask the actionable signal until the user manually expands. Generalises the previous partial-only guard to a single 'isCoalesceableToolMember' check: a member must be type=tool, !isPartial, and status=completed. Anything else flips the whole group back to normal rendering.
|
@codex review Addressed the P2 by generalising the per-member check: a group only coalesces when every member is |
|
@codex review The previous review was against the partial-only commit. The current head ( |
|
Codex Review: Didn't find any major issues. Nice work! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
The previous version supplied a custom decorators array on the story meta, which shadowed lightweightMeta's StoryUiShell decorator and therefore stripped TooltipProvider. Storybook's runner picked that up as 'Tooltip must be used within TooltipProvider' for every story. Render via an in-story StoryLayout wrapper instead so the meta-level decorator stack stays intact. Opt out of Chromatic snapshots for these visual-reference stories to keep the global snapshot budget happy.
…y interactions The previous stories rendered the component but had no play functions, so test-storybook treated them as smoke-render-only. Convert each story into a real interaction test that exercises the new pattern's surfaces: - TwoReadsClickToExpand drives the click toggle through both branches (collapsed -> expanded -> collapsed) and asserts aria-expanded. - ManyReads asserts chronological-order joined paths. - TwoEdits asserts the past-tense 'Wrote files' verb for the file_edit kind. - DeduplicatedPaths asserts the display-only dedupe of repeated file names (5 raw calls -> 3 unique paths in first-occurrence order). - InitiallyExpanded asserts aria-expanded reflects the initial prop. Stories run under test-storybook in CI (the dedicated 'Test / Storybook' job in pr.yml), which exercises play functions in a real browser via the @storybook/test-runner. Chromatic visual snapshots stay disabled to respect the global snapshot budget.
… rebalance) Reviewers asked for at least one Chromatic visual snapshot of the new coalesced-tool-call row so the new UX has a pinned visual baseline, not just play-driven behavior coverage. Changes: - Add `CoalescedToolCall.visual.stories.tsx` containing one Chromatic-enabled story (`CollapsedReadFiles`) that renders the canonical new behavior: the collapsed "Read files src/App.tsx, src/main.ts" row. The sibling `CoalescedToolCall.stories.tsx` keeps its meta-level `chromatic: CHROMATIC_DISABLED` and continues to own the interaction (play) coverage for click-to-expand, dedupe, the `file_edit` verb, and the initial-expanded state. - Opt `ProvidersSection.stories.tsx` out of Chromatic snapshots at the meta level. All four stories in that file already drive `play` functions that assert the relevant settings-panel behavior (empty state, configured providers, env-sourced indicators, expanded provider config), and the layout has no animation/scroll-fade/state nuances worth a pixel baseline. This frees four snapshots from the global Chromatic budget. - Net delta on `tests/ui/storybook/budget.test.ts`: +1 (new visual story) -4 (ProvidersSection) = -3, taking the estimated snapshot count from 302 (pre-existing breach on main) to 299, comfortably under the 300 cap. Validation: - `bun test ./tests/ui/storybook/budget.test.ts` -> both budget assertions pass (files ≤ 70, snapshots ≤ 300). - `make typecheck`, `make lint`, `make fmt-check`, `make storybook-build` all clean. - Targeted unit + storybook tests: `CoalescedToolCall.test.tsx`, `toolCoalescing.test.ts`, and `tests/ui/storybook/` all pass. --- _Generated with `mux` • Model: `anthropic:claude-opus-4-7` • Thinking: `xhigh` • Cost: `$29.08`_ <!-- mux-attribution: model=anthropic:claude-opus-4-7 thinking=xhigh costs=29.08 -->
|
Added a single Chromatic baseline for the new coalesced-tool-call row in To keep the global snapshot budget ( Net delta: +1 (new visual story) − 4 (ProvidersSection) = −3 ⇒ estimated count goes from 302 (pre-existing main overage) to 299, comfortably under the 300 cap. @codex review |
|
Codex Review: Didn't find any major issues. 🎉 ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Summary
Add automatic coalescing for consecutive
file_readandfile_edit_*tool calls in the transcript. The first call in a run renders normally; when a second arrives, the slot converts in-place to a Coalesce row that summarizes the burst (e.g. Read files src/App.tsx, src/main.ts). Clicking the row expands the group and reveals each underlying tool call exactly as before.Background
Long agent turns frequently emit bursts of file reads/edits that push more useful content (assistant prose, terminal output) off-screen. The existing
bash_outputcollapsing pattern already does this for repeated polls of the same process; this PR extends the same idea to file IO bursts behind a rule-based registry so additional kinds (web fetches, advisor, …) can opt in later.Implementation
src/browser/utils/messages/toolCoalescing.ts—computeToolCoalesceInfos(messages)walks the transcript in one O(n) pass and returns per-index{ kind, position: "head" | "member", filePaths, headIndex, totalCount }. Groups of size ≥ 2 with the sameToolCoalesceKind(file_readorfile_edit) qualify.src/browser/features/Tools/CoalescedToolCall.tsx— presentational summary row that mirrors a collapsedFileReadToolCall's dimensions (sameToolContainerpadding, same row height). That keeps the 1 → N transition a content swap rather than a row insertion, so users do not see a layout flash when the second member streams in mid-turn.src/browser/components/ChatPane/ChatPane.tsx— wired alongside the existingbashOutputGroupInfosflow:toolCoalesceInfosonce per snapshot,MessageRendererwithCoalescedToolCallwhen collapsed,expandedToolCoalesceGroupsSet<headId>and resets it on workspace switch (mirroringexpandedBashGroups).The summary uses past-tense copy matching the request (
Read files …/Wrote files …). Allfile_edit_*variants (replace_string,replace_lines,insert) collapse under a singlefile_editkind so a mixed burst still forms one block; the active registry only routes two of those today but the legacy variant is included for historical transcripts.Validation
bun test src/browser/utils/messages/toolCoalescing.test.ts— 10 cases covering head/member layout, mixed file_edit variants, kind boundaries, intervening tool/user messages, multi-group transcripts, and unknown path fallback.bun test src/browser/features/Tools/CoalescedToolCall.test.tsx— 4 happy-dom cases for collapsed/expanded copy,onTogglewiring, andaria-expandedstate.bun test src/browser/features/Tools src/browser/features/Messages src/browser/utils/messages— 612 pass, 0 fail.make static-check— clean.TwoReads,ManyReads,TwoEdits,ExpandedReads) underApp/Chat/Tools/CoalescedToolCallfor visual review.Risks
Render-path change in
ChatPane.tsxmap block — the same hot path that handlesbash_outputcollapsing. Mitigations:findTranscriptMessageElementselects on[data-message-id]. The summary row deliberately omits that attribute so scroll-into-view targeting still resolves to real messages (head MessageRenderer when expanded; the user/assistant rows in all other cases).[data-testid="chat-message"](truncation, fork, task report relocation) all use bash/task/assistant fixtures, not file_read/file_edit, so they are unaffected.Generated with
mux• Model:anthropic:claude-opus-4-7• Thinking:xhigh