feat(web): parallel tasks + sidebar filter/sort/group, branch-switch safety, provider icons#96
Conversation
…curacy Concurrency (web task-architecture Phase 1+2): turn the web server from a single active session into N independent top-level tasks running in parallel. - Per-task Engine (internal/web/engine.go): the run-state singletons (agent, history, recorder, tokenUsage, approvalState, env, handler, running, runCancel) move into Engine; Server keeps transport + shared resources + tasks map + a bootstrap engine. - Per-task factory (command/web.go buildWebTask): every task gets a fresh tools.NewEnv, BackgroundManager, Recorder, TokenUsage, PlanStore, ApprovalState, WebHandler, per-task NotifyingHandler, and per-task transcript/reduction paths — no shared mutable execution state. - Route chat/stop/approval/ask/stats/pending/mode/model by task_id; remove the global single-run gate for a per-task running flag (tasks run concurrently). - Per-engine event pump stamps WSEvent.TaskID; WS clients subscribe per task and the broker filters delivery; pong is unicast. - Engine lifecycle: registerEngine/teardown/CloseAllEngines, idle-throwaway reclaim, live-engine cap, per-task PTY bucketing, run-cancel before recorder close. - Frontend echoes task_id on approval/ask resolve so background-task decisions route to the right engine. Locking discipline: s.mu guards only the active-engine pointer + server maps; each engine's mutable run state is guarded by its own emu; the session index read-modify-write writers are serialized by a package mutex; mcpTools is swapped via atomic.Pointer. Verified race-clean (go test -race) and deadlock-free (layered order s.mu > tasksMu > emu > recorder lock, no cycle). Token-usage accuracy (the review that started this work; foundational): - Split cumulative ledger vs context occupancy (ResetContext): compaction no longer wipes the ledger (C1) and triggers on last-call occupancy, not the cumulative prompt sum (C4). - Persist the per-turn delta on cancel/error via defer (C2). - Per-turn budget delta + cache-read discount pricing (C5/S3). - Cache-support detection via details-presence, not value>0 (C6). - Usage attribution follows a mid-session model switch (S1). - UI clarifies that cached is part of input and reasoning is part of output (S4). Developed with multi-agent adversarial review: 22 confirmed findings fixed, plus 3 self-introduced regressions caught and reverted in a verification pass. New concurrency + token-usage tests included. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Warning Review limit reached
More reviews will be available in 50 minutes and 56 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (27)
📝 WalkthroughWalkthroughThe PR refactors the web server from a single shared agent to concurrent per-task ChangesToken Tracking, Cache Pricing & Usage Recording
Multi-task Engine Refactor
Sequence Diagram(s)sequenceDiagram
participant Client as Browser Client
participant WSBroker
participant Server as Web Server
participant Engine
participant Runner
rect rgba(70, 130, 180, 0.5)
Note over Client,Engine: Task creation and registration
Client->>Server: POST /api/sessions (session_id, pwd)
Server->>Engine: buildLocalEngine(taskID, pwd, mode)
Engine->>Engine: newEngine, registerEngine
Engine->>Engine: startPump goroutine started
Server->>Engine: setActiveEngine(eng)
Server-->>Client: {task_id, session_id}
end
rect rgba(60, 179, 113, 0.5)
Note over Client,Runner: Per-task chat execution with token tracking
Client->>Server: POST /api/chat (message, session_id)
Server->>Engine: resolveEngine(session_id)
Engine->>Engine: running.CAS(false→true)
Engine->>Runner: runner.Run(eng.tokenUsage, eng.recorder, ...)
Runner->>Runner: startSnap, BeginTurn, defer recordUsageTurn
Runner->>Runner: costLocked(sessionPrompt, sessionCompletion, sessionCached)
Runner-->>Engine: agent_text events→eventHandler channel
Engine->>Engine: startPump stamps TaskID, broadcasts
Engine->>WSBroker: Broadcast(WSEvent{TaskID, Type, Data})
WSBroker->>Client: deliver if client.wants(TaskID)
end
rect rgba(255, 165, 0, 0.5)
Note over Client,Server: Task-scoped approval resolution
Client->>Server: POST /api/approval (id, approved, task_id)
Server->>Engine: resolveEngine(task_id)
Engine->>Engine: handler.ResolveApproval(id, approved)
Server-->>Client: 200 OK
end
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 12
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
internal/web/git.go (1)
18-29:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUse one workspace snapshot for both branch-list subprocesses.
Line 18 and Line 28 call
s.activePwd()independently. If the active workspace flips between those calls,branchesandcurrentcan be computed from different repositories.Suggested fix
func (s *Server) handleGitBranches(w http.ResponseWriter, r *http.Request) { + pwd := s.activePwd() listCmd := exec.CommandContext(r.Context(), "git", "for-each-ref", "--format=%(refname:short)", "--sort=-committerdate", "refs/heads") - listCmd.Dir = s.activePwd() + listCmd.Dir = pwd out, err := listCmd.Output() @@ curCmd := exec.CommandContext(r.Context(), "git", "branch", "--show-current") - curCmd.Dir = s.activePwd() + curCmd.Dir = pwd curOut, _ := curCmd.Output()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/web/git.go` around lines 18 - 29, The code calls s.activePwd() twice independently for the listCmd and curCmd, which could result in the two git subprocesses querying different repositories if the active workspace changes between the two calls. Store the result of s.activePwd() in a single variable before setting the Dir field for both the listCmd and curCmd command objects to ensure both commands operate on the same workspace snapshot.internal/web/server.go (1)
1669-1694:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUse one engine snapshot for session diffs.
handleSessionDiffreadssessionSnapshotfrom the active engine, then later callsactivePwd()again. If the foreground task changes mid-request, the diff can use task A’s snapshot in task B’s workspace.🐛 Proposed fix
func (s *Server) handleSessionDiff(w http.ResponseWriter, _ *http.Request) { snapshot := "" - if eng := s.activeEngine(); eng != nil { - eng.emu.Lock() - snapshot = eng.sessionSnapshot - eng.emu.Unlock() + eng := s.activeEngine() + if eng != nil { + eng.emu.Lock() + snapshot = eng.sessionSnapshot + pwd := eng.pwd + eng.emu.Unlock() + // Diff from snapshot to current working tree + cmd := exec.CommandContext(s.ctx, "git", "diff", snapshot, "--no-color") + cmd.Dir = pwd + output, _ := cmd.CombinedOutput() + // continue parsing output as below }The key change is to capture both
snapshotandpwdfrom the sameengpointer and use thatpwdfor the command.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/web/server.go` around lines 1669 - 1694, The handleSessionDiff function reads sessionSnapshot from the active engine while holding the lock, but then calls s.activePwd() later outside that lock context. If the foreground task changes between these calls, the snapshot could be from task A but the pwd from task B, causing a mismatch. Capture the pwd from the same eng pointer within the lock block (between eng.emu.Lock() and eng.emu.Unlock()), store it in a variable, and then use that captured pwd value for cmd.Dir instead of calling s.activePwd() again after the lock is released.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/agent/budget.go`:
- Around line 169-184: The budget manager's session-cumulative totalCost is
being overwritten with per-turn token deltas. The per-turn usage from
TurnUsage() is correct for the per-turn cap check, but totalCost should reflect
cumulative session costs, not just the current turn. Instead of calculating
m.manager.totalCost using only the per-turn promptTokens, completionTokens, and
cachedTokens deltas, retrieve the cumulative session token usage (likely via a
Get() method on m.tokenUsage) and use those cumulative values when calling
m.manager.costLocked() to properly account for session-level budget. Keep the
per-turn TurnUsage() values separate for per-turn cap enforcement if needed.
In `@internal/command/web.go`:
- Around line 405-415: The per-task model cache (currentCM and currentCtxLimit)
is being updated before the makeAgent call completes. If makeAgent fails and
returns an error, the cache will still contain the new model/context limit,
causing incorrect reuse on subsequent mode rebuilds. Move the entire cache
update block that locks cmMu and updates currentCM, currentCtxLimit, and plan to
execute only after makeAgent succeeds, ensuring the cache is only modified when
agent construction is successful.
In `@internal/web/git.go`:
- Around line 65-66: The code has a Time-Of-Check-Time-Of-Use (TOCTOU) race
condition where the running state is checked on the active engine via
cur.running.Load() on line 65, but then the repository directory is resolved
again using s.activePwd() on lines 93 and 105. If the active engine switches
between the check and the actual git checkout, the guard will have validated one
engine while the operation executes in a different engine's repository. Fix this
by either using the directory path from the already-validated cur engine
instance instead of re-resolving via s.activePwd(), or by ensuring the active
engine cannot change between the running state check and the git checkout
execution to maintain consistency between the checked and executed engine.
In `@internal/web/remote.go`:
- Around line 188-192: The code calls s.activeEngine() separately at multiple
points (guard check and later teardown), which can target different engine
instances under concurrent workspace switches. Additionally, PTY teardown
happens before buildRemoteEngine completes, so a failed bind operation disrupts
the current task without actually switching workspaces. Capture a single
snapshot of the active engine at the start of the bind flow, use that same
snapshot for both the guard condition and later teardown operations, and defer
the PTY teardown until after buildRemoteEngine successfully completes and
returns a valid engine, ensuring failed binds do not disrupt the current running
task.
In `@internal/web/server.go`:
- Around line 1391-1401: The path traversal check in the handleListFiles
function uses strings.HasPrefix which has security vulnerabilities, allowing
sibling directory escapes and permitting all absolute paths if the active pwd is
empty. Store the result of s.activePwd() in a variable once at the start of the
function, then replace the strings.HasPrefix check with filepath.Rel to validate
that the absolute path is actually contained within the workspace directory. The
same pattern should also be applied to the other file endpoint mentioned at
lines 1441-1449.
- Line 71: The cfgMu mutex was introduced to serialize config saves but is not
being used consistently across all read-modify-write operations on s.cfg. Skill
toggling currently mutates and saves s.cfg under s.mu protection instead of
cfgMu, creating race conditions where concurrent approval-mode and skill updates
can overwrite each other. Identify all read-modify-write paths on s.cfg
(particularly skill toggling logic in the ranges around lines 2324-2331 and
2559-2581) and replace the s.mu locking with cfgMu locking for these operations
to ensure all config mutations are properly serialized.
- Around line 1278-1283: The fallback to the active handler should only occur
when the task_id is absent (empty string), not when it is non-empty but
unresolved. Currently, the code unconditionally falls back to activeHandler()
when resolveEngine() returns nil, which allows unknown task IDs to incorrectly
access the active task's handler. Add a guard condition that checks if
req.TaskID is empty before calling s.activeHandler() as a fallback. When
req.TaskID is non-empty but reng is nil, do not fall back to the active handler.
Apply this same guard pattern to all similar routing blocks in the approval/ask
pending endpoints and the WebSocket approval path (the locations also mentioned
in the comment).
- Around line 1174-1182: The eng.applyModeSwitch call at the end is executed
unconditionally even after rebuildForMode fails, which can leave the system in
an inconsistent state where the mode is switched but the agent rebuild failed.
Move the eng.applyModeSwitch call inside the if block where err == nil so that
the mode is only switched when the agent rebuild succeeds. Only apply the mode
switch when newAg is successfully populated by the rebuildForMode call, not when
it fails.
- Around line 951-958: The handleNewSession method ignores errors returned by
json.Decoder.Decode() by assigning to the blank identifier, allowing invalid
JSON payloads to proceed with zero-valued fields instead of being rejected.
Check the error returned from the Decode call on the json.NewDecoder, and if an
error occurs, write an appropriate HTTP error response (such as a 400 Bad
Request status) and return early, ensuring only valid JSON requests proceed to
create or focus a session.
- Around line 988-991: The error return value from the LoadSession function call
is being silently ignored with the underscore blank identifier, so when a stale
or nonexistent session ID is provided, the code proceeds without reporting the
failure. Instead of ignoring the error with underscore in the LoadSession call
within the session resume block, capture the error and check if it is non-nil,
then return an appropriate error response to the client before attempting to
call ReconstructState. This ensures that invalid or missing sessions are
properly reported rather than silently creating a new engine with empty history.
- Around line 2823-2840: After successfully creating the agent and applying the
model switch in the handler around the eng.applyModelSwitch() call, the code
only sets s.needsSetup to false but does not update the server configuration
state s.cfg with the new provider and model values from req.Provider and
req.Model. Update s.cfg within the mutex lock section to persist the Provider
and Model configuration so that endpoints like /api/models can reflect the newly
configured provider without requiring a server restart.
- Around line 851-868: In the activeEngine branch (the else block following the
eng != s.activeEngine() check), there is a race condition where the recorder is
being closed and cleared immediately after cancellation without waiting for the
run to finish draining. Before closing and clearing the recorder in the else
block where eng.recorder is checked and closed, add a mechanism to wait for any
ongoing drain operation to complete on the engine. This will prevent the runner
from attempting to write data while the recorder and history are being cleared,
eliminating the race between draining writes and the close/unlink operations.
---
Outside diff comments:
In `@internal/web/git.go`:
- Around line 18-29: The code calls s.activePwd() twice independently for the
listCmd and curCmd, which could result in the two git subprocesses querying
different repositories if the active workspace changes between the two calls.
Store the result of s.activePwd() in a single variable before setting the Dir
field for both the listCmd and curCmd command objects to ensure both commands
operate on the same workspace snapshot.
In `@internal/web/server.go`:
- Around line 1669-1694: The handleSessionDiff function reads sessionSnapshot
from the active engine while holding the lock, but then calls s.activePwd()
later outside that lock context. If the foreground task changes between these
calls, the snapshot could be from task A but the pwd from task B, causing a
mismatch. Capture the pwd from the same eng pointer within the lock block
(between eng.emu.Lock() and eng.emu.Unlock()), store it in a variable, and then
use that captured pwd value for cmd.Dir instead of calling s.activePwd() again
after the lock is released.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f470bf72-a47b-4bea-9d78-12da83216c7a
📒 Files selected for processing (37)
internal/agent/budget.gointernal/agent/budget_test.gointernal/agent/compaction.gointernal/command/interactive.gointernal/command/web.gointernal/model/chatmodel.gointernal/model/registry.gointernal/model/token_usage_test.gointernal/runner/runner.gointernal/session/index_test.gointernal/session/session.gointernal/usage/event.gointernal/usage/stats.gointernal/usage/usage_test.gointernal/web/concurrency_test.gointernal/web/engine.gointernal/web/git.gointernal/web/goal_test.gointernal/web/mode_test.gointernal/web/pty.gointernal/web/remote.gointernal/web/server.gointernal/web/tasks_test.gointernal/web/usage.gointernal/web/usage_test.gointernal/web/ws.goweb/src/App.vueweb/src/components/UsageStatsPanel.vueweb/src/composables/api.tsweb/src/composables/ws.tsweb/src/i18n/locales/en.tsweb/src/i18n/locales/ja.tsweb/src/i18n/locales/ko.tsweb/src/i18n/locales/zh-Hans.tsweb/src/i18n/locales/zh-Hant.tsweb/src/stores/chat.tsweb/src/types/api.ts
CI lint (golangci-lint): - check registerEngine's error in the concurrency test (errcheck) - startPump: ctx-first parameter (revive) - nolint the deprecated adk.AgentMiddleware (langfuse/agent.NewAgent still use it) CodeRabbit findings: - budget: per-turn delta drives the token cap, but session-cumulative cost drives max_cost_per_session again (was overwritten with per-turn cost). - path guard: replace HasPrefix with filepath.Rel so /repo2 can't escape /repo and an empty workspace rejects everything (Critical). - approval/ask resolve + pending: a non-empty UNKNOWN task_id no longer falls back to the active task's handler-local ids; only an empty id maps to active. - mode / approval-mode switch: rebuild the agent first and abort on failure, so a plan-mode label can't sit over a still-write-capable agent; drop the over-zealous running-gate that silently dropped mid-run toggles. - delete active session: wait for the cancelled run to drain before closing the recorder (avoid a post-close truncating write). - handleNewSession: reject malformed bodies and stale resume ids instead of registering a phantom empty engine. - setup-complete: publish the saved cfg + rebuilt registry to live state so /api/models reflects the new provider without a restart. - cfgMu now guards skill-toggle's config save too. - project/remote switch: build the new engine before tearing down the old task's PTYs; snapshot the outgoing task once. - per-task createAgent caches the model only after the agent builds. Build, vet, full test, web -race, and frontend build all green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…atting The parallel-task refactor removed the global single-run gate for handleChat but left running-gates on the switch handlers, so switching to another task while one was chatting silently failed: the sidebar's cross-project openTask calls openProject -> handleSwitchProject, which returned 409 mid-run, and the frontend swallowed it (`if (!ok) return`). That blocked the core point of concurrent tasks. - handleSwitchProject / handleRemoteBind: drop the running gate — they build a NEW independent engine and leave the previous task running in the background. - handleSwitchModel: drop the running gate — applyModelSwitch swaps the agent under eng.emu and takes effect next turn, like the (already mid-run-capable) mode/approval switches. - Frontend: ws.ts drops events tagged with a different (backgrounded) task id via an activeTaskId() filter, so a task that keeps running after you switch away no longer pollutes the foreground view. Global (untagged) events and the first-message window (active id still unknown) pass through unchanged. handleGitCheckout / handleTruncateHistory keep their gate (they mutate the foreground running task's working tree / history). Build, vet, full test, and web -race all green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Three issues when switching tasks while one is chatting: 1. Cross-talk into a NEW chat. newSession()/resetToWelcomeAfterSwitch() blanked currentSessionId to '', so the ws activeTaskId filter had no active id and passed everything — a still-running previous task's thinking/output leaked into the fresh chat. Bind currentSessionId to the new/active task instead (newSession uses the engine id the backend returns; the switch path keeps the id fetchHealth set), and reset isRunning on new-chat/switch/resume so the composer doesn't show a stale "running". 2. No "running" marker on sessions. Backend now broadcasts a global task_status event on run start/stop and exposes a live `running` (+ `updated_at`) per task in GET /api/tasks (cross-referenced from live engines). Frontend tracks it (project store setTaskRunning + fetchAllTasks on status change) and the sidebar shows a pulsing dot + "running" label. 3. Sidebar time never changed. It showed created_at; now it shows updated_at || created_at, the backend bumps SessionMeta.UpdatedAt+Status each run, and the list re-sorts by recency so active tasks bubble up. Build, vet, full test, web -race, and frontend type-check/build all green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…icons Sidebar / tasks: - Filter·sort·group menu (status / project / last-activity / group-by / sort-by), persisted to localStorage. The sidebar list is now a filter→group→sort pipeline rendering either the project tree or flat date buckets (Today/Yesterday/…). - Ordering: workspaces and tasks surface running → unread → recency; running tasks pin to the top of their workspace. Running indicator is a breathing accent ring. - Per-workspace "+" starts a new task in that folder (switch + fresh welcome); deleting a folder's last conversation removes the now-empty folder. - Interrupt fixes: opening a backgrounded running task shows Stop; task_status keeps the composer's Stop/Send in sync with the viewed task; Stop targets the viewed task explicitly. The open conversation is never hidden by a filter. Branch / workspace / providers: - git checkout dirty-tree handling: a switch blocked by uncommitted work is reported non-destructively (blocked + at-risk files) so the UI can offer stash / discard; adds stash/force strategies (git.go, useBranch, BranchPicker). - POST /api/project/validate prunes workspaces whose path no longer exists. - Provider brand icons (ProviderIcon + utils/providerIcons, @lobehub icon set) replace the colored-initial tiles in the model picker. i18n: en, zh-Hans, zh-Hant, ja, ko. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
oxlint's unicorn/no-useless-spread flagged the `[...paths]` snapshot used to delete-while-iterating. Build the path Set directly instead — same behavior, no spread. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The SSH config step's saved-alias dropdown used Listbox/ListboxButton/ ListboxOptions/ListboxOption and ChevronUpDownIcon without importing them, so when any SSH alias existed (aliases.length > 0) the config form failed to render (blank step 2). strictTemplates is off, so type-check/build didn't catch the unresolved components. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Bundles the parallel-tasks work with a sidebar overhaul, safer branch switching, and provider brand icons.
Parallel tasks & usage (earlier commits)
Sidebar / tasks
task_statuskeeps the composer's Stop/Send in sync with the viewed task; Stop targets the viewed task explicitly. The open conversation is never hidden by a filter.Branch / workspace / providers
blocked+ at-risk files) so the UI can offer stash / discard; addsstash/forcestrategies (git.go,useBranch,BranchPicker).POST /api/project/validateprunes workspaces whose path no longer exists from the picker.ProviderIcon+utils/providerIcons,@lobehub/icons-static-svg) replace the colored-initial tiles in the model picker.i18n updated for en, zh-Hans, zh-Hant, ja, ko.
Verified:
vue-tsctype-check, eslint,vite build,go build,go vet, andgo test ./internal/web ./internal/sessionall pass.🤖 Generated with Claude Code