Merged
114 changes: 114 additions & 0 deletions docs/usage-stats.md
@@ -0,0 +1,114 @@
# Usage statistics

jcode records token usage across every surface (TUI, web, ACP) and exposes two
views in the web UI:

- **Global stats** — a "Usage" tab in Settings: tokens used, sessions, turns,
active days, current streak, most-used model, an activity heatmap, a daily
token trend, and per-model / per-project breakdowns.
- **Per-task context capacity** — a popover on the composer's token count: how
the current context window is split across Messages / System tools / MCP tools
/ Skills / System prompt, plus the KV cache hit rate.

## Data model

### Token tracking (`internal/model`)

`model.TokenUsage` accumulates per-call usage. Each call is recorded via
`Add(AddParams{...})`, capturing:

| field | source (go-openai `Usage`) |
|---------------|--------------------------------------------------|
| Prompt | `PromptTokens` |
| Completion | `CompletionTokens` |
| Total | `TotalTokens` |
| Cached | `PromptTokensDetails.CachedTokens` (cache-read) |
| Reasoning | `CompletionTokensDetails.ReasoningTokens` |
| CacheWrite | always 0 — see below |

All providers go through one go-openai client. go-openai's `Usage` exposes
**cache-read** (`cached_tokens`) and **reasoning** tokens, but **not**
`cache_creation_input_tokens`. So `CacheWriteTokens` is reserved for a future
native transport and stays 0 today.
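
The mapping in the table can be sketched as follows. This is a minimal illustration, not the project's code: `usage`, `promptDetails`, `completionDetails`, `addParams`, and `fromUsage` are local stand-ins invented here so the example compiles without the go-openai dependency, and modeling the detail blocks as optional pointers is an assumption.

```go
package main

import "fmt"

// Local stand-ins for the go-openai usage types. Field names follow the
// table above; treating the detail blocks as optional pointers is an
// assumption of this sketch.
type promptDetails struct{ CachedTokens int }
type completionDetails struct{ ReasoningTokens int }

type usage struct {
	PromptTokens            int
	CompletionTokens        int
	TotalTokens             int
	PromptTokensDetails     *promptDetails
	CompletionTokensDetails *completionDetails
}

// addParams is a hypothetical shape for what Add(AddParams{...}) receives.
type addParams struct {
	Prompt, Completion, Total, Cached, Reasoning, CacheWrite int64
}

// fromUsage maps one call's usage onto the tracked fields, guarding the
// optional detail blocks so providers that omit them record zeros.
func fromUsage(u usage) addParams {
	p := addParams{
		Prompt:     int64(u.PromptTokens),
		Completion: int64(u.CompletionTokens),
		Total:      int64(u.TotalTokens),
	}
	if u.PromptTokensDetails != nil {
		p.Cached = int64(u.PromptTokensDetails.CachedTokens)
	}
	if u.CompletionTokensDetails != nil {
		p.Reasoning = int64(u.CompletionTokensDetails.ReasoningTokens)
	}
	// CacheWrite is left at 0: the shared transport does not report it.
	return p
}

func main() {
	u := usage{
		PromptTokens:        1500,
		CompletionTokens:    300,
		TotalTokens:         1800,
		PromptTokensDetails: &promptDetails{CachedTokens: 1300},
	}
	fmt.Printf("%+v\n", fromUsage(u))
}
```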

### Cache hit rate

```
cache hit rate = Σ cached / Σ prompt (clamped to [0,1])
```

i.e. the fraction of prompt tokens served from the provider's KV cache. This is
the only provider-portable definition given the wire constraint above.
`CacheObserved()` (any cached tokens seen) drives a "—" placeholder so 0% is not
confused with "this provider doesn't report caching".
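
A minimal sketch of the formula and the placeholder rule, assuming a small value type with only the two counters the ratio needs. `cacheStats` and its methods are hypothetical names for illustration; only the formula and the `CacheObserved()` idea come from the text above.

```go
package main

import "fmt"

// cacheStats is a hypothetical stand-in for the cumulative counters; only
// the two fields the ratio needs are modeled.
type cacheStats struct {
	Prompt int64 // Σ prompt tokens
	Cached int64 // Σ cached (cache-read) tokens
}

// HitRate returns Σ cached / Σ prompt, clamped to [0, 1].
func (s cacheStats) HitRate() float64 {
	if s.Prompt <= 0 {
		return 0
	}
	r := float64(s.Cached) / float64(s.Prompt)
	if r < 0 {
		return 0
	}
	if r > 1 {
		return 1
	}
	return r
}

// Observed reports whether any cached tokens were ever seen.
func (s cacheStats) Observed() bool { return s.Cached > 0 }

// Label renders the rate for display, using a dash placeholder when the
// provider never reported caching so that it is not shown as 0%.
func (s cacheStats) Label() string {
	if !s.Observed() {
		return "\u2014"
	}
	return fmt.Sprintf("%.0f%%", s.HitRate()*100)
}

func main() {
	fmt.Println(cacheStats{Prompt: 1500, Cached: 1300}.Label())
	fmt.Println(cacheStats{Prompt: 1500}.Label())
}
```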

### Event log (`internal/usage`)

Global stats are persisted to an **append-only JSON-lines log** at
`~/.jcode/usage/events.jsonl`, one line per agent turn:

```json
{"ts":1750531200,"date":"2026-06-21","project":"/path","session":"<uuid>","model":"glm-5.2","prompt":1500,"completion":300,"cached":1300,"reasoning":60,"total":1800,"calls":2}
```

Writes to a file opened with `O_APPEND` always land at the current end of file,
and on a local filesystem a small record written in one call is not interleaved
with other writers. Multiple jcode processes (TUI + web + ACP) can therefore
record concurrently without a read-modify-write race. All derived metrics
(streak, active days, heatmap, per-model/project, cache rate) are computed at
read time by `usage.Aggregate`.

Token fields are per-turn **deltas**: the runner snapshots the cumulative
tracker at the start of a turn and records the difference at the end. Subagent
and teammate tokens are rolled into the same log under the **leader** session's
UUID so multi-agent work isn't undercounted.
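
The snapshot-and-subtract step can be illustrated with a small value type. `counters` and `sub` are hypothetical names; the sketch only shows the arithmetic, not how the runner wires it in.

```go
package main

import "fmt"

// counters is a hypothetical copy of the cumulative tracker's fields.
type counters struct {
	Prompt, Completion, Cached, Reasoning, Calls int64
}

// sub returns the per-turn delta: the current cumulative values minus the
// snapshot taken when the turn started.
func (c counters) sub(start counters) counters {
	return counters{
		Prompt:     c.Prompt - start.Prompt,
		Completion: c.Completion - start.Completion,
		Cached:     c.Cached - start.Cached,
		Reasoning:  c.Reasoning - start.Reasoning,
		Calls:      c.Calls - start.Calls,
	}
}

func main() {
	tracker := counters{Prompt: 4000, Completion: 900, Cached: 3000, Reasoning: 100, Calls: 5}

	start := tracker // value copy taken at the start of the turn

	// Two model calls happen during the turn and bump the cumulative tracker.
	tracker.Prompt += 1500
	tracker.Completion += 300
	tracker.Cached += 1300
	tracker.Reasoning += 60
	tracker.Calls += 2

	fmt.Printf("%+v\n", tracker.sub(start))
}
```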

The session **count** is sourced from the session index
(`session.ListAllSessions`), which is authoritative; the event log owns
token/day metrics.

## API

| endpoint | returns |
|------------------------------|---------------------------------------------------|
| `GET /api/usage/stats?days=N`| global totals, streaks, heatmap (365d), trend (Nd), by-model, by-project |
| `GET /api/tasks/{id}/stats` | per-task context breakdown (active) or token rollup (historical) |
| `GET /api/status` | live token snapshot (extended with cache fields) |

The `token_update` WebSocket event carries the same per-turn token fields +
cache hit rate to the browser.

## Per-task context breakdown

The five buckets are estimated at **~4 bytes/token** (`usage.Estimate`) — there
is no bundled tokenizer, and a relative breakdown only needs a consistent
heuristic (the UI labels it "estimated"):

1. **System prompt** = estimate(systemPrompt) − estimate(skill descriptions)
2. **System tools** = Σ estimate(tool JSON) over built-in tools
3. **MCP tools** = Σ estimate(tool JSON) over MCP tools
4. **Skills** = estimate(skill descriptions)
5. **Messages** = max(0, lastPromptTokens − buckets 1-4)

The four static buckets are computed on demand from the live agent assembly
(`command/web.go`'s `breakdownFn`), which reads the captured `systemPrompt` /
`mcpTools` / `currentCM` / `skillLoader` by reference — so project switches and
MCP reloads are reflected with no cache to invalidate. The breakdown is only
meaningful for the **active** task; historical tasks return token totals + the
aggregate hit rate only (`is_active:false`).

## Known limitations / future work

- **No `cache_creation` accounting** — blocked by the shared go-openai transport.
A native Anthropic transport could populate `CacheWriteTokens`.
- **Cost is not yet derived** — `registry.go`'s `ModelCost`
(Input/Output/CacheRead/CacheWrite) is not multiplied into the stats. A future
pass could price each event for a spend view.
- **Per-turn delta across process restart** — a turn that resumes in a new
process loses the in-memory start snapshot and may mis-count once.

## Testing

Per the sandbox constraints (live servers can't bind sockets), the backend is
covered by in-process `httptest` (`internal/web/usage_test.go`) and unit tests
for aggregation/streaks (`internal/usage/usage_test.go`) and the token struct
(`internal/model/token_usage_test.go`). The frontend is verified via
`vue-tsc` + `vite build`.
52 changes: 52 additions & 0 deletions internal/command/web.go
@@ -2,6 +2,7 @@ package command

import (
	"context"
	"encoding/json"
	"fmt"
	"os/signal"
	"path/filepath"
@@ -29,10 +30,29 @@ import (
	"github.com/cnjack/jcode/internal/skills"
	"github.com/cnjack/jcode/internal/telemetry"
	"github.com/cnjack/jcode/internal/tools"
	"github.com/cnjack/jcode/internal/usage"
	util "github.com/cnjack/jcode/internal/util"
	"github.com/cnjack/jcode/internal/web"
)

// estimateToolTokens approximates a tool's contribution to the context window
// from its serialized schema (name + description + parameters). ToolInfo's
// MarshalJSON includes the JSON-schema params, so one marshal captures it all.
func estimateToolTokens(ctx context.Context, t tool.BaseTool) int {
	if t == nil {
		return 0
	}
	info, err := t.Info(ctx)
	if err != nil || info == nil {
		return 0
	}
	raw, err := json.Marshal(info)
	if err != nil {
		return usage.EstimateBytes(len(info.Name) + len(info.Desc))
	}
	return usage.EstimateBytes(len(raw))
}

func NewWebCmd() *cobra.Command {
	var port int
	var host string
@@ -421,6 +441,37 @@ func runWebServer(port int, host string, openBrowser bool) error {
		return newAg, newRec, nil
	}

	// breakdownFn estimates how the live agent's context window is partitioned
	// across system prompt / built-in tools / MCP tools / skills. It reads the
	// captured assembly variables (systemPrompt, mcpTools, currentCM, skillLoader)
	// by reference, so project switches and MCP reloads are reflected without any
	// cache to invalidate. Built-in tools = all tools minus MCP tools.
	breakdownFn := func() usage.ContextBreakdown {
		var b usage.ContextBreakdown
		skillDesc := skillLoader.Descriptions()
		b.SkillsTokens = usage.Estimate(skillDesc)
		// Skills are injected into the system prompt, so subtract to avoid
		// double-counting them in the system-prompt bucket.
		b.SystemPromptTokens = usage.Estimate(systemPrompt) - b.SkillsTokens
		if b.SystemPromptTokens < 0 {
			b.SystemPromptTokens = 0
		}
		for _, mt := range mcpTools {
			b.MCPToolsTokens += estimateToolTokens(ctx, mt)
		}
		if currentCM != nil {
			total := 0
			for _, at := range buildAllTools(currentCM) {
				total += estimateToolTokens(ctx, at)
			}
			b.SystemToolsTokens = total - b.MCPToolsTokens
			if b.SystemToolsTokens < 0 {
				b.SystemToolsTokens = 0
			}
		}
Comment on lines +449 to +471

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Context breakdown currently misreports in plan mode.

breakdownFn always estimates from systemPrompt + buildAllTools(currentCM), but the active agent in plan mode uses planPrompt + buildPlanTools(). This overstates static context usage and can skew messages_tokens/capacity UI while in plan mode.

Suggested fix
 breakdownFn := func() usage.ContextBreakdown {
   var b usage.ContextBreakdown
+  prompt := systemPrompt
+  var toolsForMode []tool.BaseTool
+  if currentCM != nil {
+    if currentPlanMode {
+      prompt = planPrompt
+      toolsForMode = buildPlanTools()
+    } else {
+      toolsForMode = buildAllTools(currentCM)
+    }
+  }

   skillDesc := skillLoader.Descriptions()
-  b.SkillsTokens = usage.Estimate(skillDesc)
+  if !currentPlanMode {
+    b.SkillsTokens = usage.Estimate(skillDesc)
+  }

-  b.SystemPromptTokens = usage.Estimate(systemPrompt) - b.SkillsTokens
+  b.SystemPromptTokens = usage.Estimate(prompt) - b.SkillsTokens
   if b.SystemPromptTokens < 0 { b.SystemPromptTokens = 0 }

-  for _, mt := range mcpTools {
-    b.MCPToolsTokens += estimateToolTokens(ctx, mt)
-  }
-  if currentCM != nil {
+  if !currentPlanMode {
+    for _, mt := range mcpTools {
+      b.MCPToolsTokens += estimateToolTokens(ctx, mt)
+    }
+  }
+  if len(toolsForMode) > 0 {
     total := 0
-    for _, at := range buildAllTools(currentCM) {
+    for _, at := range toolsForMode {
       total += estimateToolTokens(ctx, at)
     }
     b.SystemToolsTokens = total - b.MCPToolsTokens
     if b.SystemToolsTokens < 0 { b.SystemToolsTokens = 0 }
   }
   return b
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/command/web.go` around lines 449 - 471, The breakdownFn function
always estimates context tokens using systemPrompt and buildAllTools(currentCM),
but it should dynamically select between systemPrompt with
buildAllTools(currentCM) for normal mode versus planPrompt with buildPlanTools()
for plan mode. Update the breakdownFn function to check if plan mode is active
and use the appropriate prompt and tools building function to accurately reflect
the actual context being used by the agent in the current mode. This will ensure
b.SystemPromptTokens and b.SystemToolsTokens are correctly calculated regardless
of whether plan mode is enabled.

		return b
	}
Comment on lines +444 to +473

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

breakdownFn reads shared mutable state without synchronization.

This closure reads systemPrompt, mcpTools, currentCM, and mode-dependent tool composition while other request paths mutate those values (project switch, MCP reload, agent rebuild). In the web server’s concurrent request model, this is a data-race risk and can return torn/unstable breakdowns.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/command/web.go` around lines 444 - 473, The breakdownFn closure
reads shared mutable state (systemPrompt, mcpTools, currentCM, and skillLoader)
without synchronization, creating a data race risk in the concurrent web server
environment. Protect access to these shared variables by acquiring an
appropriate synchronization lock (such as a mutex) before reading them within
the breakdownFn function, and release the lock after gathering all necessary
data for the breakdown calculation. Ensure that the entire set of reads happens
atomically while holding the lock to prevent torn or inconsistent state from
being observed during concurrent project switches or MCP reloads.


	srv := web.NewServer(&web.ServerConfig{
		Port: port,
		Host: host,
@@ -450,6 +501,7 @@ func runWebServer(port int, host string, openBrowser bool) error {
		EventHandler:       finalHandler,
		NeedsSetup:         needsSetup,
		TokenUsage:         agentTokenUsage,
		ContextBreakdownFn: breakdownFn,
	})

	// Set handler for approval routing.
19 changes: 19 additions & 0 deletions internal/config/config.go
@@ -449,3 +449,22 @@ func SessionsIndexPath() (string, error) {
	}
	return filepath.Join(dir, "session.json"), nil
}

// UsageDir returns the path to the usage-statistics directory (~/.jcode/usage).
func UsageDir() (string, error) {
	home, err := os.UserHomeDir()
	if err != nil {
		return "", fmt.Errorf("failed to get home directory: %w", err)
	}
Comment on lines +455 to +458

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use the project-standard wrapped error prefix.

Line [457] uses a free-form wrapped error string; use the repo convention fmt.Errorf("tool_name: %w", err) for non-tool code.

Suggested change
-		return "", fmt.Errorf("failed to get home directory: %w", err)
+		return "", fmt.Errorf("usage_dir: %w", err)

As per coding guidelines, **/*.go: Use fmt.Errorf("tool_name: %w", err) for wrapped errors in non-tool code.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/config/config.go` around lines 455 - 458, The fmt.Errorf call in the
error handling block after os.UserHomeDir() is not following the project
convention for wrapped errors. Replace the current free-form error message
"failed to get home directory: %w" with the standard format "tool_name: %w"
(where tool_name is the appropriate identifier for this module), maintaining the
wrapped error pattern with %w and the captured err variable.

Source: Coding guidelines

	return filepath.Join(home, configDir, "usage"), nil
}

// UsageEventsPath returns the path to the append-only usage event log
// (~/.jcode/usage/events.jsonl), one JSON line per recorded agent turn.
func UsageEventsPath() (string, error) {
	dir, err := UsageDir()
	if err != nil {
		return "", err
	}
	return filepath.Join(dir, "events.jsonl"), nil
}
21 changes: 20 additions & 1 deletion internal/handler/handler.go
@@ -52,9 +52,28 @@ type AgentEventHandler interface {
 	RequestApproval(ctx context.Context, req ApprovalRequest) (ApprovalResponse, error)
 }
 
-// TokenUsage carries token usage info.
+// TokenUsage carries token usage info to the UI surfaces.
+//
+// TotalTokens is the LAST call's total — i.e. current context-window
+// occupancy, used to drive the context-usage bar. The remaining token counters
+// (Prompt/Completion/Cached/Reasoning/CacheWrite/CallCount) are CUMULATIVE for
+// the run's tracker, and CacheHitRate is the cumulative cached/prompt ratio.
+// CacheSupported is false when the provider never reported any cached tokens,
+// so the UI can show "—" instead of a misleading 0%.
+//
+// NOTE: the field order/types here must stay identical to WebTokenData
+// (internal/handler/web.go) so OnTokenUpdate's direct struct conversion keeps
+// compiling.
 type TokenUsage struct {
 	TotalTokens       int64
+	PromptTokens      int64
+	CompletionTokens  int64
+	CachedTokens      int64
+	ReasoningTokens   int64
+	CacheWriteTokens  int64
+	CallCount         int64
+	CacheHitRate      float64
+	CacheSupported    bool
 	ModelContextLimit int // 0 if unknown
 }
 
17 changes: 14 additions & 3 deletions internal/handler/web.go
@@ -224,10 +224,21 @@ type WebToolResultData struct {
 	ToolCallID string `json:"tool_call_id,omitempty"`
 }
 
-// WebTokenData carries token usage.
+// WebTokenData carries token usage to the browser. Field order/types MUST match
+// handler.TokenUsage so OnTokenUpdate's WebTokenData(info) conversion compiles.
+// total_tokens is current context occupancy (last call); the rest are
+// cumulative for the session.
 type WebTokenData struct {
-	TotalTokens       int64 `json:"total_tokens"`
-	ModelContextLimit int   `json:"model_context_limit"`
+	TotalTokens       int64   `json:"total_tokens"`
+	PromptTokens      int64   `json:"prompt_tokens"`
+	CompletionTokens  int64   `json:"completion_tokens"`
+	CachedTokens      int64   `json:"cached_tokens"`
+	ReasoningTokens   int64   `json:"reasoning_tokens"`
+	CacheWriteTokens  int64   `json:"cache_write_tokens"`
+	CallCount         int64   `json:"call_count"`
+	CacheHitRate      float64 `json:"cache_hit_rate"`
+	CacheSupported    bool    `json:"cache_supported"`
+	ModelContextLimit int     `json:"model_context_limit"`
 }
 
 // WebSubagentData carries subagent lifecycle events.