feat(usage): token usage statistics — global page, per-task context capacity, live cache hit rate #95
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,114 @@ | ||
| # Usage statistics | ||
|
|
||
| jcode records token usage across every surface (TUI, web, ACP) and exposes two | ||
| views in the web UI: | ||
|
|
||
| - **Global stats** — a "Usage" tab in Settings: tokens used, sessions, turns, | ||
| active days, current streak, most-used model, an activity heatmap, a daily | ||
| token trend, and per-model / per-project breakdowns. | ||
| - **Per-task context capacity** — a popover on the composer's token count: how | ||
| the current context window is split across Messages / System tools / MCP tools | ||
| / Skills / System prompt, plus the KV cache hit rate. | ||
|
|
||
| ## Data model | ||
|
|
||
| ### Token tracking (`internal/model`) | ||
|
|
||
| `model.TokenUsage` accumulates per-call usage. Each call is recorded via | ||
| `Add(AddParams{...})`, capturing: | ||
|
|
||
| | field | source (go-openai `Usage`) | | ||
| |---------------|--------------------------------------------------| | ||
| | Prompt | `PromptTokens` | | ||
| | Completion | `CompletionTokens` | | ||
| | Total | `TotalTokens` | | ||
| | Cached | `PromptTokensDetails.CachedTokens` (cache-read) | | ||
| | Reasoning | `CompletionTokensDetails.ReasoningTokens` | | ||
| | CacheWrite | always 0 — see below | | ||
|
|
||
| All providers go through one go-openai client. go-openai's `Usage` exposes | ||
| **cache-read** (`cached_tokens`) and **reasoning** tokens, but **not** | ||
| `cache_creation_input_tokens`. So `CacheWriteTokens` is reserved for a future | ||
| native transport and stays 0 today. | ||
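A minimal sketch of the accumulator described above, assuming a mutex-guarded struct. Only the names `TokenUsage`, `Add`, and `AddParams` come from the document; the field layout and the `Snapshot` method are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// AddParams carries one call's usage. The fields follow the table above;
// the exact struct layout in jcode is an assumption.
type AddParams struct {
	Prompt, Completion, Total, Cached, Reasoning, CacheWrite int64
}

// TokenUsage is a hypothetical accumulator. The mutex keeps concurrent
// Add calls from racing on the running sums.
type TokenUsage struct {
	mu  sync.Mutex
	sum AddParams
}

// Add folds one call's usage into the running totals.
func (u *TokenUsage) Add(p AddParams) {
	u.mu.Lock()
	defer u.mu.Unlock()
	u.sum.Prompt += p.Prompt
	u.sum.Completion += p.Completion
	u.sum.Total += p.Total
	u.sum.Cached += p.Cached
	u.sum.Reasoning += p.Reasoning
	u.sum.CacheWrite += p.CacheWrite // stays 0 while the transport omits it
}

// Snapshot returns a copy of the cumulative totals (hypothetical helper).
func (u *TokenUsage) Snapshot() AddParams {
	u.mu.Lock()
	defer u.mu.Unlock()
	return u.sum
}

func main() {
	var u TokenUsage
	u.Add(AddParams{Prompt: 1000, Completion: 200, Total: 1200, Cached: 800})
	u.Add(AddParams{Prompt: 500, Completion: 100, Total: 600, Cached: 500, Reasoning: 60})
	fmt.Printf("%+v\n", u.Snapshot())
}
```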
|
|
||
| ### Cache hit rate | ||
|
|
||
| ``` | ||
| cache hit rate = Σ cached / Σ prompt (clamped to [0,1]) | ||
| ``` | ||
|
|
||
| i.e. the fraction of prompt tokens served from the provider's KV cache. This is | ||
| the only provider-portable definition given the wire constraint above. | ||
| `CacheObserved()` (any cached tokens seen) drives a "—" placeholder so 0% is not | ||
| confused with "this provider doesn't report caching". | ||
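The formula and the placeholder rule can be sketched as follows. `CacheStats`, `HitRate`, and `Label` are illustrative names rather than the project's API, and returning 0 when no prompt tokens have been recorded is an assumption.

```go
package main

import "fmt"

// CacheStats holds summed counters (hypothetical type).
type CacheStats struct {
	Prompt int64 // sum of prompt tokens
	Cached int64 // sum of cache-read tokens
}

// Observed reports whether any cached tokens were ever seen.
func (s CacheStats) Observed() bool { return s.Cached > 0 }

// HitRate returns sum(cached) / sum(prompt), clamped to [0,1].
// With no prompt tokens it returns 0 (assumed behaviour).
func (s CacheStats) HitRate() float64 {
	if s.Prompt <= 0 {
		return 0
	}
	r := float64(s.Cached) / float64(s.Prompt)
	if r < 0 {
		return 0
	}
	if r > 1 {
		return 1
	}
	return r
}

// Label renders the rate as a percentage, or the dash placeholder when
// caching was never observed, so 0% and "not reported" stay distinct.
func (s CacheStats) Label() string {
	if !s.Observed() {
		return "\u2014"
	}
	return fmt.Sprintf("%.0f%%", s.HitRate()*100)
}

func main() {
	fmt.Println(CacheStats{Prompt: 1500, Cached: 1300}.Label())
	fmt.Println(CacheStats{Prompt: 1500}.Label())
}
```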
|
|
||
| ### Event log (`internal/usage`) | ||
|
|
||
| Global stats are persisted to an **append-only JSON-lines log** at | ||
| `~/.jcode/usage/events.jsonl`, one line per agent turn: | ||
|
|
||
| ```json | ||
| {"ts":1782067200,"date":"2026-06-21","project":"/path","session":"<uuid>","model":"glm-5.2","prompt":1500,"completion":300,"cached":1300,"reasoning":60,"total":1800,"calls":2} | ||
| ``` | ||
|
|
||
| Small records appended with `O_APPEND` land intact: on local POSIX filesystems | ||
| the seek-to-end and the write happen as one step, so multiple jcode processes | ||
| (TUI + web + ACP) can record concurrently without a read-modify-write race. | ||
| Network filesystems such as NFS do not give the same guarantee. All derived | ||
| metrics (streak, active days, heatmap, per-model/project, cache rate) are | ||
| computed at read time by `usage.Aggregate`. | ||
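A sketch of the append path described above, assuming each record is marshalled first and then written with one `Write` on an `O_APPEND` handle. `Event`, `AppendEvent`, and `ReadEvents` are hypothetical names; only the JSON keys come from the sample record.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// Event mirrors the JSON line shown above; the Go field names are assumed.
type Event struct {
	TS         int64  `json:"ts"`
	Date       string `json:"date"`
	Project    string `json:"project"`
	Session    string `json:"session"`
	Model      string `json:"model"`
	Prompt     int64  `json:"prompt"`
	Completion int64  `json:"completion"`
	Cached     int64  `json:"cached"`
	Reasoning  int64  `json:"reasoning"`
	Total      int64  `json:"total"`
	Calls      int    `json:"calls"`
}

// AppendEvent writes one JSON line using a single Write on an O_APPEND handle.
func AppendEvent(path string, ev Event) error {
	line, err := json.Marshal(ev)
	if err != nil {
		return fmt.Errorf("marshal event: %w", err)
	}
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return fmt.Errorf("create dir: %w", err)
	}
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return fmt.Errorf("open log: %w", err)
	}
	defer f.Close()
	// Record and newline go out together so this process never splits a line.
	if _, err := f.Write(append(line, '\n')); err != nil {
		return fmt.Errorf("append event: %w", err)
	}
	return nil
}

// ReadEvents loads every parseable line; a missing file yields no events.
func ReadEvents(path string) ([]Event, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, os.ErrNotExist) {
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	var out []Event
	for _, line := range bytes.Split(data, []byte("\n")) {
		if len(bytes.TrimSpace(line)) == 0 {
			continue
		}
		var ev Event
		if err := json.Unmarshal(line, &ev); err != nil {
			continue // skip a corrupt line instead of failing the whole read
		}
		out = append(out, ev)
	}
	return out, nil
}

func main() {
	dir, err := os.MkdirTemp("", "usage-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "usage", "events.jsonl")
	for i := 0; i < 2; i++ {
		if err := AppendEvent(path, Event{Model: "glm-5.2", Prompt: 1500, Cached: 1300, Calls: 2}); err != nil {
			panic(err)
		}
	}
	evs, _ := ReadEvents(path)
	fmt.Println(len(evs), "events")
}
```

Because the log is only ever appended to, a reader can tolerate a damaged line by skipping it, which is the choice this sketch makes.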
|
|
||
| Token fields are per-turn **deltas**: the runner snapshots the cumulative | ||
| tracker at the start of a turn and records the difference at the end. Subagent | ||
| and teammate tokens are rolled into the same log under the **leader** session's | ||
| UUID so multi-agent work isn't undercounted. | ||
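The snapshot-and-subtract step can be sketched as below. `Totals` and `Delta` are hypothetical names; the numbers are chosen so the result matches the sample event line.

```go
package main

import "fmt"

// Totals is a cumulative counter snapshot (hypothetical type); the field
// set follows the event record.
type Totals struct {
	Prompt, Completion, Cached, Reasoning, Total int64
	Calls                                        int
}

// Delta returns end minus start, field by field: the usage of one turn.
func Delta(start, end Totals) Totals {
	return Totals{
		Prompt:     end.Prompt - start.Prompt,
		Completion: end.Completion - start.Completion,
		Cached:     end.Cached - start.Cached,
		Reasoning:  end.Reasoning - start.Reasoning,
		Total:      end.Total - start.Total,
		Calls:      end.Calls - start.Calls,
	}
}

func main() {
	// Snapshot taken when the turn starts.
	start := Totals{Prompt: 4000, Completion: 900, Cached: 3000, Reasoning: 100, Total: 4900, Calls: 5}
	// Cumulative tracker when the turn ends.
	end := Totals{Prompt: 5500, Completion: 1200, Cached: 4300, Reasoning: 160, Total: 6700, Calls: 7}
	fmt.Printf("%+v\n", Delta(start, end))
}
```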
|
|
||
| The session **count** is sourced from the session index | ||
| (`session.ListAllSessions`), which is authoritative; the event log owns | ||
| token/day metrics. | ||
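As an illustration of read-time aggregation, here is a sketch that derives active days and the current streak from the `date` field of the events. The function names are hypothetical, and the streak definition (consecutive active days ending today) is an assumption; `usage.Aggregate` may define it differently.

```go
package main

import (
	"fmt"
	"time"
)

const dayLayout = "2006-01-02"

// ActiveDays returns the set of distinct dates that have at least one event.
func ActiveDays(dates []string) map[string]bool {
	set := make(map[string]bool, len(dates))
	for _, d := range dates {
		set[d] = true
	}
	return set
}

// CurrentStreak counts consecutive active days ending at today.
// Assumed definition: a day with no events today yields a streak of 0.
func CurrentStreak(active map[string]bool, today string) (int, error) {
	day, err := time.Parse(dayLayout, today)
	if err != nil {
		return 0, fmt.Errorf("parse today: %w", err)
	}
	n := 0
	for active[day.Format(dayLayout)] {
		n++
		day = day.AddDate(0, 0, -1) // step back one calendar day
	}
	return n, nil
}

func main() {
	dates := []string{"2026-06-17", "2026-06-19", "2026-06-20", "2026-06-21", "2026-06-21"}
	active := ActiveDays(dates)
	streak, err := CurrentStreak(active, "2026-06-21")
	if err != nil {
		panic(err)
	}
	fmt.Println("active days:", len(active), "streak:", streak)
}
```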
|
|
||
| ## API | ||
|
|
||
| | endpoint | returns | | ||
| |------------------------------|---------------------------------------------------| | ||
| `GET /api/usage/stats?days=N` | global totals, streaks, heatmap (365d), trend (Nd), by-model, by-project | | ||
| | `GET /api/tasks/{id}/stats` | per-task context breakdown (active) or token rollup (historical) | | ||
| | `GET /api/status` | live token snapshot (extended with cache fields) | | ||
|
|
||
| The `token_update` WebSocket event carries the same per-turn token fields and | ||
| the cache hit rate to the browser. | ||
|
|
||
| ## Per-task context breakdown | ||
|
|
||
| The five buckets are estimated at **~4 bytes/token** (`usage.Estimate`) — there | ||
| is no bundled tokenizer, and a relative breakdown only needs a consistent | ||
| heuristic (the UI labels it "estimated"): | ||
|
|
||
| 1. **System prompt** = estimate(systemPrompt) − estimate(skill descriptions) | ||
| 2. **System tools** = Σ estimate(tool JSON) over built-in tools | ||
| 3. **MCP tools** = Σ estimate(tool JSON) over MCP tools | ||
| 4. **Skills** = estimate(skill descriptions) | ||
| 5. **Messages** = max(0, lastPromptTokens − buckets 1-4) | ||
|
|
||
| The four static buckets are computed on demand from the live agent assembly | ||
| (`command/web.go`'s `breakdownFn`), which reads the captured `systemPrompt` / | ||
| `mcpTools` / `currentCM` / `skillLoader` by reference — so project switches and | ||
| MCP reloads are reflected with no cache to invalidate. The breakdown is only | ||
| meaningful for the **active** task; historical tasks return token totals + the | ||
| aggregate hit rate only (`is_active:false`). | ||
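The bucket arithmetic can be sketched as follows. The names echo `usage.Estimate` and `usage.EstimateBytes`, but these are stand-ins: rounding up to whole tokens and the `Breakdown` struct layout are assumptions.

```go
package main

import "fmt"

const bytesPerToken = 4 // the ~4 bytes/token heuristic from the text

// EstimateBytes converts a byte count to an estimated token count.
// Rounding up is an assumption of this sketch.
func EstimateBytes(n int) int {
	if n <= 0 {
		return 0
	}
	return (n + bytesPerToken - 1) / bytesPerToken
}

// Estimate applies the same heuristic to a string.
func Estimate(s string) int { return EstimateBytes(len(s)) }

// Breakdown holds the five buckets (hypothetical layout).
type Breakdown struct {
	SystemPrompt, SystemTools, MCPTools, Skills, Messages int
}

// SystemPromptTokens subtracts the skill descriptions, which are injected
// into the system prompt, so they are not counted twice.
func SystemPromptTokens(systemPrompt, skillDesc string) int {
	n := Estimate(systemPrompt) - Estimate(skillDesc)
	if n < 0 {
		return 0
	}
	return n
}

// WithMessages fills the Messages bucket as
// max(0, lastPromptTokens - the four static buckets).
func WithMessages(b Breakdown, lastPromptTokens int) Breakdown {
	static := b.SystemPrompt + b.SystemTools + b.MCPTools + b.Skills
	b.Messages = lastPromptTokens - static
	if b.Messages < 0 {
		b.Messages = 0
	}
	return b
}

func main() {
	static := Breakdown{SystemPrompt: 1200, SystemTools: 1000, MCPTools: 500, Skills: 300}
	fmt.Printf("%+v\n", WithMessages(static, 10000))
}
```

The clamp matters because the static buckets are estimates while `lastPromptTokens` is the provider's real count, so early in a task the estimate can exceed the reported prompt size.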
|
|
||
| ## Known limitations / future work | ||
|
|
||
| - **No `cache_creation` accounting** — blocked by the shared go-openai transport. | ||
| A native Anthropic transport could populate `CacheWriteTokens`. | ||
| - **Cost is not yet derived** — `registry.go`'s `ModelCost` | ||
| (Input/Output/CacheRead/CacheWrite) is not multiplied into the stats. A future | ||
| pass could price each event for a spend view. | ||
| - **Per-turn delta across process restart** — a turn that resumes in a new | ||
| process loses the in-memory start snapshot and may mis-count once. | ||
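For the cost item above, a pricing pass might look like the sketch below. It is not part of the PR. The `ModelCost` field names follow the text, but the unit (USD per million tokens) is an assumption, as is treating cached tokens as a subset of prompt tokens and reasoning tokens as already included in completion tokens.

```go
package main

import "fmt"

// ModelCost holds per-model prices. Field names follow the text above;
// USD per 1,000,000 tokens is an assumed unit.
type ModelCost struct {
	Input, Output, CacheRead, CacheWrite float64
}

// EventTokens is the subset of an event record needed for pricing.
type EventTokens struct {
	Prompt, Completion, Cached int64
}

// Price estimates the spend for one event. Cached tokens are billed at the
// cache-read rate and the rest of the prompt at the input rate (assumption).
func Price(ev EventTokens, c ModelCost) float64 {
	uncached := ev.Prompt - ev.Cached
	if uncached < 0 {
		uncached = 0
	}
	total := float64(uncached)*c.Input +
		float64(ev.Cached)*c.CacheRead +
		float64(ev.Completion)*c.Output
	return total / 1e6
}

func main() {
	// Illustrative prices only; not taken from any real price list.
	cost := ModelCost{Input: 2.0, Output: 8.0, CacheRead: 0.5}
	ev := EventTokens{Prompt: 1500, Completion: 300, Cached: 1300}
	fmt.Printf("$%.5f\n", Price(ev, cost))
}
```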
|
|
||
| ## Testing | ||
|
|
||
| Per the sandbox constraints (live servers can't bind sockets), the backend is | ||
| covered by in-process `httptest` (`internal/web/usage_test.go`) and unit tests | ||
| for aggregation/streaks (`internal/usage/usage_test.go`) and the token struct | ||
| (`internal/model/token_usage_test.go`). The frontend is verified via | ||
| `vue-tsc` + `vite build`. |
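An in-process check of this kind can be sketched with `net/http/httptest`, which exercises a handler without binding a socket. `statsHandler` is a hypothetical stand-in for the real `/api/usage/stats` handler; the default of 30 days and the response shape are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// statsHandler is a hypothetical stand-in for the real stats endpoint.
// It validates ?days=N and echoes the window it would aggregate over.
func statsHandler(w http.ResponseWriter, r *http.Request) {
	days := 30 // assumed default
	if v := r.URL.Query().Get("days"); v != "" {
		n, err := strconv.Atoi(v)
		if err != nil || n <= 0 {
			http.Error(w, "invalid days", http.StatusBadRequest)
			return
		}
		days = n
	}
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(map[string]int{"days": days})
}

// probe sends one request through the handler entirely in memory.
func probe(target string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, target, nil)
	rec := httptest.NewRecorder()
	statsHandler(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := probe("/api/usage/stats?days=7")
	fmt.Println(code, body)
}
```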
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2,6 +2,7 @@ package command | |
|
|
||
| import ( | ||
| "context" | ||
| "encoding/json" | ||
| "fmt" | ||
| "os/signal" | ||
| "path/filepath" | ||
|
|
@@ -29,10 +30,29 @@ import ( | |
| "github.com/cnjack/jcode/internal/skills" | ||
| "github.com/cnjack/jcode/internal/telemetry" | ||
| "github.com/cnjack/jcode/internal/tools" | ||
| "github.com/cnjack/jcode/internal/usage" | ||
| util "github.com/cnjack/jcode/internal/util" | ||
| "github.com/cnjack/jcode/internal/web" | ||
| ) | ||
|
|
||
| // estimateToolTokens approximates a tool's contribution to the context window | ||
| // from its serialized schema (name + description + parameters). ToolInfo's | ||
| // MarshalJSON includes the JSON-schema params, so one marshal captures it all. | ||
| func estimateToolTokens(ctx context.Context, t tool.BaseTool) int { | ||
| if t == nil { | ||
| return 0 | ||
| } | ||
| info, err := t.Info(ctx) | ||
| if err != nil || info == nil { | ||
| return 0 | ||
| } | ||
| raw, err := json.Marshal(info) | ||
| if err != nil { | ||
| return usage.EstimateBytes(len(info.Name) + len(info.Desc)) | ||
| } | ||
| return usage.EstimateBytes(len(raw)) | ||
| } | ||
|
|
||
| func NewWebCmd() *cobra.Command { | ||
| var port int | ||
| var host string | ||
|
|
@@ -421,6 +441,37 @@ func runWebServer(port int, host string, openBrowser bool) error { | |
| return newAg, newRec, nil | ||
| } | ||
|
|
||
| // breakdownFn estimates how the live agent's context window is partitioned | ||
| // across system prompt / built-in tools / MCP tools / skills. It reads the | ||
| // captured assembly variables (systemPrompt, mcpTools, currentCM, skillLoader) | ||
| // by reference, so project switches and MCP reloads are reflected without any | ||
| // cache to invalidate. Built-in tools = all tools minus MCP tools. | ||
| breakdownFn := func() usage.ContextBreakdown { | ||
| var b usage.ContextBreakdown | ||
| skillDesc := skillLoader.Descriptions() | ||
| b.SkillsTokens = usage.Estimate(skillDesc) | ||
| // Skills are injected into the system prompt, so subtract to avoid | ||
| // double-counting them in the system-prompt bucket. | ||
| b.SystemPromptTokens = usage.Estimate(systemPrompt) - b.SkillsTokens | ||
| if b.SystemPromptTokens < 0 { | ||
| b.SystemPromptTokens = 0 | ||
| } | ||
| for _, mt := range mcpTools { | ||
| b.MCPToolsTokens += estimateToolTokens(ctx, mt) | ||
| } | ||
| if currentCM != nil { | ||
| total := 0 | ||
| for _, at := range buildAllTools(currentCM) { | ||
| total += estimateToolTokens(ctx, at) | ||
| } | ||
| b.SystemToolsTokens = total - b.MCPToolsTokens | ||
| if b.SystemToolsTokens < 0 { | ||
| b.SystemToolsTokens = 0 | ||
| } | ||
| } | ||
| return b | ||
| } | ||
|
Comment on lines +444 to +473
This closure reads
||
|
|
||
| srv := web.NewServer(&web.ServerConfig{ | ||
| Port: port, | ||
| Host: host, | ||
|
|
@@ -450,6 +501,7 @@ func runWebServer(port int, host string, openBrowser bool) error { | |
| EventHandler: finalHandler, | ||
| NeedsSetup: needsSetup, | ||
| TokenUsage: agentTokenUsage, | ||
| ContextBreakdownFn: breakdownFn, | ||
| }) | ||
|
|
||
| // Set handler for approval routing. | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -449,3 +449,22 @@ func SessionsIndexPath() (string, error) { | |||||||||||||||||
| } | ||||||||||||||||||
| return filepath.Join(dir, "session.json"), nil | ||||||||||||||||||
| } | ||||||||||||||||||
|
|
||||||||||||||||||
| // UsageDir returns the path to the usage-statistics directory (~/.jcode/usage). | ||||||||||||||||||
| func UsageDir() (string, error) { | ||||||||||||||||||
| home, err := os.UserHomeDir() | ||||||||||||||||||
| if err != nil { | ||||||||||||||||||
| return "", fmt.Errorf("failed to get home directory: %w", err) | ||||||||||||||||||
| } | ||||||||||||||||||
|
Comment on lines +455 to +458
Use the project-standard wrapped error prefix. Line 457 uses a free-form wrapped error string; use the repo convention instead.
Suggested change
- return "", fmt.Errorf("failed to get home directory: %w", err)
+ return "", fmt.Errorf("usage_dir: %w", err)
Source: Coding guidelines
||||||||||||||||||
| return filepath.Join(home, configDir, "usage"), nil | ||||||||||||||||||
| } | ||||||||||||||||||
|
|
||||||||||||||||||
| // UsageEventsPath returns the path to the append-only usage event log | ||||||||||||||||||
| // (~/.jcode/usage/events.jsonl), one JSON line per recorded agent turn. | ||||||||||||||||||
| func UsageEventsPath() (string, error) { | ||||||||||||||||||
| dir, err := UsageDir() | ||||||||||||||||||
| if err != nil { | ||||||||||||||||||
| return "", err | ||||||||||||||||||
| } | ||||||||||||||||||
| return filepath.Join(dir, "events.jsonl"), nil | ||||||||||||||||||
| } | ||||||||||||||||||
Context breakdown currently misreports in plan mode.
`breakdownFn` always estimates from `systemPrompt` + `buildAllTools(currentCM)`, but the active agent in plan mode uses `planPrompt` + `buildPlanTools()`. This overstates static context usage and can skew `messages_tokens` and the capacity UI while in plan mode.
Suggested fix
  breakdownFn := func() usage.ContextBreakdown {
      var b usage.ContextBreakdown
+     prompt := systemPrompt
+     var toolsForMode []tool.BaseTool
+     if currentCM != nil {
+         if currentPlanMode {
+             prompt = planPrompt
+             toolsForMode = buildPlanTools()
+         } else {
+             toolsForMode = buildAllTools(currentCM)
+         }
+     }
      skillDesc := skillLoader.Descriptions()
-     b.SkillsTokens = usage.Estimate(skillDesc)
+     if !currentPlanMode {
+         b.SkillsTokens = usage.Estimate(skillDesc)
+     }
-     b.SystemPromptTokens = usage.Estimate(systemPrompt) - b.SkillsTokens
+     b.SystemPromptTokens = usage.Estimate(prompt) - b.SkillsTokens
      if b.SystemPromptTokens < 0 {
          b.SystemPromptTokens = 0
      }
-     for _, mt := range mcpTools {
-         b.MCPToolsTokens += estimateToolTokens(ctx, mt)
-     }
-     if currentCM != nil {
+     if !currentPlanMode {
+         for _, mt := range mcpTools {
+             b.MCPToolsTokens += estimateToolTokens(ctx, mt)
+         }
+     }
+     if len(toolsForMode) > 0 {
          total := 0
-         for _, at := range buildAllTools(currentCM) {
+         for _, at := range toolsForMode {
              total += estimateToolTokens(ctx, at)
          }
          b.SystemToolsTokens = total - b.MCPToolsTokens
          if b.SystemToolsTokens < 0 {
              b.SystemToolsTokens = 0
          }
      }
      return b
  }