diff --git a/docs/usage-stats.md b/docs/usage-stats.md
new file mode 100644
index 0000000..848e23c
--- /dev/null
+++ b/docs/usage-stats.md
@@ -0,0 +1,114 @@
+# Usage statistics
+
+jcode records token usage across every surface (TUI, web, ACP) and exposes two
+views in the web UI:
+
+- **Global stats** — a "Usage" tab in Settings: tokens used, sessions, turns,
+  active days, current streak, most-used model, an activity heatmap, a daily
+  token trend, and per-model / per-project breakdowns.
+- **Per-task context capacity** — a popover on the composer's token count: how
+  the current context window is split across Messages / System tools / MCP tools
+  / Skills / System prompt, plus the KV cache hit rate.
+
+## Data model
+
+### Token tracking (`internal/model`)
+
+`model.TokenUsage` accumulates per-call usage. Each call is recorded via
+`Add(AddParams{...})`, capturing:
+
+| field         | source (go-openai `Usage`)                       |
+|---------------|--------------------------------------------------|
+| Prompt        | `PromptTokens`                                   |
+| Completion    | `CompletionTokens`                               |
+| Total         | `TotalTokens`                                    |
+| Cached        | `PromptTokensDetails.CachedTokens` (cache-read)  |
+| Reasoning     | `CompletionTokensDetails.ReasoningTokens`        |
+| CacheWrite    | always 0 — see below                             |
+
+All providers go through one go-openai client. go-openai's `Usage` exposes
+**cache-read** (`cached_tokens`) and **reasoning** tokens, but **not**
+`cache_creation_input_tokens`. So `CacheWriteTokens` is reserved for a future
+native transport and stays 0 today.
+
+### Cache hit rate
+
+```
+cache hit rate = Σ cached / Σ prompt        (clamped to [0,1])
+```
+
+i.e. the fraction of prompt tokens served from the provider's KV cache. This is
+the only provider-portable definition given the wire constraint above.
+`CacheObserved()` reports whether any cached tokens have been seen; when it is
+false the UI shows "—" instead of 0%, since the provider may not report caching.
+
+### Event log (`internal/usage`)
+
+Global stats are persisted to an **append-only JSON-lines log** at
+`~/.jcode/usage/events.jsonl`, one line per agent turn:
+
+```json
+{"ts":1750531200,"date":"2026-06-21","project":"/path","session":"<uuid>","model":"glm-5.2","prompt":1500,"completion":300,"cached":1300,"reasoning":60,"total":1800,"calls":2}
+```
+
+Records are small and written with `O_APPEND`; on local filesystems such appends
+do not interleave in practice, so multiple jcode processes (TUI + web + ACP) can
+record concurrently without a read-modify-write race. Derived metrics (streak,
+active days, heatmap, per-model/project, cache rate) are computed on read by `usage.Aggregate`.
+
+Token fields are per-turn **deltas**: the runner snapshots the cumulative
+tracker at the start of a turn and records the difference at the end. Subagent
+and teammate tokens are rolled into the same log under the **leader** session's
+UUID so multi-agent work isn't undercounted.
+
+The session **count** is sourced from the session index
+(`session.ListAllSessions`), which is authoritative; the event log owns
+token/day metrics.
+
+## API
+
+| endpoint                      | returns                                           |
+|-------------------------------|---------------------------------------------------|
+| `GET /api/usage/stats?days=N` | global totals, streaks, heatmap (365d), trend (Nd), by-model, by-project |
+| `GET /api/tasks/{id}/stats`   | per-task context breakdown (active) or token rollup (historical) |
+| `GET /api/status`             | live token snapshot (extended with cache fields)  |
+
+The `token_update` WebSocket event pushes the live snapshot to the browser:
+last-call total, cumulative per-category counters, and the cache hit rate.
+
+## Per-task context breakdown
+
+The four static buckets are estimated at **~4 bytes/token** (`usage.Estimate`)
+and Messages is the remainder; there is no bundled tokenizer, and a relative
+breakdown only needs a consistent heuristic (the UI labels it "estimated"):
+
+1. **System prompt** = estimate(systemPrompt) − estimate(skill descriptions)
+2. **System tools** = Σ estimate(tool JSON) over built-in tools
+3. **MCP tools** = Σ estimate(tool JSON) over MCP tools
+4. **Skills** = estimate(skill descriptions)
+5. **Messages** = max(0, lastPromptTokens − buckets 1-4)
+
+The four static buckets are computed on demand from the live agent assembly
+(`command/web.go`'s `breakdownFn`), which reads the captured `systemPrompt` /
+`mcpTools` / `currentCM` / `skillLoader` by reference — so project switches and
+MCP reloads are reflected with no cache to invalidate. The breakdown is only
+meaningful for the **active** task; historical tasks return token totals + the
+aggregate hit rate only (`is_active:false`).
+
+## Known limitations / future work
+
+- **No `cache_creation` accounting** — blocked by the shared go-openai transport.
+  A native Anthropic transport could populate `CacheWriteTokens`.
+- **Cost is not yet derived** — `registry.go`'s `ModelCost`
+  (Input/Output/CacheRead/CacheWrite) is not multiplied into the stats. A future
+  pass could price each event for a spend view.
+- **Per-turn delta across process restart** — a turn that resumes in a new
+  process loses the in-memory start snapshot and may mis-count once.
+
+## Testing
+
+Per the sandbox constraints (live servers can't bind sockets), the backend is
+covered by in-process `httptest` (`internal/web/usage_test.go`) and unit tests
+for aggregation/streaks (`internal/usage/usage_test.go`) and the token struct
+(`internal/model/token_usage_test.go`). The frontend is verified via
+`vue-tsc` + `vite build`.
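The append-then-aggregate design described in the doc above can be sketched roughly as follows. This is a minimal illustration and not the `internal/usage` package: `event`, `summary`, `appendEvent`, `readEvents`, and `aggregate` are hypothetical names, only a few fields of the sample record are modeled, and the file lives in the OS temp directory rather than under `~/.jcode/usage`.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// event is a hypothetical subset of one events.jsonl record.
type event struct {
	Date   string `json:"date"`
	Prompt int    `json:"prompt"`
	Cached int    `json:"cached"`
	Total  int    `json:"total"`
}

// summary holds metrics derived at read time; none of it is persisted.
type summary struct {
	TotalTokens, ActiveDays int
	CacheHitRate            float64
}

// appendEvent encodes ev as one JSON line and appends it with O_APPEND,
// using a single Write call so the record goes out in one piece.
func appendEvent(path string, ev event) error {
	line, err := json.Marshal(ev)
	if err != nil {
		return err
	}
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.Write(append(line, '\n'))
	return err
}

// readEvents loads every well-formed line and skips malformed ones.
func readEvents(path string) ([]event, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var out []event
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		var ev event
		if json.Unmarshal(sc.Bytes(), &ev) == nil {
			out = append(out, ev)
		}
	}
	return out, sc.Err()
}

// aggregate derives totals, distinct active days and the cache hit rate.
func aggregate(events []event) summary {
	var s summary
	var prompt, cached int
	days := map[string]bool{}
	for _, ev := range events {
		s.TotalTokens += ev.Total
		prompt += ev.Prompt
		cached += ev.Cached
		days[ev.Date] = true
	}
	s.ActiveDays = len(days)
	if prompt > 0 {
		s.CacheHitRate = float64(cached) / float64(prompt)
	}
	if s.CacheHitRate > 1 {
		s.CacheHitRate = 1 // clamp, as the doc's formula does
	}
	return s
}

func main() {
	path := filepath.Join(os.TempDir(), "jcode-usage-example.jsonl")
	os.Remove(path) // start from an empty log
	defer os.Remove(path)
	for _, ev := range []event{
		{Date: "2026-06-21", Prompt: 1500, Cached: 1300, Total: 1800},
		{Date: "2026-06-22", Prompt: 500, Cached: 0, Total: 600},
	} {
		if err := appendEvent(path, ev); err != nil {
			panic(err)
		}
	}
	events, err := readEvents(path)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", aggregate(events))
}
```

Because nothing derived is stored, a new metric only needs a change on the read path; the trade-off is that every stats request rescans the log.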
diff --git a/internal/command/web.go b/internal/command/web.go
index 3bd0530..887abdc 100644
--- a/internal/command/web.go
+++ b/internal/command/web.go
@@ -2,6 +2,7 @@ package command
 
 import (
 	"context"
+	"encoding/json"
 	"fmt"
 	"os/signal"
 	"path/filepath"
@@ -29,10 +30,29 @@ import (
 	"github.com/cnjack/jcode/internal/skills"
 	"github.com/cnjack/jcode/internal/telemetry"
 	"github.com/cnjack/jcode/internal/tools"
+	"github.com/cnjack/jcode/internal/usage"
 	util "github.com/cnjack/jcode/internal/util"
 	"github.com/cnjack/jcode/internal/web"
 )
 
+// estimateToolTokens approximates a tool's contribution to the context window
+// from its serialized schema (name + description + parameters). ToolInfo's
+// MarshalJSON includes the JSON-schema params, so one marshal captures it all.
+func estimateToolTokens(ctx context.Context, t tool.BaseTool) int {
+	if t == nil {
+		return 0
+	}
+	info, err := t.Info(ctx)
+	if err != nil || info == nil {
+		return 0
+	}
+	raw, err := json.Marshal(info)
+	if err != nil {
+		return usage.EstimateBytes(len(info.Name) + len(info.Desc))
+	}
+	return usage.EstimateBytes(len(raw))
+}
+
 func NewWebCmd() *cobra.Command {
 	var port int
 	var host string
@@ -421,6 +441,37 @@ func runWebServer(port int, host string, openBrowser bool) error {
 		return newAg, newRec, nil
 	}
 
+	// breakdownFn estimates how the live agent's context window is partitioned
+	// across system prompt / built-in tools / MCP tools / skills. It reads the
+	// captured assembly variables (systemPrompt, mcpTools, currentCM, skillLoader)
+	// by reference, so project switches and MCP reloads are reflected without any
+	// cache to invalidate. Built-in tools = all tools minus MCP tools.
+	breakdownFn := func() usage.ContextBreakdown {
+		var b usage.ContextBreakdown
+		skillDesc := skillLoader.Descriptions()
+		b.SkillsTokens = usage.Estimate(skillDesc)
+		// Skills are injected into the system prompt, so subtract to avoid
+		// double-counting them in the system-prompt bucket.
+		b.SystemPromptTokens = usage.Estimate(systemPrompt) - b.SkillsTokens
+		if b.SystemPromptTokens < 0 {
+			b.SystemPromptTokens = 0
+		}
+		for _, mt := range mcpTools {
+			b.MCPToolsTokens += estimateToolTokens(ctx, mt)
+		}
+		if currentCM != nil {
+			total := 0
+			for _, at := range buildAllTools(currentCM) {
+				total += estimateToolTokens(ctx, at)
+			}
+			b.SystemToolsTokens = total - b.MCPToolsTokens
+			if b.SystemToolsTokens < 0 {
+				b.SystemToolsTokens = 0
+			}
+		}
+		return b
+	}
+
 	srv := web.NewServer(&web.ServerConfig{
 		Port:               port,
 		Host:               host,
@@ -450,6 +501,7 @@ func runWebServer(port int, host string, openBrowser bool) error {
 		EventHandler:       finalHandler,
 		NeedsSetup:         needsSetup,
 		TokenUsage:         agentTokenUsage,
+		ContextBreakdownFn: breakdownFn,
 	})
 
 	// Set handler for approval routing.
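The bucket arithmetic behind `breakdownFn` and the Messages remainder can be sketched as below. This is a simplified illustration under stated assumptions: `estimateBytes` rounds up at 4 bytes per token (the real `usage.EstimateBytes` is not shown in this diff and may round differently), `split`, `nonNeg`, and `breakdown` are hypothetical names, and tool sizes are passed as pre-summed byte totals rather than estimated per tool.

```go
package main

import "fmt"

// estimateBytes converts a byte count to tokens at about 4 bytes per token.
// Rounding up is an assumption of this sketch, not taken from usage.EstimateBytes.
func estimateBytes(n int) int {
	if n <= 0 {
		return 0
	}
	return (n + 3) / 4
}

// nonNeg floors a derived bucket at zero so estimation error never goes negative.
func nonNeg(n int) int {
	if n < 0 {
		return 0
	}
	return n
}

// breakdown holds the five buckets, all in estimated tokens.
type breakdown struct {
	SystemPrompt, SystemTools, MCPTools, Skills, Messages int
}

// split derives the buckets from serialized sizes (in bytes) and the prompt
// token count reported for the last call.
func split(promptBytes, skillBytes, allToolBytes, mcpToolBytes, lastPromptTokens int) breakdown {
	var b breakdown
	b.Skills = estimateBytes(skillBytes)
	// Skill descriptions are part of the system prompt, so subtract them.
	b.SystemPrompt = nonNeg(estimateBytes(promptBytes) - b.Skills)
	b.MCPTools = estimateBytes(mcpToolBytes)
	// Built-in tools are all tools minus the MCP tools.
	b.SystemTools = nonNeg(estimateBytes(allToolBytes) - b.MCPTools)
	static := b.SystemPrompt + b.SystemTools + b.MCPTools + b.Skills
	// Messages is whatever the static buckets do not account for.
	b.Messages = nonNeg(lastPromptTokens - static)
	return b
}

func main() {
	fmt.Printf("%+v\n", split(8000, 1200, 20000, 6000, 12000))
}
```

Since the static buckets are estimates while `lastPromptTokens` is provider-reported, the remainder absorbs all estimation error, which is one reason to floor it at zero.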
diff --git a/internal/config/config.go b/internal/config/config.go
index 8feac32..734c735 100644
--- a/internal/config/config.go
+++ b/internal/config/config.go
@@ -449,3 +449,22 @@ func SessionsIndexPath() (string, error) {
 	}
 	return filepath.Join(dir, "session.json"), nil
 }
+
+// UsageDir returns the path to the usage-statistics directory (~/.jcode/usage).
+func UsageDir() (string, error) {
+	home, err := os.UserHomeDir()
+	if err != nil {
+		return "", fmt.Errorf("failed to get home directory: %w", err)
+	}
+	return filepath.Join(home, configDir, "usage"), nil
+}
+
+// UsageEventsPath returns the path to the append-only usage event log
+// (~/.jcode/usage/events.jsonl), one JSON line per recorded agent turn.
+func UsageEventsPath() (string, error) {
+	dir, err := UsageDir()
+	if err != nil {
+		return "", err
+	}
+	return filepath.Join(dir, "events.jsonl"), nil
+}
diff --git a/internal/handler/handler.go b/internal/handler/handler.go
index a862ca7..f6cc1bb 100644
--- a/internal/handler/handler.go
+++ b/internal/handler/handler.go
@@ -52,9 +52,28 @@ type AgentEventHandler interface {
 	RequestApproval(ctx context.Context, req ApprovalRequest) (ApprovalResponse, error)
 }
 
-// TokenUsage carries token usage info.
+// TokenUsage carries token usage info to the UI surfaces.
+//
+// TotalTokens is the LAST call's total — i.e. current context-window
+// occupancy, used to drive the context-usage bar. The remaining token counters
+// (Prompt/Completion/Cached/Reasoning/CacheWrite/CallCount) are CUMULATIVE for
+// the run's tracker, and CacheHitRate is the cumulative cached/prompt ratio.
+// CacheSupported is false when the provider never reported any cached tokens,
+// so the UI can show "—" instead of a misleading 0%.
+//
+// NOTE: the field names, order, and types here must stay identical to
+// WebTokenData (internal/handler/web.go) so OnTokenUpdate's direct struct
+// conversion keeps compiling. Struct tags may differ.
 type TokenUsage struct {
 	TotalTokens       int64
+	PromptTokens      int64
+	CompletionTokens  int64
+	CachedTokens      int64
+	ReasoningTokens   int64
+	CacheWriteTokens  int64
+	CallCount         int64
+	CacheHitRate      float64
+	CacheSupported    bool
 	ModelContextLimit int // 0 if unknown
 }
 
diff --git a/internal/handler/web.go b/internal/handler/web.go
index a62ee31..e6acd1c 100644
--- a/internal/handler/web.go
+++ b/internal/handler/web.go
@@ -224,10 +224,21 @@ type WebToolResultData struct {
 	ToolCallID    string `json:"tool_call_id,omitempty"`
 }
 
-// WebTokenData carries token usage.
+// WebTokenData carries token usage to the browser. Field names, order, and
+// types MUST match handler.TokenUsage so OnTokenUpdate's WebTokenData(info)
+// conversion compiles (struct tags are ignored). total_tokens is current
+// context occupancy (last call); the rest are cumulative for the session.
 type WebTokenData struct {
-	TotalTokens       int64 `json:"total_tokens"`
-	ModelContextLimit int   `json:"model_context_limit"`
+	TotalTokens       int64   `json:"total_tokens"`
+	PromptTokens      int64   `json:"prompt_tokens"`
+	CompletionTokens  int64   `json:"completion_tokens"`
+	CachedTokens      int64   `json:"cached_tokens"`
+	ReasoningTokens   int64   `json:"reasoning_tokens"`
+	CacheWriteTokens  int64   `json:"cache_write_tokens"`
+	CallCount         int64   `json:"call_count"`
+	CacheHitRate      float64 `json:"cache_hit_rate"`
+	CacheSupported    bool    `json:"cache_supported"`
+	ModelContextLimit int     `json:"model_context_limit"`
 }
 
 // WebSubagentData carries subagent lifecycle events.
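The coupling between `handler.TokenUsage` and `WebTokenData` relies on Go's rule that two struct types convert directly when their fields match in name, type, and order, with tags ignored. A small sketch of that rule, using trimmed stand-in types (`internalUsage`, `wireUsage`, `toWire`, and `encode` are hypothetical and not part of this PR):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// internalUsage stands in for handler.TokenUsage (trimmed, no tags).
type internalUsage struct {
	TotalTokens  int64
	CachedTokens int64
	CacheHitRate float64
}

// wireUsage stands in for WebTokenData: same field names, order and types,
// differing only in struct tags.
type wireUsage struct {
	TotalTokens  int64   `json:"total_tokens"`
	CachedTokens int64   `json:"cached_tokens"`
	CacheHitRate float64 `json:"cache_hit_rate"`
}

// toWire uses a plain type conversion. It stops compiling as soon as a field
// is added, renamed, retyped or reordered on only one side.
func toWire(u internalUsage) wireUsage {
	return wireUsage(u)
}

// encode returns the JSON the browser would receive, or "" on error.
func encode(u internalUsage) string {
	b, err := json.Marshal(toWire(u))
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(encode(internalUsage{TotalTokens: 1800, CachedTokens: 1300, CacheHitRate: 0.5}))
}
```

The compile-time failure is the useful part: a field added to one struct but not the other is caught at build time instead of being silently dropped from the payload.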
diff --git a/internal/model/chatmodel.go b/internal/model/chatmodel.go
index 28d0bba..1f99e40 100644
--- a/internal/model/chatmodel.go
+++ b/internal/model/chatmodel.go
@@ -16,42 +16,89 @@ import (
 	"github.com/cnjack/jcode/internal/config"
 )
 
-// TokenUsage tracks token consumption across all API calls
+// TokenUsage tracks token consumption across all API calls.
+//
+// CachedTokens is the cache-READ portion of the prompt (tokens served from the
+// provider's KV cache). CacheWriteTokens is the cache-CREATION portion; it is
+// 0 today because the shared go-openai transport does not surface
+// cache_creation_input_tokens, and is kept as a forward-compatible field.
+// ReasoningTokens is the reasoning/thinking subset of the completion.
 type TokenUsage struct {
 	PromptTokens     int64
 	CompletionTokens int64
 	TotalTokens      int64
 	CachedTokens     int64
+	ReasoningTokens  int64
+	CacheWriteTokens int64
+	CallCount        int64 // number of API calls recorded (averages denominator)
 	LastTotalTokens  int64
 	// Per-call "last" values for tracing/observability.
 	lastPrompt     int64
 	lastCompletion int64
 	lastCached     int64
+	lastReasoning  int64
+	lastCacheWrite int64
 	byModel        map[string]int64
 	mu             sync.RWMutex
 }
 
-// TokenUsageDetail holds per-call token usage details for tracing/observability.
+// AddParams carries one API call's token usage. Using a struct keeps the
+// growing set of token categories from turning Add into a long positional list.
+type AddParams struct {
+	Prompt     int
+	Completion int
+	Total      int
+	Cached     int
+	Reasoning  int
+	CacheWrite int
+}
+
+// TokenUsageDetail holds a token usage snapshot for tracing/observability and
+// for JSON transport to the UI. Reasoning/cache-write/call-count carry
+// omitempty, so they are dropped from the JSON when zero: per-call telemetry
+// stays compact, and cumulative snapshots (GetFull) include them once nonzero.
 type TokenUsageDetail struct {
 	PromptTokens     int `json:"prompt_tokens"`
 	CompletionTokens int `json:"completion_tokens"`
 	TotalTokens      int `json:"total_tokens"`
 	CachedTokens     int `json:"cached_tokens"`
+	ReasoningTokens  int `json:"reasoning_tokens,omitempty"`
+	CacheWriteTokens int `json:"cache_write_tokens,omitempty"`
+	CallCount        int `json:"call_count,omitempty"`
+}
+
+// Minus returns the per-field difference d-prev, used to derive the token delta
+// of a single agent run from cumulative snapshots.
+func (d TokenUsageDetail) Minus(prev TokenUsageDetail) TokenUsageDetail {
+	return TokenUsageDetail{
+		PromptTokens:     d.PromptTokens - prev.PromptTokens,
+		CompletionTokens: d.CompletionTokens - prev.CompletionTokens,
+		TotalTokens:      d.TotalTokens - prev.TotalTokens,
+		CachedTokens:     d.CachedTokens - prev.CachedTokens,
+		ReasoningTokens:  d.ReasoningTokens - prev.ReasoningTokens,
+		CacheWriteTokens: d.CacheWriteTokens - prev.CacheWriteTokens,
+		CallCount:        d.CallCount - prev.CallCount,
+	}
 }
 
 // TokenTracker is a global token usage tracker
 var TokenTracker = &TokenUsage{}
 
-// Add adds token usage to the tracker
-func (t *TokenUsage) Add(prompt, completion, total, cached int) {
-	atomic.AddInt64(&t.PromptTokens, int64(prompt))
-	atomic.AddInt64(&t.CompletionTokens, int64(completion))
-	atomic.AddInt64(&t.TotalTokens, int64(total))
-	atomic.AddInt64(&t.CachedTokens, int64(cached))
-	atomic.StoreInt64(&t.LastTotalTokens, int64(total))
-	atomic.StoreInt64(&t.lastPrompt, int64(prompt))
-	atomic.StoreInt64(&t.lastCompletion, int64(completion))
-	atomic.StoreInt64(&t.lastCached, int64(cached))
+// Add records one API call's token usage.
+func (t *TokenUsage) Add(p AddParams) {
+	atomic.AddInt64(&t.PromptTokens, int64(p.Prompt))
+	atomic.AddInt64(&t.CompletionTokens, int64(p.Completion))
+	atomic.AddInt64(&t.TotalTokens, int64(p.Total))
+	atomic.AddInt64(&t.CachedTokens, int64(p.Cached))
+	atomic.AddInt64(&t.ReasoningTokens, int64(p.Reasoning))
+	atomic.AddInt64(&t.CacheWriteTokens, int64(p.CacheWrite))
+	atomic.AddInt64(&t.CallCount, 1)
+	atomic.StoreInt64(&t.LastTotalTokens, int64(p.Total))
+	atomic.StoreInt64(&t.lastPrompt, int64(p.Prompt))
+	atomic.StoreInt64(&t.lastCompletion, int64(p.Completion))
+	atomic.StoreInt64(&t.lastCached, int64(p.Cached))
+	atomic.StoreInt64(&t.lastReasoning, int64(p.Reasoning))
+	atomic.StoreInt64(&t.lastCacheWrite, int64(p.CacheWrite))
 }
 
 // Get returns the current token usage
@@ -73,7 +120,48 @@ func (t *TokenUsage) GetLastDetail() *TokenUsageDetail {
 		CompletionTokens: int(atomic.LoadInt64(&t.lastCompletion)),
 		TotalTokens:      int(atomic.LoadInt64(&t.LastTotalTokens)),
 		CachedTokens:     int(atomic.LoadInt64(&t.lastCached)),
+		ReasoningTokens:  int(atomic.LoadInt64(&t.lastReasoning)),
+		CacheWriteTokens: int(atomic.LoadInt64(&t.lastCacheWrite)),
+	}
+}
+
+// GetFull returns a cumulative snapshot of all tracked token usage.
+func (t *TokenUsage) GetFull() TokenUsageDetail {
+	return TokenUsageDetail{
+		PromptTokens:     int(atomic.LoadInt64(&t.PromptTokens)),
+		CompletionTokens: int(atomic.LoadInt64(&t.CompletionTokens)),
+		TotalTokens:      int(atomic.LoadInt64(&t.TotalTokens)),
+		CachedTokens:     int(atomic.LoadInt64(&t.CachedTokens)),
+		ReasoningTokens:  int(atomic.LoadInt64(&t.ReasoningTokens)),
+		CacheWriteTokens: int(atomic.LoadInt64(&t.CacheWriteTokens)),
+		CallCount:        int(atomic.LoadInt64(&t.CallCount)),
+	}
+}
+
+// CacheHitRate returns the cumulative KV cache hit rate, defined as
+// cached / prompt — the fraction of prompt tokens served from the provider's
+// cache. Returns 0 when no prompt tokens have been recorded. The result is
+// clamped to [0,1] to stay robust against provider quirks.
+func (t *TokenUsage) CacheHitRate() float64 {
+	prompt := atomic.LoadInt64(&t.PromptTokens)
+	if prompt <= 0 {
+		return 0
 	}
+	r := float64(atomic.LoadInt64(&t.CachedTokens)) / float64(prompt)
+	switch {
+	case r < 0:
+		return 0
+	case r > 1:
+		return 1
+	default:
+		return r
+	}
+}
+
+// CacheObserved reports whether any cache-read tokens have been seen, used to
+// distinguish "cache hit rate is 0%" from "this provider never reports caching".
+func (t *TokenUsage) CacheObserved() bool {
+	return atomic.LoadInt64(&t.CachedTokens) > 0
 }
 
 // Reset resets the token tracker
@@ -82,6 +170,15 @@ func (t *TokenUsage) Reset() {
 	atomic.StoreInt64(&t.CompletionTokens, 0)
 	atomic.StoreInt64(&t.TotalTokens, 0)
 	atomic.StoreInt64(&t.CachedTokens, 0)
+	atomic.StoreInt64(&t.ReasoningTokens, 0)
+	atomic.StoreInt64(&t.CacheWriteTokens, 0)
+	atomic.StoreInt64(&t.CallCount, 0)
+	atomic.StoreInt64(&t.LastTotalTokens, 0)
+	atomic.StoreInt64(&t.lastPrompt, 0)
+	atomic.StoreInt64(&t.lastCompletion, 0)
+	atomic.StoreInt64(&t.lastCached, 0)
+	atomic.StoreInt64(&t.lastReasoning, 0)
+	atomic.StoreInt64(&t.lastCacheWrite, 0)
 	t.mu.Lock()
 	t.byModel = nil
 	t.mu.Unlock()
@@ -180,6 +277,53 @@ func (m *chatModel) WithTools(tools []*schema.ToolInfo) (einomodel.ToolCallingCh
 	return &chatModel{client: m.client, model: m.model, tools: oaiTools}, nil
 }
 
+// extractUsage maps a go-openai Usage onto AddParams. cache_creation tokens are
+// not exposed by go-openai's schema, so CacheWrite is always 0 here; reasoning
+// and cache-read are picked up from the *TokensDetails sub-objects when present.
+func extractUsage(u openai.Usage) AddParams {
+	p := AddParams{
+		Prompt:     u.PromptTokens,
+		Completion: u.CompletionTokens,
+		Total:      u.TotalTokens,
+	}
+	if u.PromptTokensDetails != nil {
+		p.Cached = u.PromptTokensDetails.CachedTokens
+	}
+	if u.CompletionTokensDetails != nil {
+		p.Reasoning = u.CompletionTokensDetails.ReasoningTokens
+	}
+	// Some providers (e.g. some GLM/OpenAI-compatible gateways) omit total_tokens
+	// and only return prompt/completion. Derive it so the context indicator works.
+	if p.Total == 0 {
+		p.Total = p.Prompt + p.Completion
+	}
+	return p
+}
+
+// hasUsage reports whether a Usage object carries any token counts, tolerating
+// providers that populate prompt/completion but leave total_tokens at 0.
+func hasUsage(u openai.Usage) bool {
+	return u.PromptTokens > 0 || u.CompletionTokens > 0 || u.TotalTokens > 0
+}
+
+// recordUsage feeds one API call's usage into both the global tracker and the
+// per-agent tracker on the context (when present), preserving the dual-tracker
+// pattern.
+func (m *chatModel) recordUsage(ctx context.Context, u openai.Usage) {
+	p := extractUsage(u)
+	TokenTracker.Add(p)
+	TokenTracker.AddByModel(m.model, p.Prompt, p.Completion, p.Total)
+	if local := TokenTrackerFromContext(ctx); local != nil {
+		local.Add(p)
+		local.AddByModel(m.model, p.Prompt, p.Completion, p.Total)
+	}
+	// Real-time UI refresh: fire after the trackers are updated so the callback
+	// reads the just-recorded usage.
+	if notify := UsageNotifierFromContext(ctx); notify != nil {
+		notify()
+	}
+}
+
 func (m *chatModel) Generate(ctx context.Context, input []*schema.Message, opts ...einomodel.Option) (*schema.Message, error) {
 	req := m.buildRequest(input, false, opts...)
 	config.Logger().Printf("[chatmodel] Generate start (model: %s)", m.model)
@@ -190,17 +334,9 @@ func (m *chatModel) Generate(ctx context.Context, input []*schema.Message, opts
 		return nil, err
 	}
 	// Track token usage
-	if resp.Usage.TotalTokens > 0 {
-		cached := 0
-		if resp.Usage.PromptTokensDetails != nil {
-			cached = resp.Usage.PromptTokensDetails.CachedTokens
-		}
-		TokenTracker.Add(resp.Usage.PromptTokens, resp.Usage.CompletionTokens, resp.Usage.TotalTokens, cached)
-		TokenTracker.AddByModel(m.model, resp.Usage.PromptTokens, resp.Usage.CompletionTokens, resp.Usage.TotalTokens)
-		if local := TokenTrackerFromContext(ctx); local != nil {
-			local.Add(resp.Usage.PromptTokens, resp.Usage.CompletionTokens, resp.Usage.TotalTokens, cached)
-			local.AddByModel(m.model, resp.Usage.PromptTokens, resp.Usage.CompletionTokens, resp.Usage.TotalTokens)
-		}
+	config.Logger().Printf("[chatmodel] Generate usage: prompt=%d completion=%d total=%d", resp.Usage.PromptTokens, resp.Usage.CompletionTokens, resp.Usage.TotalTokens)
+	if hasUsage(resp.Usage) {
+		m.recordUsage(ctx, resp.Usage)
 	}
 	if len(resp.Choices) == 0 {
 		return nil, fmt.Errorf("empty response from model")
@@ -229,10 +365,12 @@ func (m *chatModel) Stream(ctx context.Context, input []*schema.Message, opts ..
 		defer func() { _ = stream.Close() }()
 		chunkCount := 0
 		toolCallSeen := false
+		usageSeen := false
+		var lastUsage *openai.Usage
 		for {
 			resp, err := stream.Recv()
 			if err == io.EOF {
-				config.Logger().Printf("[chatmodel] Stream EOF after %d chunks, toolCallSeen=%v", chunkCount, toolCallSeen)
+				config.Logger().Printf("[chatmodel] Stream EOF after %d chunks, toolCallSeen=%v usageSeen=%v", chunkCount, toolCallSeen, usageSeen)
 				break
 			}
 			if err != nil {
@@ -241,18 +379,15 @@ func (m *chatModel) Stream(ctx context.Context, input []*schema.Message, opts ..
 				break
 			}
 			chunkCount++
-			// Track token usage from stream response
-			if resp.Usage != nil && resp.Usage.TotalTokens > 0 {
-				cached := 0
-				if resp.Usage.PromptTokensDetails != nil {
-					cached = resp.Usage.PromptTokensDetails.CachedTokens
-				}
-				TokenTracker.Add(resp.Usage.PromptTokens, resp.Usage.CompletionTokens, resp.Usage.TotalTokens, cached)
-				TokenTracker.AddByModel(m.model, resp.Usage.PromptTokens, resp.Usage.CompletionTokens, resp.Usage.TotalTokens)
-				if local := TokenTrackerFromContext(ctx); local != nil {
-					local.Add(resp.Usage.PromptTokens, resp.Usage.CompletionTokens, resp.Usage.TotalTokens, cached)
-					local.AddByModel(m.model, resp.Usage.PromptTokens, resp.Usage.CompletionTokens, resp.Usage.TotalTokens)
-				}
+			// Capture token usage from the stream. Some providers only send
+			// usage in a final chunk (requires stream_options.include_usage),
+			// and some omit total_tokens — hasUsage tolerates both. We record
+			// only the LAST usage once the stream ends, so providers that repeat
+			// (cumulative) usage per chunk aren't counted multiple times.
+			if resp.Usage != nil && hasUsage(*resp.Usage) {
+				u := *resp.Usage
+				lastUsage = &u
+				usageSeen = true
 			}
 			if len(resp.Choices) == 0 {
 				continue
@@ -272,6 +407,12 @@ func (m *chatModel) Stream(ctx context.Context, input []*schema.Message, opts ..
 			}
 			sw.Send(msg, nil)
 		}
+		// Record once at stream end so the per-call notifier fires a single
+		// token_update for this call.
+		if lastUsage != nil {
+			config.Logger().Printf("[chatmodel] Stream usage: prompt=%d completion=%d total=%d", lastUsage.PromptTokens, lastUsage.CompletionTokens, lastUsage.TotalTokens)
+			m.recordUsage(ctx, *lastUsage)
+		}
 	}()
 
 	return sr, nil
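The "record only the last usage" rule in `Stream`, combined with the `total_tokens` fallback from `extractUsage`, can be illustrated in isolation. The sketch below uses hypothetical stand-ins (`usage`, `chunk`, `finalUsage`) rather than the go-openai types, and models a provider that repeats cumulative usage on every chunk.

```go
package main

import "fmt"

// usage is a hypothetical stand-in for the provider's usage object.
type usage struct{ Prompt, Completion, Total int }

// chunk is a hypothetical stream chunk; Usage is nil on most chunks.
type chunk struct {
	Text  string
	Usage *usage
}

// hasUsage tolerates providers that leave Total at 0.
func hasUsage(u usage) bool {
	return u.Prompt > 0 || u.Completion > 0 || u.Total > 0
}

// finalUsage returns the last non-empty usage seen in the stream, deriving
// Total from Prompt+Completion when the provider omitted it. The bool is
// false when no chunk carried usage at all.
func finalUsage(chunks []chunk) (usage, bool) {
	var last *usage
	for _, c := range chunks {
		if c.Usage != nil && hasUsage(*c.Usage) {
			u := *c.Usage // copy so later chunk reuse cannot alias it
			last = &u
		}
	}
	if last == nil {
		return usage{}, false
	}
	if last.Total == 0 {
		last.Total = last.Prompt + last.Completion
	}
	return *last, true
}

func main() {
	stream := []chunk{
		{Text: "Hel", Usage: &usage{Prompt: 100, Completion: 5, Total: 105}},
		{Text: "lo"},
		{Text: "!", Usage: &usage{Prompt: 100, Completion: 20}}, // Total omitted
	}
	u, ok := finalUsage(stream)
	fmt.Println(u, ok)
}
```

Summing every usage chunk here would count the 100 prompt tokens twice; keeping only the last value counts them once and yields a single notifier call per stream.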
diff --git a/internal/model/token_ctx.go b/internal/model/token_ctx.go
index 3e0c9a5..0fef97c 100644
--- a/internal/model/token_ctx.go
+++ b/internal/model/token_ctx.go
@@ -15,3 +15,19 @@ func TokenTrackerFromContext(ctx context.Context) *TokenUsage {
 	v, _ := ctx.Value(tokenCtxKey{}).(*TokenUsage)
 	return v
 }
+
+type usageNotifierKey struct{}
+
+// WithUsageNotifier attaches a callback that chatModel.Generate/Stream invokes
+// after each API call's usage has been recorded. UIs use it to refresh the
+// token/context display in real time during a run, not just at turn end. The
+// model layer stays provider/UI-agnostic — it only fires the opaque callback.
+func WithUsageNotifier(ctx context.Context, fn func()) context.Context {
+	return context.WithValue(ctx, usageNotifierKey{}, fn)
+}
+
+// UsageNotifierFromContext retrieves the per-call usage notifier, if any.
+func UsageNotifierFromContext(ctx context.Context) func() {
+	fn, _ := ctx.Value(usageNotifierKey{}).(func())
+	return fn
+}
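A caller-side view of the notifier pattern may help when reviewing `WithUsageNotifier`. The sketch below reimplements the two context helpers locally so it runs on its own; `withNotifier`, `notifierFrom`, `fakeCall`, and `runCalls` are hypothetical names, and `fakeCall` merely simulates a model call that records usage before firing the callback.

```go
package main

import (
	"context"
	"fmt"
)

type notifierKey struct{}

// withNotifier stores an opaque callback on the context.
func withNotifier(ctx context.Context, fn func()) context.Context {
	return context.WithValue(ctx, notifierKey{}, fn)
}

// notifierFrom returns the callback, or nil when none was attached.
func notifierFrom(ctx context.Context) func() {
	fn, _ := ctx.Value(notifierKey{}).(func())
	return fn
}

// fakeCall simulates one model call: it records usage first and only then
// fires the notifier, so the callback observes the updated total.
func fakeCall(ctx context.Context, total *int, tokens int) {
	*total += tokens
	if notify := notifierFrom(ctx); notify != nil {
		notify()
	}
}

// runCalls performs one fake call per entry and returns the running totals
// the notifier observed after each call.
func runCalls(tokens []int) []int {
	var total int
	var seen []int
	ctx := withNotifier(context.Background(), func() {
		seen = append(seen, total)
	})
	for _, n := range tokens {
		fakeCall(ctx, &total, n)
	}
	return seen
}

func main() {
	fmt.Println(runCalls([]int{100, 50, 25}))

	// Without a notifier the call still records usage and skips the callback.
	total := 0
	fakeCall(context.Background(), &total, 10)
	fmt.Println(total)
}
```

Passing a bare `func()` keeps the model layer free of UI types; the closure decides what to read and where to send it.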
diff --git a/internal/model/token_usage_test.go b/internal/model/token_usage_test.go
new file mode 100644
index 0000000..a411309
--- /dev/null
+++ b/internal/model/token_usage_test.go
@@ -0,0 +1,87 @@
+package model
+
+import "testing"
+
+func TestTokenUsage_AddAndGetFull(t *testing.T) {
+	u := &TokenUsage{}
+	u.Add(AddParams{Prompt: 1000, Completion: 200, Total: 1200, Cached: 800, Reasoning: 50})
+	u.Add(AddParams{Prompt: 500, Completion: 100, Total: 600, Cached: 500, Reasoning: 10})
+
+	got := u.GetFull()
+	if got.PromptTokens != 1500 {
+		t.Errorf("PromptTokens = %d, want 1500", got.PromptTokens)
+	}
+	if got.CompletionTokens != 300 {
+		t.Errorf("CompletionTokens = %d, want 300", got.CompletionTokens)
+	}
+	if got.TotalTokens != 1800 {
+		t.Errorf("TotalTokens = %d, want 1800", got.TotalTokens)
+	}
+	if got.CachedTokens != 1300 {
+		t.Errorf("CachedTokens = %d, want 1300", got.CachedTokens)
+	}
+	if got.ReasoningTokens != 60 {
+		t.Errorf("ReasoningTokens = %d, want 60", got.ReasoningTokens)
+	}
+	if got.CallCount != 2 {
+		t.Errorf("CallCount = %d, want 2", got.CallCount)
+	}
+}
+
+func TestTokenUsage_CacheHitRate(t *testing.T) {
+	tests := []struct {
+		name   string
+		params []AddParams
+		want   float64
+		obs    bool
+	}{
+		{"no calls", nil, 0, false},
+		{"half cached", []AddParams{{Prompt: 1000, Cached: 500}}, 0.5, true},
+		{
+			"token weighted across calls",
+			[]AddParams{{Prompt: 1000, Cached: 900}, {Prompt: 1000, Cached: 100}},
+			0.5, true,
+		},
+		{"no cache reported", []AddParams{{Prompt: 1000, Cached: 0}}, 0, false},
+		{"clamp over one", []AddParams{{Prompt: 100, Cached: 250}}, 1, true},
+	}
+	for _, tc := range tests {
+		t.Run(tc.name, func(t *testing.T) {
+			u := &TokenUsage{}
+			for _, p := range tc.params {
+				u.Add(p)
+			}
+			if got := u.CacheHitRate(); got != tc.want {
+				t.Errorf("CacheHitRate() = %v, want %v", got, tc.want)
+			}
+			if got := u.CacheObserved(); got != tc.obs {
+				t.Errorf("CacheObserved() = %v, want %v", got, tc.obs)
+			}
+		})
+	}
+}
+
+func TestTokenUsageDetail_Minus(t *testing.T) {
+	cur := TokenUsageDetail{PromptTokens: 1500, CompletionTokens: 300, TotalTokens: 1800, CachedTokens: 1300, ReasoningTokens: 60, CallCount: 3}
+	prev := TokenUsageDetail{PromptTokens: 1000, CompletionTokens: 200, TotalTokens: 1200, CachedTokens: 800, ReasoningTokens: 50, CallCount: 2}
+	d := cur.Minus(prev)
+	if d.PromptTokens != 500 || d.CompletionTokens != 100 || d.TotalTokens != 600 || d.CachedTokens != 500 || d.ReasoningTokens != 10 || d.CallCount != 1 {
+		t.Errorf("Minus() = %+v, want deltas {500,100,600,500,10,1}", d)
+	}
+}
+
+func TestTokenUsage_Reset(t *testing.T) {
+	u := &TokenUsage{}
+	u.Add(AddParams{Prompt: 100, Completion: 20, Total: 120, Cached: 80, Reasoning: 5})
+	u.AddByModel("m", 100, 20, 120)
+	u.Reset()
+	if got := u.GetFull(); got.PromptTokens != 0 || got.CachedTokens != 0 || got.CallCount != 0 {
+		t.Errorf("after Reset GetFull() = %+v, want zero", got)
+	}
+	if u.GetByModel() != nil {
+		t.Errorf("after Reset GetByModel() should be nil")
+	}
+	if u.CacheObserved() {
+		t.Errorf("after Reset CacheObserved() should be false")
+	}
+}
diff --git a/internal/runner/runner.go b/internal/runner/runner.go
index ffbf911..69584cd 100644
--- a/internal/runner/runner.go
+++ b/internal/runner/runner.go
@@ -15,6 +15,7 @@ import (
 	"github.com/cnjack/jcode/internal/session"
 	"github.com/cnjack/jcode/internal/telemetry"
 	"github.com/cnjack/jcode/internal/tools"
+	"github.com/cnjack/jcode/internal/usage"
 )
 
 // Run executes the agent for a single turn, wrapping the response with a
@@ -37,6 +38,21 @@ func Run(
 	if tokenUsage != nil {
 		ctx = internalmodel.WithTokenTracker(ctx, tokenUsage)
 	}
+	// Snapshot cumulative usage so we can record this turn's delta on completion.
+	var startSnap internalmodel.TokenUsageDetail
+	if tokenUsage != nil {
+		startSnap = tokenUsage.GetFull()
+	}
+	// Resolve the context limit once (config + registry lookup) and reuse it for
+	// every live update below.
+	ctxLimit := modelContextLimit()
+	// Real-time token display: push a fresh snapshot after every LLM call (not
+	// just at turn end) so the UI's context indicator ticks up live during a run.
+	if tokenUsage != nil {
+		ctx = internalmodel.WithUsageNotifier(ctx, func() {
+			h.OnTokenUpdate(buildTokenUsage(tokenUsage, ctxLimit))
+		})
+	}
 	h.OnAgentStart()
 
 	resp, done := runInner(ctx, ag, messages, h, rec)
@@ -117,18 +133,16 @@ todoLoop:
 		}
 	}
 
-	// Send token usage update before signalling done.
-	var lastTotalTokens int64
-	if tokenUsage != nil {
-		lastTotalTokens = tokenUsage.GetLastTotal()
-	}
+	// Send a final token usage update before signalling done. Prefer the
+	// context-local tracker (per-agent) and fall back to the passed-in one.
+	tracker := tokenUsage
 	if local := internalmodel.TokenTrackerFromContext(ctx); local != nil {
-		lastTotalTokens = local.GetLastTotal()
+		tracker = local
 	}
-	h.OnTokenUpdate(handler.TokenUsage{
-		TotalTokens:       lastTotalTokens,
-		ModelContextLimit: modelContextLimit(),
-	})
+	h.OnTokenUpdate(buildTokenUsage(tracker, ctxLimit))
+
+	// Persist this turn's token delta to the global usage log for stats.
+	recordUsageTurn(tokenUsage, startSnap, rec)
 
 	h.OnAgentDone(nil)
 	return resp
@@ -321,6 +335,26 @@ func runInner(
 	return assistantText.String(), false
 }
 
+// buildTokenUsage snapshots a tracker into a handler.TokenUsage. TotalTokens is
+// the last call's total (current context occupancy); the rest are cumulative.
+// Safe to call from any goroutine (the tracker uses atomics).
+func buildTokenUsage(tracker *internalmodel.TokenUsage, ctxLimit int) handler.TokenUsage {
+	tu := handler.TokenUsage{ModelContextLimit: ctxLimit}
+	if tracker != nil {
+		full := tracker.GetFull()
+		tu.TotalTokens = tracker.GetLastTotal()
+		tu.PromptTokens = int64(full.PromptTokens)
+		tu.CompletionTokens = int64(full.CompletionTokens)
+		tu.CachedTokens = int64(full.CachedTokens)
+		tu.ReasoningTokens = int64(full.ReasoningTokens)
+		tu.CacheWriteTokens = int64(full.CacheWriteTokens)
+		tu.CallCount = int64(full.CallCount)
+		tu.CacheHitRate = tracker.CacheHitRate()
+		tu.CacheSupported = tracker.CacheObserved()
+	}
+	return tu
+}
+
 func modelContextLimit() int {
 	cfg, err := config.LoadConfig()
 	if err != nil {
@@ -330,3 +364,31 @@ func modelContextLimit() int {
 	registry := internalmodel.NewModelRegistryWithConfig(cfg)
 	return internalmodel.ResolveContextLimit(registry, cfg, provider, modelName)
 }
+
+// recordUsageTurn appends this turn's token delta (cumulative-now minus the
+// start-of-turn snapshot) to the global usage log. Best-effort: a nil tracker,
+// an empty delta, or a write error never affects the run.
+func recordUsageTurn(tracker *internalmodel.TokenUsage, start internalmodel.TokenUsageDetail, rec *session.Recorder) {
+	if tracker == nil {
+		return
+	}
+	delta := tracker.GetFull().Minus(start)
+	if delta.TotalTokens <= 0 && delta.PromptTokens <= 0 {
+		return
+	}
+	ev := usage.Event{
+		Prompt:     delta.PromptTokens,
+		Completion: delta.CompletionTokens,
+		Cached:     delta.CachedTokens,
+		Reasoning:  delta.ReasoningTokens,
+		CacheWrite: delta.CacheWriteTokens,
+		Total:      delta.TotalTokens,
+		Calls:      delta.CallCount,
+	}
+	if rec != nil {
+		ev.Session = rec.UUID()
+		ev.Project = rec.Project()
+		ev.Model = rec.Model()
+	}
+	usage.RecordEvent(ev)
+}
diff --git a/internal/session/session.go b/internal/session/session.go
index 634f1c2..79e9db4 100644
--- a/internal/session/session.go
+++ b/internal/session/session.go
@@ -164,6 +164,15 @@ func NewRecorder(project, provider, model string) (*Recorder, error) {
 // UUID returns the session identifier.
 func (r *Recorder) UUID() string { return r.uuid }
 
+// Project returns the workspace path this recorder is scoped to.
+func (r *Recorder) Project() string { return r.project }
+
+// Provider returns the provider the session was opened with.
+func (r *Recorder) Provider() string { return r.provider }
+
+// Model returns the model the session was opened with.
+func (r *Recorder) Model() string { return r.model }
+
 // ValidateSessionID checks that a session ID is safe for use as a filename.
 // It rejects empty IDs, path traversal sequences, and path separators.
 func ValidateSessionID(id string) error {
diff --git a/internal/team/manager.go b/internal/team/manager.go
index d005972..0f4a434 100644
--- a/internal/team/manager.go
+++ b/internal/team/manager.go
@@ -23,6 +23,7 @@ import (
 	internalmodel "github.com/cnjack/jcode/internal/model"
 	"github.com/cnjack/jcode/internal/session"
 	"github.com/cnjack/jcode/internal/telemetry"
+	"github.com/cnjack/jcode/internal/usage"
 )
 
 const (
@@ -55,6 +56,9 @@ type TeammateState struct {
 	AgentType   string
 	Permission  string
 	TokenUsage  *internalmodel.TokenUsage
+	// LastUsage snapshots the teammate's cumulative usage at the last global
+	// usage-log write, so each turn records only its delta.
+	LastUsage internalmodel.TokenUsageDetail
 }
 
 // ManagerDeps holds dependencies injected into the TeamManager.
@@ -750,6 +754,27 @@ func (m *Manager) runAgentTurn(ctx context.Context, state *TeammateState) (strin
 		})
 	}
 
+	// Roll this teammate's per-turn token delta into the global usage log under
+	// the leader's session so team work counts toward global stats.
+	if state.TokenUsage != nil && m.deps.LeaderSessionUUID != "" {
+		full := state.TokenUsage.GetFull()
+		delta := full.Minus(state.LastUsage)
+		state.LastUsage = full
+		if delta.TotalTokens > 0 {
+			usage.RecordEvent(usage.Event{
+				Session:    m.deps.LeaderSessionUUID,
+				Model:      state.Model,
+				Prompt:     delta.PromptTokens,
+				Completion: delta.CompletionTokens,
+				Cached:     delta.CachedTokens,
+				Reasoning:  delta.ReasoningTokens,
+				CacheWrite: delta.CacheWriteTokens,
+				Total:      delta.TotalTokens,
+				Calls:      delta.CallCount,
+			})
+		}
+	}
+
 	endTrace()
 	return result.String(), nil
 }
diff --git a/internal/telemetry/langfuse.go b/internal/telemetry/langfuse.go
index ca2560e..d0fc145 100644
--- a/internal/telemetry/langfuse.go
+++ b/internal/telemetry/langfuse.go
@@ -182,7 +182,8 @@ func (t *LangfuseTracer) buildMiddleware(useParentSpan bool) adk.AgentMiddleware
 						TotalTokens:      d.TotalTokens,
 					}
 					metadata = map[string]string{
-						"cached_tokens": fmt.Sprintf("%d", d.CachedTokens),
+						"cached_tokens":    fmt.Sprintf("%d", d.CachedTokens),
+						"reasoning_tokens": fmt.Sprintf("%d", d.ReasoningTokens),
 					}
 				}
 			}
diff --git a/internal/tools/subagent.go b/internal/tools/subagent.go
index 285a721..43ec159 100644
--- a/internal/tools/subagent.go
+++ b/internal/tools/subagent.go
@@ -18,6 +18,7 @@ import (
 	internalmodel "github.com/cnjack/jcode/internal/model"
 	"github.com/cnjack/jcode/internal/session"
 	"github.com/cnjack/jcode/internal/telemetry"
+	"github.com/cnjack/jcode/internal/usage"
 )
 
 const (
@@ -344,6 +345,27 @@ func (s *subagentTool) runSubagent(ctx context.Context, ag *adk.ChatModelAgent,
 		}
 	}
 
+	// Roll this subagent's tokens into the global usage log under the leader's
+	// session so subagent-heavy work isn't undercounted. The tracker is fresh
+	// per run, so its cumulative snapshot IS this run's delta.
+	if s.deps.Recorder != nil {
+		d := tokenUsage.GetFull()
+		if d.TotalTokens > 0 {
+			usage.RecordEvent(usage.Event{
+				Session:    s.deps.Recorder.UUID(),
+				Project:    s.deps.Recorder.Project(),
+				Model:      s.deps.Recorder.Model(),
+				Prompt:     d.PromptTokens,
+				Completion: d.CompletionTokens,
+				Cached:     d.CachedTokens,
+				Reasoning:  d.ReasoningTokens,
+				CacheWrite: d.CacheWriteTokens,
+				Total:      d.TotalTokens,
+				Calls:      d.CallCount,
+			})
+		}
+	}
+
 	return assistantText.String()
 }
 
diff --git a/internal/usage/estimate.go b/internal/usage/estimate.go
new file mode 100644
index 0000000..fa22c92
--- /dev/null
+++ b/internal/usage/estimate.go
@@ -0,0 +1,38 @@
+package usage
+
+// Token estimation for the per-task context-capacity breakdown. There is no
+// universal tokenizer across providers (GLM, Anthropic, OpenAI all differ), and
+// bundling a tokenizer is a heavy dependency for what is only a relative
+// breakdown. ~4 bytes/token is the well-known rough average for English+code;
+// the UI labels these numbers as estimates. Consistency across buckets matters
+// more than absolute accuracy here.
+
+// EstimateBytes approximates the token count of n bytes (~4 bytes/token, rounded up).
+func EstimateBytes(n int) int {
+	if n <= 0 {
+		return 0
+	}
+	return (n + 3) / 4
+}
+
+// Estimate approximates the token count of a string.
+func Estimate(s string) int { return EstimateBytes(len(s)) }
+
+// ContextBreakdown partitions a context window into the categories that make it
+// up. The four static buckets (system prompt / tools / MCP tools / skills) are
+// computed from the live agent assembly; MessagesTokens and ContextLimit are
+// filled in at query time.
+type ContextBreakdown struct {
+	SystemPromptTokens int `json:"system_prompt_tokens"`
+	SystemToolsTokens  int `json:"system_tools_tokens"`
+	MCPToolsTokens     int `json:"mcp_tools_tokens"`
+	SkillsTokens       int `json:"skills_tokens"`
+	MessagesTokens     int `json:"messages_tokens"`
+	ContextLimit       int `json:"context_limit"`
+}
+
+// StaticTotal is the sum of the four assembly-time buckets (everything except
+// the conversation messages).
+func (b ContextBreakdown) StaticTotal() int {
+	return b.SystemPromptTokens + b.SystemToolsTokens + b.MCPToolsTokens + b.SkillsTokens
+}
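Reviewer note: the estimate and the messages-bucket derivation (last prompt minus the static buckets, as done in `handleTaskStats`) can be tried in isolation. The following is a minimal sketch, not the package itself: `estimateBytes`, `breakdown`, and `messagesTokens` are local stand-ins written for illustration, and the byte sizes and the `lastPrompt` value are made-up sample inputs.

```go
package main

import "fmt"

// estimateBytes mirrors the ~4 bytes/token heuristic, rounding up.
// Local stand-in for illustration only.
func estimateBytes(n int) int {
	if n <= 0 {
		return 0
	}
	return (n + 3) / 4
}

// breakdown is a trimmed, hypothetical copy of the four static buckets.
type breakdown struct {
	SystemPrompt, SystemTools, MCPTools, Skills int
}

// staticTotal sums everything except the conversation messages.
func (b breakdown) staticTotal() int {
	return b.SystemPrompt + b.SystemTools + b.MCPTools + b.Skills
}

// messagesTokens attributes whatever the last prompt held beyond the static
// buckets to the conversation. It never goes negative, because the static
// buckets are estimates and can exceed the provider-reported prompt size.
func messagesTokens(lastPrompt int, b breakdown) int {
	if m := lastPrompt - b.staticTotal(); m > 0 {
		return m
	}
	return 0
}

func main() {
	// Sample byte sizes (assumed values, not measured from jcode).
	b := breakdown{
		SystemPrompt: estimateBytes(4000),
		SystemTools:  estimateBytes(10001),
		MCPTools:     estimateBytes(1200),
		Skills:       estimateBytes(800),
	}
	lastPrompt := 6000 // provider-reported prompt tokens of the last call (assumed)
	fmt.Println("static:", b.staticTotal())
	fmt.Println("messages:", messagesTokens(lastPrompt, b))
}
```

Because the static buckets are byte-based estimates while the prompt size comes from the provider, the messages figure is only as accurate as the estimate; the clamp keeps a too-large estimate from producing a negative bucket.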
diff --git a/internal/usage/event.go b/internal/usage/event.go
new file mode 100644
index 0000000..36799f4
--- /dev/null
+++ b/internal/usage/event.go
@@ -0,0 +1,138 @@
+// Package usage records and aggregates token-usage statistics across all jcode
+// surfaces (TUI, web, ACP). It uses an append-only JSON-lines event log
+// (~/.jcode/usage/events.jsonl), one line per recorded turn. Each record is one
+// small O_APPEND write, which local POSIX filesystems do not interleave in
+// practice, so several jcode processes can record concurrently without a
+// read-modify-write race. Derived metrics are computed at read time by Aggregate.
+package usage
+
+import (
+	"bufio"
+	"bytes"
+	"encoding/json"
+	"os"
+	"path/filepath"
+	"sync"
+	"time"
+
+	"github.com/cnjack/jcode/internal/config"
+)
+
+// Event is a single recorded agent turn's token usage. Token fields are the
+// per-turn delta (not cumulative).
+type Event struct {
+	TS         int64  `json:"ts"`   // unix seconds
+	Date       string `json:"date"` // YYYY-MM-DD, local time
+	Project    string `json:"project,omitempty"`
+	Session    string `json:"session,omitempty"` // session UUID
+	Model      string `json:"model,omitempty"`
+	Prompt     int    `json:"prompt"`
+	Completion int    `json:"completion"`
+	Cached     int    `json:"cached"`
+	Reasoning  int    `json:"reasoning,omitempty"`
+	CacheWrite int    `json:"cache_write,omitempty"`
+	Total      int    `json:"total"`
+	Calls      int    `json:"calls,omitempty"` // API calls in this turn
+}
+
+// RecordEvent stamps ev with the current time and appends it to the default
+// store. Callers fill in the session/project/model + token deltas; TS/Date are
+// set here. Best-effort: errors are swallowed so stats never break a run.
+func RecordEvent(ev Event) {
+	ts := time.Now()
+	ev.TS = ts.Unix()
+	ev.Date = ts.Format(dateLayout)
+	_ = Default().Record(ev)
+}
+
+// Store is an append-only event-log writer/reader.
+type Store struct {
+	path string
+	mu   sync.Mutex // serialises in-process appends
+}
+
+// NewStore returns a Store backed by the given file path.
+func NewStore(path string) *Store { return &Store{path: path} }
+
+var (
+	defaultStore *Store
+	defaultOnce  sync.Once
+)
+
+// Default returns the process-wide store bound to ~/.jcode/usage/events.jsonl.
+// If the path cannot be resolved, the returned store is a no-op.
+func Default() *Store {
+	defaultOnce.Do(func() {
+		path, err := config.UsageEventsPath()
+		if err != nil {
+			path = ""
+		}
+		defaultStore = &Store{path: path}
+	})
+	return defaultStore
+}
+
+// Record appends one event. Turns with no token usage are dropped. A nil or
+// pathless store is a no-op so callers never need to guard.
+func (s *Store) Record(ev Event) error {
+	if s == nil || s.path == "" {
+		return nil
+	}
+	if ev.Total <= 0 && ev.Prompt <= 0 && ev.Completion <= 0 {
+		return nil
+	}
+	line, err := json.Marshal(ev)
+	if err != nil {
+		return err
+	}
+	line = append(line, '\n')
+
+	s.mu.Lock()
+	defer s.mu.Unlock()
+	if err := os.MkdirAll(filepath.Dir(s.path), 0o755); err != nil {
+		return err
+	}
+	f, err := os.OpenFile(s.path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
+	if err != nil {
+		return err
+	}
+	defer func() { _ = f.Close() }()
+	_, err = f.Write(line)
+	return err
+}
+
+// Load reads all events with Date >= since (YYYY-MM-DD). An empty since loads
+// everything. A missing log file yields an empty slice, not an error.
+// Malformed lines are skipped so a single bad write can't break stats.
+func (s *Store) Load(since string) ([]Event, error) {
+	if s == nil || s.path == "" {
+		return nil, nil
+	}
+	f, err := os.Open(s.path)
+	if err != nil {
+		if os.IsNotExist(err) {
+			return nil, nil
+		}
+		return nil, err
+	}
+	defer func() { _ = f.Close() }()
+
+	var out []Event
+	sc := bufio.NewScanner(f)
+	sc.Buffer(make([]byte, 0, 64*1024), 4*1024*1024)
+	for sc.Scan() {
+		line := bytes.TrimSpace(sc.Bytes())
+		if len(line) == 0 {
+			continue
+		}
+		var ev Event
+		if json.Unmarshal(line, &ev) != nil {
+			continue
+		}
+		if since != "" && ev.Date < since {
+			continue
+		}
+		out = append(out, ev)
+	}
+	return out, sc.Err()
+}
diff --git a/internal/usage/stats.go b/internal/usage/stats.go
new file mode 100644
index 0000000..276cf37
--- /dev/null
+++ b/internal/usage/stats.go
@@ -0,0 +1,197 @@
+package usage
+
+import (
+	"sort"
+	"time"
+)
+
+// dateLayout is the canonical local date format used for event bucketing.
+const dateLayout = "2006-01-02"
+
+// Today returns the current local date as YYYY-MM-DD.
+func Today() string { return time.Now().Format(dateLayout) }
+
+// Totals holds cumulative token counters over the aggregated window.
+type Totals struct {
+	Total      int64 `json:"total"`
+	Prompt     int64 `json:"prompt"`
+	Completion int64 `json:"completion"`
+	Cached     int64 `json:"cached"`
+	Reasoning  int64 `json:"reasoning"`
+	CacheWrite int64 `json:"cache_write"`
+	Calls      int64 `json:"calls"`
+	Turns      int64 `json:"turns"` // number of recorded events
+}
+
+// DayBucket is one day's rolled-up usage.
+type DayBucket struct {
+	Date   string `json:"date"`
+	Tokens int64  `json:"tokens"`
+	Turns  int64  `json:"turns"` // recorded turns that day
+	Calls  int64  `json:"calls"`
+}
+
+// Share is a labelled token total (per-model or per-project).
+type Share struct {
+	Name   string  `json:"name"`
+	Tokens int64   `json:"tokens"`
+	Share  float64 `json:"share"` // fraction of the grand total, 0-1
+}
+
+// Aggregated is the full derived view over a set of events.
+type Aggregated struct {
+	Totals         Totals
+	ActiveDays     int
+	CurrentStreak  int
+	LongestStreak  int
+	MostUsedModel  string
+	CacheHitRate   float64 // cached / prompt, clamped to [0,1]
+	CacheSupported bool
+	Days           map[string]*DayBucket
+	ByModel        []Share
+	ByProject      []Share
+}
+
+// Aggregate reduces raw events into derived statistics. `today` (YYYY-MM-DD,
+// local) anchors the current-streak computation so the function stays pure and
+// testable.
+func Aggregate(events []Event, today string) Aggregated {
+	agg := Aggregated{Days: make(map[string]*DayBucket)}
+	byModel := map[string]int64{}
+	byProject := map[string]int64{}
+
+	for _, ev := range events {
+		agg.Totals.Total += int64(ev.Total)
+		agg.Totals.Prompt += int64(ev.Prompt)
+		agg.Totals.Completion += int64(ev.Completion)
+		agg.Totals.Cached += int64(ev.Cached)
+		agg.Totals.Reasoning += int64(ev.Reasoning)
+		agg.Totals.CacheWrite += int64(ev.CacheWrite)
+		agg.Totals.Calls += int64(ev.Calls)
+		agg.Totals.Turns++
+
+		d := agg.Days[ev.Date]
+		if d == nil {
+			d = &DayBucket{Date: ev.Date}
+			agg.Days[ev.Date] = d
+		}
+		d.Tokens += int64(ev.Total)
+		d.Turns++
+		d.Calls += int64(ev.Calls)
+
+		if ev.Model != "" {
+			byModel[ev.Model] += int64(ev.Total)
+		}
+		if ev.Project != "" {
+			byProject[ev.Project] += int64(ev.Total)
+		}
+	}
+
+	agg.ActiveDays = len(agg.Days)
+	agg.CurrentStreak = currentStreak(agg.Days, today)
+	agg.LongestStreak = longestStreak(agg.Days)
+	agg.CacheSupported = agg.Totals.Cached > 0
+	if agg.Totals.Prompt > 0 {
+		r := float64(agg.Totals.Cached) / float64(agg.Totals.Prompt)
+		switch {
+		case r < 0:
+			r = 0
+		case r > 1:
+			r = 1
+		}
+		agg.CacheHitRate = r
+	}
+	agg.ByModel = toShares(byModel, agg.Totals.Total)
+	agg.ByProject = toShares(byProject, agg.Totals.Total)
+	if len(agg.ByModel) > 0 {
+		agg.MostUsedModel = agg.ByModel[0].Name
+	}
+	return agg
+}
+
+// Trend returns the day buckets in ascending date order.
+func (a Aggregated) Trend() []DayBucket {
+	out := make([]DayBucket, 0, len(a.Days))
+	for _, d := range a.Days {
+		out = append(out, *d)
+	}
+	sort.Slice(out, func(i, j int) bool { return out[i].Date < out[j].Date })
+	return out
+}
+
+// toShares turns a label→tokens map into Shares, most tokens first (ties by name).
+func toShares(m map[string]int64, grand int64) []Share {
+	out := make([]Share, 0, len(m))
+	for name, tok := range m {
+		s := Share{Name: name, Tokens: tok}
+		if grand > 0 {
+			s.Share = float64(tok) / float64(grand)
+		}
+		out = append(out, s)
+	}
+	sort.Slice(out, func(i, j int) bool {
+		if out[i].Tokens != out[j].Tokens {
+			return out[i].Tokens > out[j].Tokens
+		}
+		return out[i].Name < out[j].Name
+	})
+	return out
+}
+
+// currentStreak counts consecutive active days ending at `today`, or at
+// yesterday if today has no activity yet (so a streak isn't considered broken
+// before the user has worked today). Returns 0 if the most recent activity is
+// older than yesterday.
+func currentStreak(days map[string]*DayBucket, today string) int {
+	if len(days) == 0 {
+		return 0
+	}
+	cur, err := time.Parse(dateLayout, today)
+	if err != nil {
+		return 0
+	}
+	if !active(days, cur) {
+		cur = cur.AddDate(0, 0, -1)
+	}
+	streak := 0
+	for active(days, cur) {
+		streak++
+		cur = cur.AddDate(0, 0, -1)
+	}
+	return streak
+}
+
+// longestStreak finds the longest run of consecutive calendar days with
+// activity.
+func longestStreak(days map[string]*DayBucket) int {
+	if len(days) == 0 {
+		return 0
+	}
+	dates := make([]time.Time, 0, len(days))
+	for d := range days {
+		t, err := time.Parse(dateLayout, d)
+		if err != nil {
+			continue
+		}
+		dates = append(dates, t)
+	}
+	sort.Slice(dates, func(i, j int) bool { return dates[i].Before(dates[j]) })
+
+	best, run := min(1, len(dates)), 1 // best is 0 when no date parsed
+	for i := 1; i < len(dates); i++ {
+		if dates[i].Equal(dates[i-1].AddDate(0, 0, 1)) {
+			run++
+		} else {
+			run = 1
+		}
+		if run > best {
+			best = run
+		}
+	}
+	return best
+}
+
+func active(days map[string]*DayBucket, t time.Time) bool {
+	d, ok := days[t.Format(dateLayout)]
+	return ok && d.Tokens > 0
+}
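Reviewer note: the streak rule (count back from today, or from yesterday when today has no activity yet) and the clamped cache hit rate are easy to check by hand. The sketch below is a simplified stand-in, not the package code: `streak` takes a plain set of active dates instead of day buckets, and `hitRate` takes raw totals; both names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

const layout = "2006-01-02"

// streak counts consecutive active days ending at today, or at yesterday if
// today has no activity yet. It returns 0 when the latest activity is older
// than yesterday or when today cannot be parsed.
func streak(active map[string]bool, today string) int {
	cur, err := time.Parse(layout, today)
	if err != nil {
		return 0
	}
	if !active[cur.Format(layout)] {
		cur = cur.AddDate(0, 0, -1)
	}
	n := 0
	for active[cur.Format(layout)] {
		n++
		cur = cur.AddDate(0, 0, -1)
	}
	return n
}

// hitRate is cached/prompt clamped to [0,1]; it is 0 when there are no
// prompt tokens.
func hitRate(cached, prompt int64) float64 {
	if prompt <= 0 {
		return 0
	}
	r := float64(cached) / float64(prompt)
	if r < 0 {
		return 0
	}
	if r > 1 {
		return 1
	}
	return r
}

func main() {
	days := map[string]bool{"2026-06-19": true, "2026-06-20": true}
	fmt.Println(streak(days, "2026-06-21"))   // 2: today is empty, streak ends yesterday
	fmt.Println(streak(days, "2026-06-22"))   // 0: last activity is two days old
	fmt.Printf("%.4f\n", hitRate(2000, 2800)) // 0.7143, the figure used in TestAggregate_Totals
}
```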
diff --git a/internal/usage/usage_test.go b/internal/usage/usage_test.go
new file mode 100644
index 0000000..060c470
--- /dev/null
+++ b/internal/usage/usage_test.go
@@ -0,0 +1,169 @@
+package usage
+
+import (
+	"path/filepath"
+	"testing"
+)
+
+func ev(date, model, project string, total, prompt, cached int) Event {
+	return Event{Date: date, Model: model, Project: project, Total: total, Prompt: prompt, Cached: cached, Completion: total - prompt, Calls: 1}
+}
+
+func TestStore_RecordAndLoad(t *testing.T) {
+	path := filepath.Join(t.TempDir(), "nested", "events.jsonl")
+	s := NewStore(path)
+
+	if err := s.Record(ev("2026-06-20", "glm-5.2", "/p", 100, 80, 60)); err != nil {
+		t.Fatalf("Record: %v", err)
+	}
+	if err := s.Record(ev("2026-06-21", "glm-5.2", "/p", 200, 150, 120)); err != nil {
+		t.Fatalf("Record: %v", err)
+	}
+	// Empty turn is dropped.
+	if err := s.Record(Event{Date: "2026-06-21"}); err != nil {
+		t.Fatalf("Record empty: %v", err)
+	}
+
+	all, err := s.Load("")
+	if err != nil {
+		t.Fatalf("Load: %v", err)
+	}
+	if len(all) != 2 {
+		t.Fatalf("Load() returned %d events, want 2 (empty turn should be dropped)", len(all))
+	}
+
+	since, err := s.Load("2026-06-21")
+	if err != nil {
+		t.Fatalf("Load since: %v", err)
+	}
+	if len(since) != 1 || since[0].Total != 200 {
+		t.Fatalf("Load(since) = %+v, want 1 event with total 200", since)
+	}
+}
+
+func TestStore_LoadMissingFile(t *testing.T) {
+	s := NewStore(filepath.Join(t.TempDir(), "nope.jsonl"))
+	got, err := s.Load("")
+	if err != nil || got != nil {
+		t.Fatalf("Load missing = (%v, %v), want (nil, nil)", got, err)
+	}
+}
+
+func TestAggregate_Totals(t *testing.T) {
+	events := []Event{
+		ev("2026-06-20", "glm-5.2", "/a", 1000, 800, 600),
+		ev("2026-06-21", "glm-5.2", "/a", 2000, 1600, 1400),
+		ev("2026-06-21", "claude", "/b", 500, 400, 0),
+	}
+	a := Aggregate(events, "2026-06-21")
+
+	if a.Totals.Total != 3500 {
+		t.Errorf("Total = %d, want 3500", a.Totals.Total)
+	}
+	if a.Totals.Turns != 3 {
+		t.Errorf("Turns = %d, want 3", a.Totals.Turns)
+	}
+	if a.ActiveDays != 2 {
+		t.Errorf("ActiveDays = %d, want 2", a.ActiveDays)
+	}
+	// cached/prompt = (600+1400+0)/(800+1600+400) = 2000/2800
+	if got := a.CacheHitRate; got < 0.714 || got > 0.715 {
+		t.Errorf("CacheHitRate = %v, want ~0.7143", got)
+	}
+	if !a.CacheSupported {
+		t.Error("CacheSupported = false, want true")
+	}
+	// glm-5.2 = 3000, claude = 500 → glm most used.
+	if a.MostUsedModel != "glm-5.2" {
+		t.Errorf("MostUsedModel = %q, want glm-5.2", a.MostUsedModel)
+	}
+	if len(a.ByModel) != 2 || a.ByModel[0].Name != "glm-5.2" || a.ByModel[0].Tokens != 3000 {
+		t.Errorf("ByModel = %+v, want glm-5.2 first with 3000", a.ByModel)
+	}
+	if len(a.ByProject) != 2 || a.ByProject[0].Name != "/a" || a.ByProject[0].Tokens != 3000 {
+		t.Errorf("ByProject = %+v, want /a first with 3000", a.ByProject)
+	}
+	// Day buckets.
+	if d := a.Days["2026-06-21"]; d == nil || d.Tokens != 2500 || d.Turns != 2 {
+		t.Errorf("Days[2026-06-21] = %+v, want tokens 2500 turns 2", d)
+	}
+}
+
+func TestAggregate_Streaks(t *testing.T) {
+	tests := []struct {
+		name       string
+		dates      []string
+		today      string
+		wantCur    int
+		wantLong   int
+		wantActive int
+	}{
+		{"empty", nil, "2026-06-21", 0, 0, 0},
+		{"single today", []string{"2026-06-21"}, "2026-06-21", 1, 1, 1},
+		{
+			"three in a row ending today",
+			[]string{"2026-06-19", "2026-06-20", "2026-06-21"},
+			"2026-06-21", 3, 3, 3,
+		},
+		{
+			"ends yesterday, today empty still counts",
+			[]string{"2026-06-19", "2026-06-20"},
+			"2026-06-21", 2, 2, 2,
+		},
+		{
+			"gap breaks current streak",
+			[]string{"2026-06-10", "2026-06-20", "2026-06-21"},
+			"2026-06-21", 2, 2, 3,
+		},
+		{
+			"stale: last activity 3 days ago",
+			[]string{"2026-06-17", "2026-06-18"},
+			"2026-06-21", 0, 2, 2,
+		},
+		{
+			"longest in the middle",
+			[]string{"2026-06-01", "2026-06-02", "2026-06-03", "2026-06-10"},
+			"2026-06-21", 0, 3, 4,
+		},
+	}
+	for _, tc := range tests {
+		t.Run(tc.name, func(t *testing.T) {
+			var events []Event
+			for _, d := range tc.dates {
+				events = append(events, ev(d, "m", "/p", 100, 80, 40))
+			}
+			a := Aggregate(events, tc.today)
+			if a.CurrentStreak != tc.wantCur {
+				t.Errorf("CurrentStreak = %d, want %d", a.CurrentStreak, tc.wantCur)
+			}
+			if a.LongestStreak != tc.wantLong {
+				t.Errorf("LongestStreak = %d, want %d", a.LongestStreak, tc.wantLong)
+			}
+			if a.ActiveDays != tc.wantActive {
+				t.Errorf("ActiveDays = %d, want %d", a.ActiveDays, tc.wantActive)
+			}
+		})
+	}
+}
+
+func TestAggregate_NoCacheSupport(t *testing.T) {
+	a := Aggregate([]Event{ev("2026-06-21", "m", "/p", 100, 80, 0)}, "2026-06-21")
+	if a.CacheSupported {
+		t.Error("CacheSupported = true, want false when no cached tokens seen")
+	}
+	if a.CacheHitRate != 0 {
+		t.Errorf("CacheHitRate = %v, want 0", a.CacheHitRate)
+	}
+}
+
+func TestAggregate_Trend(t *testing.T) {
+	a := Aggregate([]Event{
+		ev("2026-06-21", "m", "/p", 100, 80, 40),
+		ev("2026-06-19", "m", "/p", 100, 80, 40),
+		ev("2026-06-20", "m", "/p", 100, 80, 40),
+	}, "2026-06-21")
+	trend := a.Trend()
+	if len(trend) != 3 || trend[0].Date != "2026-06-19" || trend[2].Date != "2026-06-21" {
+		t.Errorf("Trend() not ascending: %+v", trend)
+	}
+}
diff --git a/internal/web/server.go b/internal/web/server.go
index d40d18f..b63ac1e 100644
--- a/internal/web/server.go
+++ b/internal/web/server.go
@@ -34,6 +34,7 @@ import (
 	"github.com/cnjack/jcode/internal/skills"
 	"github.com/cnjack/jcode/internal/telemetry"
 	"github.com/cnjack/jcode/internal/tools"
+	"github.com/cnjack/jcode/internal/usage"
 	utils "github.com/cnjack/jcode/internal/util"
 )
 
@@ -129,6 +130,13 @@ type Server struct {
 	// tokenUsage tracks per-call token totals for the agent runs, used for
 	// usage display (goal status, token updates).
 	tokenUsage *model.TokenUsage
+
+	// usageStore backs the global usage-statistics endpoint. nil falls back to
+	// usage.Default(); tests inject a temp-dir store.
+	usageStore *usage.Store
+
+	// breakdownFn computes the live context-window breakdown for the active task.
+	breakdownFn func() usage.ContextBreakdown
 }
 
 // ServerConfig holds the configuration for creating a new Server.
@@ -161,6 +169,7 @@ type ServerConfig struct {
 	EventHandler       handler.AgentEventHandler                                             // optional: handler for runner (e.g. NotifyingHandler)
 	NeedsSetup         bool                                                                  // true when no providers are configured (setup mode)
 	TokenUsage         *model.TokenUsage                                                     // optional: shared token tracker (created when nil)
+	ContextBreakdownFn func() usage.ContextBreakdown                                         // optional: live per-task context breakdown
 }
 
 // NewServer creates a new web server.
@@ -206,6 +215,7 @@ func NewServer(cfg *ServerConfig) *Server {
 		eventHandler:   eh,
 		needsSetup:     cfg.NeedsSetup,
 		tokenUsage:     cfg.TokenUsage,
+		breakdownFn:    cfg.ContextBreakdownFn,
 	}
 	if s.tokenUsage == nil {
 		s.tokenUsage = &model.TokenUsage{}
@@ -296,6 +306,8 @@ func (s *Server) Start(ctx context.Context) error {
 	mux.HandleFunc("POST /api/git/checkout", s.handleGitCheckout)
 	mux.HandleFunc("GET /api/tasks", s.handleListAllTasks)
 	mux.HandleFunc("PATCH /api/tasks/{id}", s.handleUpdateTask)
+	mux.HandleFunc("GET /api/usage/stats", s.handleUsageStats)
+	mux.HandleFunc("GET /api/tasks/{id}/stats", s.handleTaskStats)
 	mux.HandleFunc("GET /api/models", s.handleListModels)
 	mux.HandleFunc("POST /api/model", s.handleSwitchModel)
 	mux.HandleFunc("POST /api/mode", s.handleSwitchMode)
@@ -466,6 +478,7 @@ func (s *Server) handleHealth(w http.ResponseWriter, r *http.Request) {
 }
 
 func (s *Server) handleStatus(w http.ResponseWriter, r *http.Request) {
+	full := s.tokenUsage.GetFull()
 	writeJSON(w, http.StatusOK, map[string]any{
 		"running":    s.running.Load(),
 		"ws_clients": s.wsBroker.ClientCount(),
@@ -473,9 +486,33 @@ func (s *Server) handleStatus(w http.ResponseWriter, r *http.Request) {
 		"provider":   s.providerName,
 		"model":      s.modelName,
 		"mode":       s.mode,
+		// Live token snapshot so a client reconnecting between turns can render
+		// the context bar + cache hit rate without waiting for the next
+		// token_update WS event. total_tokens = current context occupancy.
+		"token": map[string]any{
+			"total_tokens":        s.tokenUsage.GetLastTotal(),
+			"prompt_tokens":       full.PromptTokens,
+			"completion_tokens":   full.CompletionTokens,
+			"cached_tokens":       full.CachedTokens,
+			"reasoning_tokens":    full.ReasoningTokens,
+			"cache_write_tokens":  full.CacheWriteTokens,
+			"call_count":          full.CallCount,
+			"cache_hit_rate":      s.tokenUsage.CacheHitRate(),
+			"cache_supported":     s.tokenUsage.CacheObserved(),
+			"model_context_limit": s.currentModelContextLimit(),
+		},
 	})
 }
 
+// currentModelContextLimit resolves the context window of the currently
+// selected model, or 0 if unknown.
+func (s *Server) currentModelContextLimit() int {
+	if s.registry == nil || s.cfg == nil {
+		return 0
+	}
+	return model.ResolveContextLimit(s.registry, s.cfg, s.providerName, s.modelName)
+}
+
 // handleWorkspace returns lightweight git workspace info (branch + dirty) for
 // the current project so the web UI can show the real branch name. Diff stats
 // are fetched separately via /api/diff. Empty branch = not a git repo.
diff --git a/internal/web/usage.go b/internal/web/usage.go
new file mode 100644
index 0000000..9af8b18
--- /dev/null
+++ b/internal/web/usage.go
@@ -0,0 +1,177 @@
+package web
+
+import (
+	"net/http"
+	"strconv"
+	"time"
+
+	"github.com/cnjack/jcode/internal/config"
+	"github.com/cnjack/jcode/internal/session"
+	"github.com/cnjack/jcode/internal/usage"
+)
+
+// usageHeatmapDays is the fixed lookback for the activity heatmap and streak
+// computation, independent of the (smaller) totals window.
+const usageHeatmapDays = 365
+
+// handleUsageStats returns aggregated global usage statistics. The ?days=N
+// query (default 30, also used for values outside 1..365) scopes the totals,
+// per-model/project breakdowns and the daily trend; the heatmap and streaks
+// always span the full lookback so they read consistently across range toggles.
+func (s *Server) handleUsageStats(w http.ResponseWriter, r *http.Request) {
+	days := 30
+	if v := r.URL.Query().Get("days"); v != "" {
+		if n, err := strconv.Atoi(v); err == nil && n > 0 && n <= usageHeatmapDays {
+			days = n
+		}
+	}
+
+	today := usage.Today()
+	now := time.Now()
+	heatSince := now.AddDate(0, 0, -(usageHeatmapDays - 1)).Format("2006-01-02")
+	windowSince := now.AddDate(0, 0, -(days - 1)).Format("2006-01-02")
+
+	store := s.usageStore
+	if store == nil {
+		store = usage.Default()
+	}
+	events, err := store.Load(heatSince)
+	if err != nil {
+		config.Logger().Printf("[usage] load failed: %v", err)
+		events = nil
+	}
+	full := usage.Aggregate(events, today) // heatmap + streaks over the full window
+
+	windowEvents := make([]usage.Event, 0, len(events))
+	for _, ev := range events {
+		if ev.Date >= windowSince {
+			windowEvents = append(windowEvents, ev)
+		}
+	}
+	win := usage.Aggregate(windowEvents, today) // totals scoped to the selected range
+
+	resp := map[string]any{
+		"range_days": days,
+		"totals": map[string]any{
+			"total_tokens":      win.Totals.Total,
+			"prompt_tokens":     win.Totals.Prompt,
+			"completion_tokens": win.Totals.Completion,
+			"cached_tokens":     win.Totals.Cached,
+			"reasoning_tokens":  win.Totals.Reasoning,
+			"calls":             win.Totals.Calls,
+			"turns":             win.Totals.Turns,
+			"sessions":          countSessions(windowSince),
+		},
+		"active_days":     win.ActiveDays,
+		"current_streak":  full.CurrentStreak,
+		"longest_streak":  full.LongestStreak,
+		"most_used_model": win.MostUsedModel,
+		"cache_hit_rate":  win.CacheHitRate,
+		"cache_supported": win.CacheSupported,
+		"heatmap":         full.Trend(),
+		"daily_trend":     win.Trend(),
+		"by_model":        win.ByModel,
+		"by_project":      win.ByProject,
+	}
+	writeJSON(w, http.StatusOK, resp)
+}
+
+// handleTaskStats returns per-task statistics. For the ACTIVE task (the current
+// recorder's session) it returns a live context-window breakdown plus the live
+// cache hit rate. For any other (historical) task it returns a token rollup +
+// aggregate hit rate derived from the event log, with is_active=false and no
+// breakdown (tool-schema sizes weren't persisted, so a breakdown isn't
+// meaningful after the fact).
+func (s *Server) handleTaskStats(w http.ResponseWriter, r *http.Request) {
+	id := r.PathValue("id")
+
+	s.mu.RLock()
+	activeUUID := ""
+	if s.recorder != nil {
+		activeUUID = s.recorder.UUID()
+	}
+	s.mu.RUnlock()
+
+	resp := map[string]any{"uuid": id}
+
+	if id != "" && id == activeUUID {
+		full := s.tokenUsage.GetFull()
+		last := s.tokenUsage.GetLastDetail()
+
+		var bd usage.ContextBreakdown
+		if s.breakdownFn != nil {
+			bd = s.breakdownFn()
+		}
+		bd.ContextLimit = s.currentModelContextLimit()
+		// Messages occupy whatever the last prompt held beyond the static
+		// assembly (system prompt + tools + MCP + skills).
+		if msg := last.PromptTokens - bd.StaticTotal(); msg > 0 {
+			bd.MessagesTokens = msg
+		}
+
+		resp["is_active"] = true
+		resp["context"] = bd
+		resp["cache_hit_rate"] = s.tokenUsage.CacheHitRate()
+		resp["cache_supported"] = s.tokenUsage.CacheObserved()
+		resp["tokens"] = map[string]any{
+			"total_tokens":      full.TotalTokens,
+			"prompt_tokens":     full.PromptTokens,
+			"completion_tokens": full.CompletionTokens,
+			"cached_tokens":     full.CachedTokens,
+			"reasoning_tokens":  full.ReasoningTokens,
+			"calls":             full.CallCount,
+		}
+		writeJSON(w, http.StatusOK, resp)
+		return
+	}
+
+	// Historical task: aggregate this session's events.
+	store := s.usageStore
+	if store == nil {
+		store = usage.Default()
+	}
+	events, _ := store.Load("")
+	sel := make([]usage.Event, 0)
+	for _, ev := range events {
+		if ev.Session == id {
+			sel = append(sel, ev)
+		}
+	}
+	agg := usage.Aggregate(sel, usage.Today())
+	resp["is_active"] = false
+	resp["cache_hit_rate"] = agg.CacheHitRate
+	resp["cache_supported"] = agg.CacheSupported
+	resp["tokens"] = map[string]any{
+		"total_tokens":      agg.Totals.Total,
+		"prompt_tokens":     agg.Totals.Prompt,
+		"completion_tokens": agg.Totals.Completion,
+		"cached_tokens":     agg.Totals.Cached,
+		"reasoning_tokens":  agg.Totals.Reasoning,
+		"calls":             agg.Totals.Calls,
+		"turns":             agg.Totals.Turns,
+	}
+	writeJSON(w, http.StatusOK, resp)
+}
+
+// countSessions counts sessions across all projects whose StartTime date prefix
+// (first 10 chars, compared as-is, no timezone conversion) is >= sinceDate
+// (YYYY-MM-DD). The session index owns the count; the usage log owns token metrics.
+func countSessions(sinceDate string) int {
+	all, err := session.ListAllSessions()
+	if err != nil {
+		return 0
+	}
+	n := 0
+	for _, metas := range all {
+		for _, m := range metas {
+			d := m.StartTime
+			if len(d) >= 10 {
+				d = d[:10]
+			}
+			if d >= sinceDate {
+				n++
+			}
+		}
+	}
+	return n
+}
diff --git a/internal/web/usage_test.go b/internal/web/usage_test.go
new file mode 100644
index 0000000..513a81c
--- /dev/null
+++ b/internal/web/usage_test.go
@@ -0,0 +1,212 @@
+package web
+
+import (
+	"encoding/json"
+	"net/http"
+	"net/http/httptest"
+	"path/filepath"
+	"testing"
+
+	"github.com/cnjack/jcode/internal/model"
+	"github.com/cnjack/jcode/internal/session"
+	"github.com/cnjack/jcode/internal/usage"
+)
+
+func TestUsageStatsEndpoint(t *testing.T) {
+	today := usage.Today()
+	// seedIndex (from tasks_test.go) points HOME at a temp dir AND writes a
+	// session index, so countSessions sees exactly one session.
+	seedIndex(t, map[string][]session.SessionMeta{
+		"/p": {{UUID: "u1", Project: "/p", StartTime: today + "T10:00:00Z"}},
+	})
+
+	store := usage.NewStore(filepath.Join(t.TempDir(), "events.jsonl"))
+	mustRecord(t, store, usage.Event{Date: today, Model: "glm-5.2", Project: "/p", Prompt: 1000, Cached: 800, Completion: 200, Total: 1200, Calls: 2})
+	mustRecord(t, store, usage.Event{Date: today, Model: "glm-5.2", Project: "/p", Prompt: 500, Cached: 500, Completion: 50, Total: 550, Calls: 1})
+
+	s := &Server{usageStore: store}
+	rec := httptest.NewRecorder()
+	s.handleUsageStats(rec, httptest.NewRequest(http.MethodGet, "/api/usage/stats?days=7", nil))
+	if rec.Code != http.StatusOK {
+		t.Fatalf("code=%d body=%q", rec.Code, rec.Body.String())
+	}
+
+	var resp struct {
+		RangeDays int `json:"range_days"`
+		Totals    struct {
+			TotalTokens int64 `json:"total_tokens"`
+			Turns       int64 `json:"turns"`
+			Sessions    int64 `json:"sessions"`
+		} `json:"totals"`
+		ActiveDays     int              `json:"active_days"`
+		CurrentStreak  int              `json:"current_streak"`
+		MostUsedModel  string           `json:"most_used_model"`
+		CacheHitRate   float64          `json:"cache_hit_rate"`
+		CacheSupported bool             `json:"cache_supported"`
+		Heatmap        []map[string]any `json:"heatmap"`
+		ByModel        []map[string]any `json:"by_model"`
+	}
+	if err := json.Unmarshal(rec.Body.Bytes(), &resp); err != nil {
+		t.Fatalf("bad json: %v body=%q", err, rec.Body.String())
+	}
+
+	if resp.RangeDays != 7 {
+		t.Errorf("range_days = %d, want 7", resp.RangeDays)
+	}
+	if resp.Totals.TotalTokens != 1750 {
+		t.Errorf("total_tokens = %d, want 1750", resp.Totals.TotalTokens)
+	}
+	if resp.Totals.Turns != 2 {
+		t.Errorf("turns = %d, want 2", resp.Totals.Turns)
+	}
+	if resp.Totals.Sessions != 1 {
+		t.Errorf("sessions = %d, want 1", resp.Totals.Sessions)
+	}
+	if resp.ActiveDays != 1 {
+		t.Errorf("active_days = %d, want 1", resp.ActiveDays)
+	}
+	if resp.CurrentStreak != 1 {
+		t.Errorf("current_streak = %d, want 1", resp.CurrentStreak)
+	}
+	if resp.MostUsedModel != "glm-5.2" {
+		t.Errorf("most_used_model = %q, want glm-5.2", resp.MostUsedModel)
+	}
+	// cached/prompt = 1300/1500 ≈ 0.8667
+	if resp.CacheHitRate < 0.86 || resp.CacheHitRate > 0.87 {
+		t.Errorf("cache_hit_rate = %v, want ~0.8667", resp.CacheHitRate)
+	}
+	if !resp.CacheSupported {
+		t.Error("cache_supported = false, want true")
+	}
+	if len(resp.Heatmap) != 1 {
+		t.Errorf("heatmap len = %d, want 1 active day", len(resp.Heatmap))
+	}
+	if len(resp.ByModel) != 1 || resp.ByModel[0]["name"] != "glm-5.2" {
+		t.Errorf("by_model = %+v, want one glm-5.2 entry", resp.ByModel)
+	}
+}
+
+func TestUsageStatsEmpty(t *testing.T) {
+	seedIndex(t, map[string][]session.SessionMeta{})
+	s := &Server{usageStore: usage.NewStore(filepath.Join(t.TempDir(), "events.jsonl"))}
+	rec := httptest.NewRecorder()
+	s.handleUsageStats(rec, httptest.NewRequest(http.MethodGet, "/api/usage/stats", nil))
+	if rec.Code != http.StatusOK {
+		t.Fatalf("code=%d", rec.Code)
+	}
+	var resp struct {
+		RangeDays int `json:"range_days"`
+		Totals    struct {
+			TotalTokens int64 `json:"total_tokens"`
+		} `json:"totals"`
+		CacheSupported bool `json:"cache_supported"`
+	}
+	if err := json.Unmarshal(rec.Body.Bytes(), &resp); err != nil {
+		t.Fatalf("bad json: %v", err)
+	}
+	if resp.RangeDays != 30 {
+		t.Errorf("default range_days = %d, want 30", resp.RangeDays)
+	}
+	if resp.Totals.TotalTokens != 0 || resp.CacheSupported {
+		t.Errorf("empty stats should be zero/unsupported, got %+v", resp)
+	}
+}
+
+func mustRecord(t *testing.T, s *usage.Store, ev usage.Event) {
+	t.Helper()
+	if err := s.Record(ev); err != nil {
+		t.Fatalf("Record: %v", err)
+	}
+}
+
+func TestTaskStatsActive(t *testing.T) {
+	seedIndex(t, map[string][]session.SessionMeta{})
+	rec, err := session.NewRecorder(t.TempDir(), "p", "glm-5.2")
+	if err != nil {
+		t.Fatalf("NewRecorder: %v", err)
+	}
+	tu := &model.TokenUsage{}
+	tu.Add(model.AddParams{Prompt: 1000, Completion: 200, Total: 1200, Cached: 800})
+
+	s := &Server{
+		recorder:   rec,
+		tokenUsage: tu,
+		breakdownFn: func() usage.ContextBreakdown {
+			return usage.ContextBreakdown{SystemPromptTokens: 100, SystemToolsTokens: 200, MCPToolsTokens: 50, SkillsTokens: 30}
+		},
+	}
+	rr := httptest.NewRecorder()
+	req := httptest.NewRequest(http.MethodGet, "/api/tasks/"+rec.UUID()+"/stats", nil)
+	req.SetPathValue("id", rec.UUID())
+	s.handleTaskStats(rr, req)
+	if rr.Code != http.StatusOK {
+		t.Fatalf("code=%d body=%q", rr.Code, rr.Body.String())
+	}
+	var resp struct {
+		IsActive bool `json:"is_active"`
+		Context  struct {
+			SystemPromptTokens int `json:"system_prompt_tokens"`
+			MessagesTokens     int `json:"messages_tokens"`
+		} `json:"context"`
+		CacheHitRate   float64 `json:"cache_hit_rate"`
+		CacheSupported bool    `json:"cache_supported"`
+	}
+	if err := json.Unmarshal(rr.Body.Bytes(), &resp); err != nil {
+		t.Fatalf("bad json: %v", err)
+	}
+	if !resp.IsActive {
+		t.Error("is_active = false, want true for the current session")
+	}
+	// messages = lastPrompt(1000) - static(100+200+50+30=380) = 620
+	if resp.Context.MessagesTokens != 620 {
+		t.Errorf("messages_tokens = %d, want 620", resp.Context.MessagesTokens)
+	}
+	if resp.Context.SystemPromptTokens != 100 {
+		t.Errorf("system_prompt_tokens = %d, want 100", resp.Context.SystemPromptTokens)
+	}
+	if resp.CacheHitRate != 0.8 || !resp.CacheSupported {
+		t.Errorf("cache = (%v, %v), want (0.8, true)", resp.CacheHitRate, resp.CacheSupported)
+	}
+}
+
+func TestTaskStatsHistorical(t *testing.T) {
+	today := usage.Today()
+	store := usage.NewStore(filepath.Join(t.TempDir(), "events.jsonl"))
+	mustRecord(t, store, usage.Event{Date: today, Session: "sess-A", Model: "m", Prompt: 1000, Cached: 700, Completion: 100, Total: 1100, Calls: 1})
+	mustRecord(t, store, usage.Event{Date: today, Session: "sess-A", Model: "m", Prompt: 500, Cached: 300, Completion: 50, Total: 550, Calls: 1})
+	mustRecord(t, store, usage.Event{Date: today, Session: "sess-B", Model: "m", Prompt: 999, Cached: 0, Completion: 9, Total: 1008, Calls: 1})
+
+	// No recorder → every query is treated as historical.
+	s := &Server{usageStore: store}
+	rr := httptest.NewRecorder()
+	req := httptest.NewRequest(http.MethodGet, "/api/tasks/sess-A/stats", nil)
+	req.SetPathValue("id", "sess-A")
+	s.handleTaskStats(rr, req)
+	if rr.Code != http.StatusOK {
+		t.Fatalf("code=%d", rr.Code)
+	}
+	var resp struct {
+		IsActive bool `json:"is_active"`
+		Tokens   struct {
+			TotalTokens int64 `json:"total_tokens"`
+			Turns       int64 `json:"turns"`
+		} `json:"tokens"`
+		CacheHitRate float64 `json:"cache_hit_rate"`
+	}
+	if err := json.Unmarshal(rr.Body.Bytes(), &resp); err != nil {
+		t.Fatalf("bad json: %v", err)
+	}
+	if resp.IsActive {
+		t.Error("is_active = true, want false")
+	}
+	if resp.Tokens.TotalTokens != 1650 {
+		t.Errorf("total_tokens = %d, want 1650 (only sess-A)", resp.Tokens.TotalTokens)
+	}
+	if resp.Tokens.Turns != 2 {
+		t.Errorf("turns = %d, want 2", resp.Tokens.Turns)
+	}
+	// cached/prompt = 1000/1500 ≈ 0.6667
+	if resp.CacheHitRate < 0.66 || resp.CacheHitRate > 0.67 {
+		t.Errorf("cache_hit_rate = %v, want ~0.6667", resp.CacheHitRate)
+	}
+}
diff --git a/web/src/components/ChatInput.vue b/web/src/components/ChatInput.vue
index 2ff34a9..69b8f4f 100644
--- a/web/src/components/ChatInput.vue
+++ b/web/src/components/ChatInput.vue
@@ -6,6 +6,7 @@ import { api } from '@/composables/api'
 import type { SlashCommandInfo, ChatImage } from '@/types/api'
 import WorkspacePicker from '@/components/WorkspacePicker.vue'
 import BranchPicker from '@/components/BranchPicker.vue'
+import ContextCapacityPopup from '@/components/ContextCapacityPopup.vue'
 import { HandRaisedIcon, ShieldExclamationIcon, ClipboardDocumentListIcon, BoltIcon, PlusIcon, PaperClipIcon, XMarkIcon, ChevronDownIcon, StopIcon, PaperAirplaneIcon, MagnifyingGlassIcon, SquaresPlusIcon, PhotoIcon, WrenchScrewdriverIcon, CheckIcon, StarIcon, SparklesIcon } from '@heroicons/vue/24/outline'
 import { StarIcon as StarIconSolid, CheckCircleIcon } from '@heroicons/vue/24/solid'
 
@@ -23,6 +24,16 @@ const textarea = ref<HTMLTextAreaElement | null>(null)
 const showModelPicker = ref(false)
 const showModePicker = ref(false)
 const showAddMenu = ref(false)
+const showContextPopup = ref(false)
+
+// Context-fill ring on the composer: the arc fills with the % of the context
+// window in use and turns red once usage reaches 90%.
+const ctxRingCirc = 2 * Math.PI * 6.4
+const ctxRingOffset = computed(() => {
+  const p = Math.min(100, Math.max(0, store.tokenPercentage))
+  return ctxRingCirc * (1 - p / 100)
+})
+const ctxRingColor = computed(() => (store.tokenPercentage >= 90 ? '#E24B4A' : 'var(--color-primary)'))
 const showManageModels = ref(false)
 const modelFilter = ref('')
 const containerRef = ref<HTMLDivElement | null>(null)
@@ -387,6 +398,7 @@ function handleClickOutside(e: MouseEvent) {
     showModePicker.value = false
     showAddMenu.value = false
     showSlashMenu.value = false
+    showContextPopup.value = false
     if (showManageModels.value) {
       showManageModels.value = false
       modelFilter.value = ''
@@ -412,6 +424,11 @@ function handleGlobalKey(e: KeyboardEvent) {
       showModelPicker.value = false
       return
     }
+    if (showContextPopup.value) {
+      e.preventDefault()
+      showContextPopup.value = false
+      return
+    }
   }
 }
 
@@ -444,6 +461,7 @@ watch(() => store.currentSessionId, () => {
   showModelPicker.value = false
   showModePicker.value = false
   showAddMenu.value = false
+  showContextPopup.value = false
   showManageModels.value = false
 })
 
@@ -627,9 +645,26 @@ watch(() => store.imageSupport, (supported) => {
           </div>
 
           <div class="toolbar-right">
-            <span v-if="store.tokenInfo" class="token-count">
-              {{ store.tokenInfo.total_tokens.toLocaleString() }} tokens
-            </span>
+            <div v-if="store.tokenInfo && store.tokenInfo.total_tokens > 0 && store.hasMessages" class="relative">
+              <button
+                type="button"
+                class="token-count token-count-btn ctx-trigger"
+                :title="t('contextCapacity.title')"
+                @click.stop="showContextPopup = !showContextPopup; showModelPicker = false; showModePicker = false; showAddMenu = false"
+              >
+                <svg class="ctx-ring" width="17" height="17" viewBox="0 0 16 16" aria-hidden="true">
+                  <circle cx="8" cy="8" r="6.4" fill="none" stroke="color-mix(in srgb, var(--color-foreground) 20%, transparent)" stroke-width="2.2" />
+                  <circle
+                    cx="8" cy="8" r="6.4" fill="none"
+                    :stroke="ctxRingColor" stroke-width="2.2" stroke-linecap="round"
+                    :stroke-dasharray="ctxRingCirc" :stroke-dashoffset="ctxRingOffset"
+                    transform="rotate(-90 8 8)"
+                  />
+                </svg>
+                <span class="tabular-nums">{{ store.tokenPercentage }}%</span>
+              </button>
+              <ContextCapacityPopup v-if="showContextPopup" class="absolute bottom-full right-0 mb-2 z-50" />
+            </div>
 
             <!-- Model selector (moved to the right, near Send). The trigger shows
                  the active provider's identity tile so the brand is readable at rest. -->
@@ -1966,6 +2001,32 @@ watch(() => store.imageSupport, (supported) => {
   font-family: var(--font-mono);
   color: var(--color-muted-foreground);
 }
+/* Clickable variant: opens the context-capacity popup. */
+.token-count-btn {
+  background: none;
+  border: none;
+  padding: 2px 5px;
+  border-radius: var(--radius-sm);
+  cursor: pointer;
+  transition: background var(--duration-fast), color var(--duration-fast);
+}
+.token-count-btn:hover {
+  background: var(--color-secondary);
+  color: var(--color-foreground);
+}
+/* Context-fill ring + percentage. */
+.ctx-trigger {
+  display: inline-flex;
+  align-items: center;
+  gap: 5px;
+}
+.ctx-ring {
+  display: block;
+  transition: stroke-dashoffset var(--duration-normal, 0.3s) ease;
+}
+.ctx-ring circle:last-child {
+  transition: stroke-dashoffset var(--duration-normal, 0.3s) ease;
+}
 
 /* Send & Stop buttons */
 .send-btn {
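Note on the ring geometry in `ChatInput.vue`: the sketch below shows how a fill percentage maps to `stroke-dashoffset`, assuming the same radius (6.4) and 90% threshold as the component. `ringOffset` and `ringColor` are hypothetical names used only for illustration; they are not functions in this patch.

```typescript
// Radius of the ring, matching the r="6.4" circles in the template.
const RADIUS = 6.4
const CIRCUMFERENCE = 2 * Math.PI * RADIUS

// ringOffset returns the stroke-dashoffset for a fill percentage.
// With stroke-dasharray set to the circumference, an offset equal to the
// circumference hides the arc (0%) and an offset of zero shows all of it (100%).
// Out-of-range percentages are clamped to [0, 100].
function ringOffset(percent: number): number {
  const p = Math.min(100, Math.max(0, percent))
  return CIRCUMFERENCE * (1 - p / 100)
}

// ringColor switches to the warning colour at 90% or above.
function ringColor(percent: number): string {
  return percent >= 90 ? '#E24B4A' : 'var(--color-primary)'
}
```

Clamping before the multiplication keeps the arc from wrapping when the reported percentage overshoots 100, for example if the last prompt exceeded the advertised limit.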
diff --git a/web/src/components/ContextCapacityPopup.vue b/web/src/components/ContextCapacityPopup.vue
new file mode 100644
index 0000000..f0aeb6e
--- /dev/null
+++ b/web/src/components/ContextCapacityPopup.vue
@@ -0,0 +1,164 @@
+<script setup lang="ts">
+import { computed, onMounted } from 'vue'
+import { useI18n } from 'vue-i18n'
+import { storeToRefs } from 'pinia'
+import { useChatStore } from '@/stores/chat'
+import { useUsageStore } from '@/stores/usage'
+
+const { t, locale } = useI18n()
+const chat = useChatStore()
+const usage = useUsageStore()
+const { taskStats, taskLoading } = storeToRefs(usage)
+
+onMounted(() => {
+  if (chat.currentSessionId) usage.fetchTaskStats(chat.currentSessionId)
+})
+
+function fmtCompact(n: number): string {
+  try {
+    return new Intl.NumberFormat(locale.value, { notation: 'compact', maximumFractionDigits: 1 }).format(n)
+  } catch {
+    return String(n)
+  }
+}
+function fmtPct(frac: number): string {
+  return `${(frac * 100).toFixed(1)}%`
+}
+
+// Bucket palette: an accent gradient for the model-driven parts, a neutral for
+// the conversation-independent system prompt. Keeps the theme's orange identity.
+const BUCKETS = [
+  { key: 'messages', field: 'messages_tokens', color: 'color-mix(in srgb, var(--color-primary) 90%, transparent)' },
+  { key: 'systemTools', field: 'system_tools_tokens', color: 'color-mix(in srgb, var(--color-primary) 60%, transparent)' },
+  { key: 'mcpTools', field: 'mcp_tools_tokens', color: 'color-mix(in srgb, var(--color-primary) 38%, transparent)' },
+  { key: 'skills', field: 'skills_tokens', color: 'color-mix(in srgb, var(--color-primary) 22%, transparent)' },
+  { key: 'systemPrompt', field: 'system_prompt_tokens', color: 'color-mix(in srgb, var(--color-foreground) 30%, transparent)' },
+] as const
+
+const FREE_COLOR = 'color-mix(in srgb, var(--color-foreground) 7%, transparent)'
+
+// model_context_limit comes live over the WS; fall back to the per-task fetch.
+const limit = computed(() => chat.tokenInfo?.model_context_limit ?? taskStats.value?.context?.context_limit ?? 0)
+const hasWindow = computed(() => limit.value > 0)
+
+const rawRows = computed(() => {
+  const ctx = taskStats.value?.context
+  if (!ctx) return []
+  return BUCKETS.map((b) => ({
+    key: b.key,
+    color: b.color,
+    tokens: (ctx as unknown as Record<string, number>)[b.field] ?? 0,
+  })).filter((r) => r.tokens > 0)
+})
+
+const usedTokens = computed(() => rawRows.value.reduce((s, r) => s + r.tokens, 0))
+const freeTokens = computed(() => Math.max(0, limit.value - usedTokens.value))
+
+// Fractions are of the FULL context window when the window is known (matching
+// the canonical "context window" view), otherwise of the used total so the bar
+// still reads on models with no published limit.
+const denom = computed(() => (hasWindow.value ? limit.value : usedTokens.value || 1))
+const rows = computed(() => rawRows.value.map((r) => ({ ...r, frac: r.tokens / denom.value })))
+const freeFrac = computed(() => (hasWindow.value ? freeTokens.value / denom.value : 0))
+const usedPct = computed(() => (hasWindow.value ? Math.round((usedTokens.value / limit.value) * 100) : 0))
+
+// Cumulative tokens this conversation has consumed (input+output across turns).
+const sessionTotal = computed(() => taskStats.value?.tokens?.total_tokens ?? 0)
+
+const cachePct = computed<number | null>(() => {
+  const live = chat.cacheHitPercentage
+  if (live != null) return live
+  const ts = taskStats.value
+  if (ts && ts.cache_supported) return Math.round(ts.cache_hit_rate * 100)
+  return null
+})
+</script>
+
+<template>
+  <div class="ctx-popup" @click.stop>
+    <div class="flex items-baseline justify-between gap-3">
+      <span class="text-[12px] font-semibold" style="color: var(--color-foreground)">{{ t('contextCapacity.title') }}</span>
+      <span class="text-[11px] tabular-nums" style="color: var(--color-muted-foreground)">
+        {{ fmtCompact(usedTokens) }}<span v-if="hasWindow"> / {{ fmtCompact(limit) }}</span>
+        <span v-if="usedPct"> · {{ usedPct }}%</span>
+      </span>
+    </div>
+
+    <!-- Stacked composition bar over the free-space track -->
+    <div v-if="rows.length" class="ctx-bar mt-2.5" :style="{ background: FREE_COLOR }">
+      <div
+        v-for="seg in rows"
+        :key="seg.key"
+        class="ctx-seg"
+        :style="{ width: `${seg.frac * 100}%`, background: seg.color }"
+        :title="`${t('contextCapacity.' + seg.key)} · ${fmtCompact(seg.tokens)} · ${fmtPct(seg.frac)}`"
+      />
+    </div>
+
+    <!-- Per-category rows: absolute tokens + share of the context window -->
+    <div v-if="rows.length" class="mt-2.5 space-y-1">
+      <div v-for="seg in rows" :key="seg.key" class="flex items-center gap-2 text-[11px]">
+        <span class="inline-block w-2.5 h-2.5 rounded-[3px] shrink-0" :style="{ background: seg.color }" />
+        <span class="flex-1 truncate" style="color: var(--color-foreground)">{{ t('contextCapacity.' + seg.key) }}</span>
+        <span class="tabular-nums" style="color: var(--color-muted-foreground)">{{ fmtCompact(seg.tokens) }}</span>
+        <span class="w-12 text-right tabular-nums" style="color: var(--color-muted-foreground)">{{ fmtPct(seg.frac) }}</span>
+      </div>
+      <!-- Free space, like the canonical context-window view -->
+      <div v-if="hasWindow" class="flex items-center gap-2 text-[11px]">
+        <span class="inline-block w-2.5 h-2.5 rounded-[3px] shrink-0" :style="{ background: FREE_COLOR }" />
+        <span class="flex-1" style="color: var(--color-muted-foreground)">{{ t('contextCapacity.freeSpace') }}</span>
+        <span class="tabular-nums" style="color: var(--color-muted-foreground)">{{ fmtCompact(freeTokens) }}</span>
+        <span class="w-12 text-right tabular-nums" style="color: var(--color-muted-foreground)">{{ fmtPct(freeFrac) }}</span>
+      </div>
+    </div>
+
+    <div v-else-if="taskLoading" class="mt-2 text-[11px]" style="color: var(--color-muted-foreground)">
+      {{ t('common.loading') }}
+    </div>
+
+    <div class="ctx-divider" />
+
+    <!-- Conversation-level totals -->
+    <div class="flex items-center justify-between text-[11px]">
+      <span style="color: var(--color-muted-foreground)">{{ t('contextCapacity.cacheHitRate') }}</span>
+      <span class="font-semibold tabular-nums" :style="{ color: cachePct != null ? 'var(--color-primary)' : 'var(--color-muted-foreground)' }">
+        {{ cachePct != null ? cachePct + '%' : '—' }}
+      </span>
+    </div>
+    <div v-if="sessionTotal > 0" class="flex items-center justify-between text-[11px] mt-1.5">
+      <span style="color: var(--color-muted-foreground)">{{ t('contextCapacity.sessionTotal') }}</span>
+      <span class="tabular-nums" style="color: var(--color-foreground)">{{ fmtCompact(sessionTotal) }}</span>
+    </div>
+
+    <div class="mt-2 text-[10px]" style="color: var(--color-muted-foreground)">{{ t('contextCapacity.estimated') }}</div>
+  </div>
+</template>
+
+<style scoped>
+.ctx-popup {
+  width: 290px;
+  padding: 12px 14px;
+  border-radius: var(--radius-md);
+  background: var(--color-background);
+  border: 1px solid var(--color-border);
+  box-shadow: var(--elevation-popover, 0 8px 24px rgba(0, 0, 0, 0.16));
+}
+.ctx-bar {
+  display: flex;
+  height: 10px;
+  width: 100%;
+  border-radius: 5px;
+  overflow: hidden;
+}
+.ctx-seg {
+  height: 100%;
+}
+.ctx-seg + .ctx-seg {
+  border-left: 1px solid var(--color-background);
+}
+.ctx-divider {
+  height: 1px;
+  background: var(--color-border);
+  margin: 10px 0;
+}
+</style>
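Note on the fraction logic in `ContextCapacityPopup.vue`: the sketch below restates the denominator choice as a plain function, assuming rows shaped like the component's `rawRows`. The `Row` interface and the `shares` function are hypothetical, introduced only to make the two cases (known vs. unknown context limit) easy to check in isolation.

```typescript
// Minimal row shape for illustration; the component's rows also carry a colour.
interface Row {
  key: string
  tokens: number
}

// shares computes each bucket's fraction of the bar.
// With a known limit, the denominator is the whole context window and the
// remainder is reported as free space. Without one, buckets are normalised
// against their own sum and no free space is reported.
function shares(rows: Row[], limit: number) {
  const used = rows.reduce((sum, r) => sum + r.tokens, 0)
  const hasWindow = limit > 0
  // `used || 1` avoids dividing by zero when there are no rows.
  const denom = hasWindow ? limit : used || 1
  const free = Math.max(0, limit - used)
  return {
    used,
    rows: rows.map((r) => ({ ...r, frac: r.tokens / denom })),
    freeFrac: hasWindow ? free / denom : 0,
  }
}
```

With the bucket sizes from `TestTaskStatsActive` (620 + 200 + 50 + 30 + 100 = 1000 tokens) and an assumed limit of 2000, messages take 620 / 2000 of the bar and half the window is free. If the buckets ever sum past the limit, the fractions sum past 1; this sketch, like the component, does not renormalise them.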
diff --git a/web/src/components/SettingsDialog.vue b/web/src/components/SettingsDialog.vue
index 7a2e19d..e5aecf9 100644
--- a/web/src/components/SettingsDialog.vue
+++ b/web/src/components/SettingsDialog.vue
@@ -38,8 +38,10 @@ import {
   ArrowLeftIcon,
   ChevronDownIcon,
   CheckIcon,
+  ChartBarIcon,
 } from '@heroicons/vue/24/outline'
 import { isTauri } from '@/composables/useDesktop'
+import UsageStatsPanel from '@/components/UsageStatsPanel.vue'
 import { useI18n } from 'vue-i18n'
 import { SUPPORTED_LOCALES, LOCALE_LABELS, setLocale, i18n, type SupportedLocale } from '@/i18n'
 
@@ -98,7 +100,7 @@ function connectToAlias(alias: SSHAlias) {
 const { themeChoice, setTheme, themes } = useTheme()
 const darkThemes = computed(() => themes.filter((t) => t.appearance === 'dark'))
 const lightThemes = computed(() => themes.filter((t) => t.appearance === 'light'))
-const activeTab = ref<'general' | 'appearance' | 'providers' | 'mcp' | 'skills' | 'ssh' | 'channels' | 'shortcuts'>('general')
+const activeTab = ref<'general' | 'appearance' | 'providers' | 'mcp' | 'skills' | 'ssh' | 'channels' | 'shortcuts' | 'usage'>('general')
 const mcpServers = ref<Record<string, MCPServerInfo>>({})
 const sshAliases = ref<SSHAlias[]>([])
 const sshCurrent = ref('local')
@@ -571,6 +573,7 @@ const tabLabel = computed<Record<string, string>>(() => ({
   ssh: t('settings.tabs.ssh'),
   channels: t('settings.tabs.channels'),
   shortcuts: t('settings.tabs.shortcuts'),
+  usage: t('settings.tabs.usage'),
 }))
 
 // Nav-rail + empty-state icons. One heroicons component per section (was a
@@ -585,6 +588,7 @@ const iconFor: Record<string, Component> = {
   ssh: CommandLineIcon,
   channels: BellAlertIcon,
   shortcuts: ComputerDesktopIcon,
+  usage: ChartBarIcon,
 }
 
 
@@ -695,7 +699,7 @@ const addProviderInfo = () => addProviderList.value.find(p => p.id === addSelect
             <nav class="settings-rail shrink-0 flex flex-col">
               <div class="flex flex-col gap-0.5">
                 <button
-                  v-for="tab in (['general', 'appearance', 'providers', 'mcp', 'skills', 'ssh', 'channels', 'shortcuts'] as const)"
+                  v-for="tab in (['general', 'appearance', 'providers', 'mcp', 'skills', 'ssh', 'channels', 'shortcuts', 'usage'] as const)"
                   :key="tab"
                   class="group relative w-full flex items-center gap-2.5 h-8 pl-2.5 pr-2 text-left text-[13px] cursor-pointer transition-colors duration-[var(--duration-fast)] hover:bg-[var(--color-secondary)]"
                   :style="activeTab === tab
@@ -1488,6 +1492,9 @@ const addProviderInfo = () => addProviderList.value.find(p => p.id === addSelect
                     </div>
                   </div>
                 </div>
+
+                <!-- Usage statistics (lazily rendered: the panel fetches on mount) -->
+                <UsageStatsPanel v-if="activeTab === 'usage'" />
                 </div>
               </div>
             </div>
diff --git a/web/src/components/UsageStatsPanel.vue b/web/src/components/UsageStatsPanel.vue
new file mode 100644
index 0000000..02cbc5e
--- /dev/null
+++ b/web/src/components/UsageStatsPanel.vue
@@ -0,0 +1,398 @@
+<script setup lang="ts">
+import { computed, onMounted } from 'vue'
+import { useI18n } from 'vue-i18n'
+import { storeToRefs } from 'pinia'
+import { useUsageStore } from '@/stores/usage'
+import type { UsageDayBucket } from '@/types/api'
+
+const { t, locale } = useI18n()
+const usage = useUsageStore()
+const { stats, loading, error, rangeDays } = storeToRefs(usage)
+
+onMounted(() => {
+  if (!stats.value) usage.fetchStats()
+})
+
+// --- formatting --------------------------------------------------------------
+
+function fmtCompact(n: number): string {
+  try {
+    return new Intl.NumberFormat(locale.value, { notation: 'compact', maximumFractionDigits: 1 }).format(n)
+  } catch {
+    return String(n)
+  }
+}
+function fmtFull(n: number): string {
+  try {
+    return new Intl.NumberFormat(locale.value).format(n)
+  } catch {
+    return String(n)
+  }
+}
+function fmtPct(frac: number): string {
+  return `${Math.round(frac * 100)}%`
+}
+
+// Parse a local YYYY-MM-DD without UTC shifting.
+function parseLocal(s: string): Date {
+  const [y = 1970, m = 1, d = 1] = s.split('-').map(Number)
+  return new Date(y, m - 1, d)
+}
+function toKey(d: Date): string {
+  const m = String(d.getMonth() + 1).padStart(2, '0')
+  const day = String(d.getDate()).padStart(2, '0')
+  return `${d.getFullYear()}-${m}-${day}`
+}
+function fmtDayLabel(s: string): string {
+  try {
+    return new Intl.DateTimeFormat(locale.value, { month: 'short', day: 'numeric' }).format(parseLocal(s))
+  } catch {
+    return s
+  }
+}
+
+// --- derived cards -----------------------------------------------------------
+
+const cacheLabel = computed(() => {
+  const s = stats.value
+  if (!s || !s.cache_supported) return '—'
+  return fmtPct(s.cache_hit_rate)
+})
+const modelShare = computed(() => {
+  const s = stats.value
+  if (!s || !s.by_model.length) return null
+  return s.by_model[0]?.share ?? null
+})
+
+// --- heatmap (53 weeks x 7 days, ending in the current week) -----------------
+
+const HEAT_WEEKS = 53
+const CELL = 11
+const GAP = 3
+const STEP = CELL + GAP
+
+interface HeatCell {
+  date: string
+  tokens: number
+  turns: number
+  level: number // 0-4
+  future: boolean
+}
+
+const heatmap = computed(() => {
+  const buckets = new Map<string, UsageDayBucket>()
+  let max = 0
+  for (const b of stats.value?.heatmap ?? []) {
+    buckets.set(b.date, b)
+    if (b.tokens > max) max = b.tokens
+  }
+  const today = new Date()
+  today.setHours(0, 0, 0, 0)
+  // Sunday of the current week is the top of the final column.
+  const lastSunday = new Date(today)
+  lastSunday.setDate(today.getDate() - today.getDay())
+  const start = new Date(lastSunday)
+  start.setDate(lastSunday.getDate() - (HEAT_WEEKS - 1) * 7)
+
+  const cols: HeatCell[][] = []
+  for (let w = 0; w < HEAT_WEEKS; w++) {
+    const col: HeatCell[] = []
+    for (let d = 0; d < 7; d++) {
+      const cur = new Date(start)
+      cur.setDate(start.getDate() + w * 7 + d)
+      const key = toKey(cur)
+      const b = buckets.get(key)
+      const tokens = b?.tokens ?? 0
+      col.push({
+        date: key,
+        tokens,
+        turns: b?.turns ?? 0,
+        level: levelFor(tokens, max),
+        future: cur.getTime() > today.getTime(),
+      })
+    }
+    cols.push(col)
+  }
+  return cols
+})
+
+function levelFor(tokens: number, max: number): number {
+  if (tokens <= 0 || max <= 0) return 0
+  // Log scale so a few huge days don't flatten everything else.
+  const r = Math.log(tokens + 1) / Math.log(max + 1)
+  return Math.min(4, Math.max(1, Math.ceil(r * 4)))
+}
+
+const HEAT_FILL = [
+  'color-mix(in srgb, var(--color-foreground) 7%, transparent)',
+  'color-mix(in srgb, var(--color-primary) 28%, transparent)',
+  'color-mix(in srgb, var(--color-primary) 48%, transparent)',
+  'color-mix(in srgb, var(--color-primary) 72%, transparent)',
+  'var(--color-primary)',
+]
+const heatWidth = HEAT_WEEKS * STEP - GAP
+const heatHeight = 7 * STEP - GAP
+
+function fillFor(c: HeatCell): string {
+  if (c.future) return 'transparent'
+  return HEAT_FILL[c.level] ?? 'transparent'
+}
+
+function cellTitle(c: HeatCell): string {
+  if (c.future) return ''
+  if (c.tokens <= 0) return `${fmtDayLabel(c.date)} · ${t('settings.usageStats.noActivity')}`
+  return `${fmtDayLabel(c.date)} · ${fmtCompact(c.tokens)} tokens · ${c.turns} ${t('settings.usageStats.turnsUnit')}`
+}
+
+// --- daily trend -------------------------------------------------------------
+
+const trend = computed(() => stats.value?.daily_trend ?? [])
+const trendMax = computed(() => Math.max(1, ...trend.value.map((d) => d.tokens)))
+function barTitle(d: UsageDayBucket): string {
+  return `${fmtDayLabel(d.date)} · ${fmtCompact(d.tokens)} tokens · ${d.turns} ${t('settings.usageStats.turnsUnit')}`
+}
+
+function setRange(days: number) {
+  if (rangeDays.value === days && stats.value) return
+  usage.fetchStats(days)
+}
+function shortName(path: string): string {
+  const parts = path.replace(/\/+$/, '').split('/')
+  return parts[parts.length - 1] || path
+}
+</script>
+
+<template>
+  <div class="space-y-5">
+    <div class="flex items-center justify-between gap-3 flex-wrap">
+      <div>
+        <h3 class="text-[13px] font-semibold tracking-tight" style="color: var(--color-foreground)">
+          {{ t('settings.usageStats.title') }}
+        </h3>
+        <p class="text-[11px] mt-0.5" style="color: var(--color-muted-foreground)">
+          {{ t('settings.usageStats.subtitle') }}
+        </p>
+      </div>
+      <!-- Range toggle -->
+      <div class="inline-flex p-0.5 rounded-md" style="background: var(--color-secondary)">
+        <button
+          v-for="d in [7, 30]"
+          :key="d"
+          class="px-2.5 h-7 text-[12px] rounded transition-colors cursor-pointer"
+          :style="rangeDays === d
+            ? { background: 'var(--color-background)', color: 'var(--color-foreground)', fontWeight: '500' }
+            : { background: 'transparent', color: 'var(--color-muted-foreground)' }"
+          @click="setRange(d)"
+        >
+          {{ t('settings.usageStats.lastNDays', { n: d }) }}
+        </button>
+      </div>
+    </div>
+
+    <div v-if="error" class="text-[12px] px-3 py-2 rounded-md" style="background: var(--color-warning-bg); color: var(--color-warning-fg)">
+      {{ error }}
+    </div>
+
+    <div v-else-if="loading && !stats" class="text-[12px] py-8 text-center" style="color: var(--color-muted-foreground)">
+      {{ t('common.loading') }}
+    </div>
+
+    <template v-else-if="stats">
+      <!-- Stat cards -->
+      <div class="grid grid-cols-2 sm:grid-cols-3 gap-2.5">
+        <div class="us-card us-card-lg">
+          <div class="us-label">{{ t('settings.usageStats.totalTokens') }}</div>
+          <div class="us-value" :title="fmtFull(stats.totals.total_tokens)">{{ fmtCompact(stats.totals.total_tokens) }}</div>
+        </div>
+        <div class="us-card us-card-lg">
+          <div class="us-label">{{ t('settings.usageStats.cacheHitRate') }}</div>
+          <div class="us-value" style="color: var(--color-primary)">{{ cacheLabel }}</div>
+        </div>
+        <div class="us-card us-card-lg">
+          <div class="us-label">{{ t('settings.usageStats.mostUsedModel') }}</div>
+          <div class="us-value us-value-sm" :title="stats.most_used_model">{{ stats.most_used_model || '—' }}</div>
+          <div v-if="modelShare != null" class="us-sub">{{ t('settings.usageStats.share', { pct: fmtPct(modelShare) }) }}</div>
+        </div>
+        <div class="us-card">
+          <div class="us-label">{{ t('settings.usageStats.sessions') }}</div>
+          <div class="us-value">{{ fmtFull(stats.totals.sessions) }}</div>
+        </div>
+        <div class="us-card">
+          <div class="us-label">{{ t('settings.usageStats.turns') }}</div>
+          <div class="us-value">{{ fmtFull(stats.totals.turns) }}</div>
+        </div>
+        <div class="us-card">
+          <div class="us-label">{{ t('settings.usageStats.activeDays') }}</div>
+          <div class="us-value">{{ stats.active_days }}</div>
+          <div class="us-sub">{{ t('settings.usageStats.streak', { n: stats.current_streak }) }}</div>
+        </div>
+      </div>
+
+      <!-- Token composition strip -->
+      <div class="us-panel">
+        <div class="us-panel-title">{{ t('settings.usageStats.tokenBreakdown') }}</div>
+        <div class="grid grid-cols-2 sm:grid-cols-4 gap-2.5 mt-2">
+          <div>
+            <div class="us-mini-label">{{ t('settings.usageStats.promptTokens') }}</div>
+            <div class="us-mini-value">{{ fmtCompact(stats.totals.prompt_tokens) }}</div>
+          </div>
+          <div>
+            <div class="us-mini-label">{{ t('settings.usageStats.cachedTokens') }}</div>
+            <div class="us-mini-value">{{ fmtCompact(stats.totals.cached_tokens) }}</div>
+          </div>
+          <div>
+            <div class="us-mini-label">{{ t('settings.usageStats.completionTokens') }}</div>
+            <div class="us-mini-value">{{ fmtCompact(stats.totals.completion_tokens) }}</div>
+          </div>
+          <div>
+            <div class="us-mini-label">{{ t('settings.usageStats.reasoningTokens') }}</div>
+            <div class="us-mini-value">{{ fmtCompact(stats.totals.reasoning_tokens) }}</div>
+          </div>
+        </div>
+      </div>
+
+      <!-- Activity heatmap -->
+      <div class="us-panel">
+        <div class="flex items-center justify-between">
+          <div class="us-panel-title">{{ t('settings.usageStats.heatmap') }}</div>
+          <div class="flex items-center gap-1 text-[10px]" style="color: var(--color-muted-foreground)">
+            <span>{{ t('settings.usageStats.less') }}</span>
+            <span v-for="(f, i) in HEAT_FILL" :key="i" class="inline-block rounded-[2px]" :style="{ width: '10px', height: '10px', background: f }" />
+            <span>{{ t('settings.usageStats.more') }}</span>
+          </div>
+        </div>
+        <div class="overflow-x-auto mt-2 pb-1">
+          <svg :viewBox="`0 0 ${heatWidth} ${heatHeight}`" :width="heatWidth" :height="heatHeight" role="img" :aria-label="t('settings.usageStats.heatmap')">
+            <template v-for="(col, w) in heatmap" :key="w">
+              <rect
+                v-for="(c, d) in col"
+                :key="d"
+                :x="w * STEP"
+                :y="d * STEP"
+                :width="CELL"
+                :height="CELL"
+                rx="2"
+                :fill="fillFor(c)"
+              >
+                <title>{{ cellTitle(c) }}</title>
+              </rect>
+            </template>
+          </svg>
+        </div>
+      </div>
+
+      <!-- Daily trend -->
+      <div class="us-panel">
+        <div class="us-panel-title">{{ t('settings.usageStats.dailyTrend') }}</div>
+        <div v-if="trend.length" class="flex items-end gap-[3px] mt-3" style="height: 120px">
+          <div
+            v-for="d in trend"
+            :key="d.date"
+            class="flex-1 rounded-t-[2px] transition-[height] min-w-[2px]"
+            :style="{ height: `${Math.max(2, (d.tokens / trendMax) * 100)}%`, background: 'var(--accent-fill)' }"
+            :title="barTitle(d)"
+          />
+        </div>
+        <div v-else class="text-[11px] py-6 text-center" style="color: var(--color-muted-foreground)">
+          {{ t('settings.usageStats.noData') }}
+        </div>
+      </div>
+
+      <!-- Breakdown bars -->
+      <div class="grid grid-cols-1 sm:grid-cols-2 gap-3">
+        <div class="us-panel">
+          <div class="us-panel-title">{{ t('settings.usageStats.byModel') }}</div>
+          <div class="space-y-2 mt-2">
+            <div v-for="m in stats.by_model.slice(0, 6)" :key="m.name" class="us-bar-row">
+              <div class="flex items-center justify-between text-[11px] mb-1">
+                <span class="truncate" style="color: var(--color-foreground)">{{ m.name }}</span>
+                <span style="color: var(--color-muted-foreground)">{{ fmtCompact(m.tokens) }}</span>
+              </div>
+              <div class="us-bar-track"><div class="us-bar-fill" :style="{ width: fmtPct(m.share) }" /></div>
+            </div>
+            <div v-if="!stats.by_model.length" class="us-empty">{{ t('settings.usageStats.noData') }}</div>
+          </div>
+        </div>
+        <div class="us-panel">
+          <div class="us-panel-title">{{ t('settings.usageStats.byProject') }}</div>
+          <div class="space-y-2 mt-2">
+            <div v-for="p in stats.by_project.slice(0, 6)" :key="p.name" class="us-bar-row">
+              <div class="flex items-center justify-between text-[11px] mb-1">
+                <span class="truncate" style="color: var(--color-foreground)" :title="p.name">{{ shortName(p.name) }}</span>
+                <span style="color: var(--color-muted-foreground)">{{ fmtCompact(p.tokens) }}</span>
+              </div>
+              <div class="us-bar-track"><div class="us-bar-fill" :style="{ width: fmtPct(p.share) }" /></div>
+            </div>
+            <div v-if="!stats.by_project.length" class="us-empty">{{ t('settings.usageStats.noData') }}</div>
+          </div>
+        </div>
+      </div>
+    </template>
+  </div>
+</template>
+
+<style scoped>
+.us-card {
+  border-radius: var(--radius-md);
+  background: var(--color-secondary);
+  padding: 12px 14px;
+}
+.us-label {
+  font-size: 11px;
+  color: var(--color-muted-foreground);
+  margin-bottom: 4px;
+}
+.us-value {
+  font-size: 24px;
+  font-weight: 600;
+  line-height: 1.1;
+  color: var(--color-foreground);
+}
+.us-value-sm {
+  font-size: 16px;
+  overflow: hidden;
+  text-overflow: ellipsis;
+  white-space: nowrap;
+}
+.us-sub {
+  font-size: 10px;
+  color: var(--color-muted-foreground);
+  margin-top: 4px;
+}
+.us-panel {
+  border-radius: var(--radius-md);
+  background: var(--color-secondary);
+  padding: 14px;
+}
+.us-panel-title {
+  font-size: 12px;
+  font-weight: 600;
+  color: var(--color-foreground);
+}
+.us-mini-label {
+  font-size: 10px;
+  color: var(--color-muted-foreground);
+}
+.us-mini-value {
+  font-size: 15px;
+  font-weight: 600;
+  color: var(--color-foreground);
+  margin-top: 2px;
+}
+.us-bar-track {
+  height: 6px;
+  border-radius: 3px;
+  background: color-mix(in srgb, var(--color-foreground) 8%, transparent);
+  overflow: hidden;
+}
+.us-bar-fill {
+  height: 100%;
+  border-radius: 3px;
+  background: var(--color-primary);
+}
+.us-empty {
+  font-size: 11px;
+  color: var(--color-muted-foreground);
+  padding: 8px 0;
+}
+</style>
diff --git a/web/src/composables/api.ts b/web/src/composables/api.ts
index cafbd1a..cef40c8 100644
--- a/web/src/composables/api.ts
+++ b/web/src/composables/api.ts
@@ -1,5 +1,5 @@
 // API client for jcode backend
-import type { ModelsResponse, AgentMode, ExecResponse, DiffResponse, WorkspaceInfo, GitBranchesResponse, TaskItem, TaskMetaPatch, MCPListResponse, MCPServerRequest, MCPLoginStatus, BrowseResponse, SSHListResponse, SkillInfo, SlashCommandInfo, TodoItem, Goal, SessionItem, SessionEntry, FileItem, SetupProvider, SetupModel, ProviderDetail, ModelStateResponse, ChatImage, AskUserAnswer, AskUserRequestData, ApprovalRequestData, RemoteConnectRequest, RemoteConnectResponse, RemoteListDirResponse, RemoteBindResponse } from '@/types/api'
+import type { ModelsResponse, AgentMode, ExecResponse, DiffResponse, WorkspaceInfo, GitBranchesResponse, TaskItem, TaskMetaPatch, MCPListResponse, MCPServerRequest, MCPLoginStatus, BrowseResponse, SSHListResponse, SkillInfo, SlashCommandInfo, TodoItem, Goal, SessionItem, SessionEntry, FileItem, SetupProvider, SetupModel, ProviderDetail, ModelStateResponse, ChatImage, AskUserAnswer, AskUserRequestData, ApprovalRequestData, RemoteConnectRequest, RemoteConnectResponse, RemoteListDirResponse, RemoteBindResponse, UsageStats, TaskStats, TokenUpdateData } from '@/types/api'
 import { apiBase } from './apiBase'
 
 async function request<T>(path: string, options?: RequestInit): Promise<T> {
@@ -35,8 +35,11 @@ export const api = {
       provider: string
       model: string
       mode: string
+      token?: TokenUpdateData
     }>('/api/status'),
   config: () => request<{ provider: string; model: string; max_iterations: number }>('/api/config'),
+  usageStats: (days = 30) => request<UsageStats>(`/api/usage/stats?days=${days}`),
+  taskStats: (id: string) => request<TaskStats>(`/api/tasks/${encodeURIComponent(id)}/stats`),
   todos: () => request<TodoItem[]>('/api/todos'),
   goal: () => request<Goal | null>('/api/goal'),
   setGoal: (objective: string, start = true) =>
diff --git a/web/src/i18n/locales/en.ts b/web/src/i18n/locales/en.ts
index 01172de..5fd6068 100644
--- a/web/src/i18n/locales/en.ts
+++ b/web/src/i18n/locales/en.ts
@@ -214,6 +214,7 @@ export default {
       ssh: 'SSH',
       channels: 'Channels',
       shortcuts: 'Shortcuts',
+      usage: 'Usage',
     },
     general: {
       serverOnline: 'Online',
@@ -360,6 +361,46 @@ export default {
         toggleTerminal: 'Toggle terminal',
       },
     },
+    usageStats: {
+      title: 'Usage Statistics',
+      subtitle: 'Token usage, sessions and cache efficiency across all projects.',
+      lastNDays: 'Last {n} days',
+      totalTokens: 'Tokens used',
+      cacheHitRate: 'Cache hit rate',
+      mostUsedModel: 'Top model',
+      share: '{pct} of tokens',
+      sessions: 'Sessions',
+      turns: 'Turns',
+      activeDays: 'Active days',
+      streak: '{n}-day streak',
+      tokenBreakdown: 'Token breakdown',
+      promptTokens: 'Input',
+      cachedTokens: 'Cached',
+      completionTokens: 'Output',
+      reasoningTokens: 'Reasoning',
+      heatmap: 'Activity heatmap',
+      less: 'Less',
+      more: 'More',
+      dailyTrend: 'Daily tokens',
+      byModel: 'By model',
+      byProject: 'By project',
+      noData: 'No data yet',
+      noActivity: 'No activity',
+      turnsUnit: 'turns',
+    },
+  },
+
+  contextCapacity: {
+    title: 'Context capacity',
+    messages: 'Messages',
+    systemTools: 'System tools',
+    mcpTools: 'MCP tools',
+    skills: 'Skills',
+    systemPrompt: 'System prompt',
+    cacheHitRate: 'Cache hit rate',
+    freeSpace: 'Free space',
+    sessionTotal: 'Conversation total',
+    estimated: 'Breakdown is estimated (~4 bytes/token).',
   },
 
   setup: {
diff --git a/web/src/i18n/locales/ja.ts b/web/src/i18n/locales/ja.ts
index 7fe32d0..88b57aa 100644
--- a/web/src/i18n/locales/ja.ts
+++ b/web/src/i18n/locales/ja.ts
@@ -203,6 +203,7 @@ export default {
       ssh: 'SSH',
       channels: 'チャンネル',
       shortcuts: 'ショートカット',
+      usage: '使用状況',
     },
     general: {
       serverOnline: 'オンライン',
@@ -349,6 +350,46 @@ export default {
         toggleTerminal: 'ターミナルを切り替え',
       },
     },
+    usageStats: {
+      title: '使用状況',
+      subtitle: 'すべてのプロジェクトのトークン使用量・セッション・キャッシュ効率。',
+      lastNDays: '過去 {n} 日',
+      totalTokens: 'トークン使用量',
+      cacheHitRate: 'キャッシュ率',
+      mostUsedModel: '最多モデル',
+      share: '割合 {pct}',
+      sessions: 'セッション',
+      turns: 'ターン',
+      activeDays: 'アクティブ日数',
+      streak: '連続 {n} 日',
+      tokenBreakdown: 'トークン内訳',
+      promptTokens: '入力',
+      cachedTokens: 'キャッシュ',
+      completionTokens: '出力',
+      reasoningTokens: '推論',
+      heatmap: 'アクティビティ',
+      less: '少',
+      more: '多',
+      dailyTrend: '日別トークン',
+      byModel: 'モデル別',
+      byProject: 'プロジェクト別',
+      noData: 'データなし',
+      noActivity: 'アクティビティなし',
+      turnsUnit: 'ターン',
+    },
+  },
+
+  contextCapacity: {
+    title: 'コンテキスト容量',
+    messages: 'メッセージ',
+    systemTools: 'システムツール',
+    mcpTools: 'MCP ツール',
+    skills: 'スキル',
+    systemPrompt: 'システムプロンプト',
+    cacheHitRate: 'キャッシュ率',
+    freeSpace: '空き容量',
+    sessionTotal: '会話の累計',
+    estimated: '内訳は概算です(約 4 バイト/トークン)。',
   },
 
   setup: {
diff --git a/web/src/i18n/locales/ko.ts b/web/src/i18n/locales/ko.ts
index e4680b5..3626ad1 100644
--- a/web/src/i18n/locales/ko.ts
+++ b/web/src/i18n/locales/ko.ts
@@ -203,6 +203,7 @@ export default {
       ssh: 'SSH',
       channels: '채널',
       shortcuts: '단축키',
+      usage: '사용 통계',
     },
     general: {
       serverOnline: '온라인',
@@ -349,6 +350,46 @@ export default {
         toggleTerminal: '터미널 전환',
       },
     },
+    usageStats: {
+      title: '사용 통계',
+      subtitle: '모든 프로젝트의 토큰 사용량, 세션, 캐시 효율.',
+      lastNDays: '최근 {n}일',
+      totalTokens: '토큰 사용량',
+      cacheHitRate: '캐시 적중률',
+      mostUsedModel: '최다 모델',
+      share: '비중 {pct}',
+      sessions: '세션',
+      turns: '턴',
+      activeDays: '활동 일수',
+      streak: '{n}일 연속',
+      tokenBreakdown: '토큰 구성',
+      promptTokens: '입력',
+      cachedTokens: '캐시',
+      completionTokens: '출력',
+      reasoningTokens: '추론',
+      heatmap: '활동 히트맵',
+      less: '적음',
+      more: '많음',
+      dailyTrend: '일별 토큰',
+      byModel: '모델별',
+      byProject: '프로젝트별',
+      noData: '데이터 없음',
+      noActivity: '활동 없음',
+      turnsUnit: '턴',
+    },
+  },
+
+  contextCapacity: {
+    title: '컨텍스트 용량',
+    messages: '메시지',
+    systemTools: '시스템 도구',
+    mcpTools: 'MCP 도구',
+    skills: '스킬',
+    systemPrompt: '시스템 프롬프트',
+    cacheHitRate: '캐시 적중률',
+    freeSpace: '여유 공간',
+    sessionTotal: '대화 누적',
+    estimated: '구성은 추정치입니다(약 4바이트/토큰).',
   },
 
   setup: {
diff --git a/web/src/i18n/locales/zh-Hans.ts b/web/src/i18n/locales/zh-Hans.ts
index d9d1acf..92859b8 100644
--- a/web/src/i18n/locales/zh-Hans.ts
+++ b/web/src/i18n/locales/zh-Hans.ts
@@ -203,6 +203,7 @@ export default {
       ssh: 'SSH',
       channels: '渠道',
       shortcuts: '快捷键',
+      usage: '使用统计',
     },
     general: {
       serverOnline: '在线',
@@ -349,6 +350,46 @@ export default {
         toggleTerminal: '切换终端',
       },
     },
+    usageStats: {
+      title: '使用统计',
+      subtitle: '所有项目的 token 用量、会话与缓存效率。',
+      lastNDays: '最近 {n} 天',
+      totalTokens: 'Token 用量',
+      cacheHitRate: '缓存命中率',
+      mostUsedModel: '最常用模型',
+      share: '占比 {pct}',
+      sessions: '会话数量',
+      turns: '对话轮次',
+      activeDays: '活跃天数',
+      streak: '连续 {n} 天',
+      tokenBreakdown: 'Token 构成',
+      promptTokens: '输入',
+      cachedTokens: '缓存',
+      completionTokens: '输出',
+      reasoningTokens: '推理',
+      heatmap: '活跃热力图',
+      less: '较少',
+      more: '较多',
+      dailyTrend: '按天 Token',
+      byModel: '按模型',
+      byProject: '按项目',
+      noData: '暂无数据',
+      noActivity: '无活动',
+      turnsUnit: '轮',
+    },
+  },
+
+  contextCapacity: {
+    title: '上下文容量',
+    messages: '消息',
+    systemTools: '系统工具',
+    mcpTools: 'MCP 工具',
+    skills: '技能',
+    systemPrompt: '系统提示词',
+    cacheHitRate: '缓存命中率',
+    freeSpace: '剩余空间',
+    sessionTotal: '本会话累计',
+    estimated: '构成为估算值(约 4 字节/token)。',
   },
 
   setup: {
diff --git a/web/src/i18n/locales/zh-Hant.ts b/web/src/i18n/locales/zh-Hant.ts
index d46eb1b..be0a2b5 100644
--- a/web/src/i18n/locales/zh-Hant.ts
+++ b/web/src/i18n/locales/zh-Hant.ts
@@ -204,6 +204,7 @@ export default {
       ssh: 'SSH',
       channels: '頻道',
       shortcuts: '快捷鍵',
+      usage: '使用統計',
     },
     general: {
       serverOnline: '上線',
@@ -350,6 +351,46 @@ export default {
         toggleTerminal: '切換終端機',
       },
     },
+    usageStats: {
+      title: '使用統計',
+      subtitle: '所有專案的 token 用量、工作階段與快取效率。',
+      lastNDays: '最近 {n} 天',
+      totalTokens: 'Token 用量',
+      cacheHitRate: '快取命中率',
+      mostUsedModel: '最常用模型',
+      share: '佔比 {pct}',
+      sessions: '工作階段',
+      turns: '對話輪次',
+      activeDays: '活躍天數',
+      streak: '連續 {n} 天',
+      tokenBreakdown: 'Token 構成',
+      promptTokens: '輸入',
+      cachedTokens: '快取',
+      completionTokens: '輸出',
+      reasoningTokens: '推理',
+      heatmap: '活躍熱力圖',
+      less: '較少',
+      more: '較多',
+      dailyTrend: '每日 Token',
+      byModel: '依模型',
+      byProject: '依專案',
+      noData: '尚無資料',
+      noActivity: '無活動',
+      turnsUnit: '輪',
+    },
+  },
+
+  contextCapacity: {
+    title: '上下文容量',
+    messages: '訊息',
+    systemTools: '系統工具',
+    mcpTools: 'MCP 工具',
+    skills: '技能',
+    systemPrompt: '系統提示詞',
+    cacheHitRate: '快取命中率',
+    freeSpace: '剩餘空間',
+    sessionTotal: '本對話累計',
+    estimated: '構成為估算值(約 4 位元組/token)。',
   },
 
   setup: {
diff --git a/web/src/stores/chat.ts b/web/src/stores/chat.ts
index cad8fe1..48353e8 100644
--- a/web/src/stores/chat.ts
+++ b/web/src/stores/chat.ts
@@ -111,6 +111,14 @@ export const useChatStore = defineStore('chat', () => {
     if (!tokenInfo.value || !tokenInfo.value.model_context_limit) return 0
     return Math.round((tokenInfo.value.total_tokens / tokenInfo.value.model_context_limit) * 100)
   })
+  // Aggregate KV cache hit rate (0-100), or null when the provider never
+  // reported caching so the UI can render "—" instead of a misleading 0%.
+  const cacheHitPercentage = computed<number | null>(() => {
+    const t = tokenInfo.value
+    if (!t || t.cache_supported === false) return null
+    if (t.cache_hit_rate == null) return null
+    return Math.round(t.cache_hit_rate * 100)
+  })
   const projectName = computed(() => {
     const p = pwd.value
     if (!p) return ''
@@ -675,6 +683,14 @@ export const useChatStore = defineStore('chat', () => {
       isRunning.value = h.running || false
       imageSupport.value = h.image_support || false
       serverVersion.value = h.version || ''
+      // Seed the live context indicator so it's visible at rest / after a page
+      // reload, not only after the first turn completes. Fire-and-forget.
+      api
+        .status()
+        .then((s) => {
+          if (s.token) tokenInfo.value = s.token
+        })
+        .catch(() => {})
       return h
     } catch (err) {
       console.error('Failed to fetch health:', err)
@@ -1009,6 +1025,36 @@ export const useChatStore = defineStore('chat', () => {
       // Re-attach any question/approval still awaiting a response on the server.
       await reconcileAskUser()
       await reconcileApprovals()
+
+      // Seed the context indicator so the ring shows immediately on resume.
+      // total = the live static buckets (system prompt + tools + MCP + skills)
+      // plus a ~4-chars/token estimate of the loaded history. The next real turn
+      // replaces this with the exact prompt token count from token_update.
+      try {
+        let chars = 0
+        for (const e of entries) {
+          chars += (e.content?.length || 0) + (e.args?.length || 0) + (e.output?.length || 0)
+        }
+        const msgTokens = Math.ceil(chars / 4)
+        const ts = await api.taskStats(currentSessionId.value)
+        const c = ts.context
+        const staticTokens = c
+          ? c.system_prompt_tokens + c.system_tools_tokens + c.mcp_tools_tokens + c.skills_tokens
+          : 0
+        const limit = c?.context_limit || tokenInfo.value?.model_context_limit || 0
+        const total = staticTokens + msgTokens
+        if (limit > 0 && total > 0) {
+          tokenInfo.value = {
+            total_tokens: total,
+            prompt_tokens: total,
+            completion_tokens: 0,
+            model_context_limit: limit,
+            cache_supported: false,
+          }
+        }
+      } catch {
+        // Best-effort: leave the ring hidden until the next turn populates it.
+      }
     } catch (err: unknown) {
       addMessage('system', i18n.global.t('errors.loadSession', { detail: err instanceof Error ? err.message : String(err) }))
     }
@@ -1042,6 +1088,7 @@ export const useChatStore = defineStore('chat', () => {
     hasMessages,
     activeTodos,
     tokenPercentage,
+    cacheHitPercentage,
     projectName,
     // Actions
     addMessage,
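Note on the resume-time seeding added to chat.ts above: the arithmetic can be pulled out into a pure function so it can be unit-tested without the store. This is a minimal sketch, assuming the hypothetical names `EntryLike`, `StaticBuckets`, and `estimateSeedTokens`, none of which exist in the patch. It mirrors the hunk's logic, including the use of `String.length`, which counts UTF-16 code units rather than bytes.

```typescript
// Hypothetical helper, not part of the patch: mirrors the seeding arithmetic.
interface EntryLike {
  content?: string
  args?: string
  output?: string
}

// Subset of TaskContextBreakdown that is independent of the message history.
interface StaticBuckets {
  system_prompt_tokens: number
  system_tools_tokens: number
  mcp_tools_tokens: number
  skills_tokens: number
}

function estimateSeedTokens(entries: EntryLike[], c?: StaticBuckets): number {
  let chars = 0
  for (const e of entries) {
    // .length counts UTF-16 code units, so this is a rough proxy for size.
    chars += (e.content?.length || 0) + (e.args?.length || 0) + (e.output?.length || 0)
  }
  // Roughly four characters per token, rounded up.
  const msgTokens = Math.ceil(chars / 4)
  const staticTokens = c
    ? c.system_prompt_tokens + c.system_tools_tokens + c.mcp_tools_tokens + c.skills_tokens
    : 0
  return staticTokens + msgTokens
}
```

As in the hunk, the result is only a placeholder: the next `token_update` event replaces it with the provider's exact prompt token count.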
diff --git a/web/src/stores/usage.ts b/web/src/stores/usage.ts
new file mode 100644
index 0000000..ce2a956
--- /dev/null
+++ b/web/src/stores/usage.ts
@@ -0,0 +1,44 @@
+import { defineStore } from 'pinia'
+import { ref } from 'vue'
+import { api } from '@/composables/api'
+import type { UsageStats, TaskStats } from '@/types/api'
+
+// Global usage-statistics store. Kept separate from chat.ts so the stats page
+// (a lazily-rendered Settings tab) doesn't bloat the hot chat store.
+export const useUsageStore = defineStore('usage', () => {
+  const stats = ref<UsageStats | null>(null)
+  const loading = ref(false)
+  const error = ref<string | null>(null)
+  const rangeDays = ref(30)
+
+  // Per-task context-capacity stats for the most recently fetched session UUID.
+  const taskStats = ref<TaskStats | null>(null)
+  const taskLoading = ref(false)
+
+  async function fetchTaskStats(uuid: string) {
+    if (!uuid) return
+    taskLoading.value = true
+    try {
+      taskStats.value = await api.taskStats(uuid)
+    } catch {
+      taskStats.value = null
+    } finally {
+      taskLoading.value = false
+    }
+  }
+
+  async function fetchStats(days = rangeDays.value) {
+    loading.value = true
+    error.value = null
+    rangeDays.value = days
+    try {
+      stats.value = await api.usageStats(days)
+    } catch (e) {
+      error.value = e instanceof Error ? e.message : String(e)
+    } finally {
+      loading.value = false
+    }
+  }
+
+  return { stats, loading, error, rangeDays, fetchStats, taskStats, taskLoading, fetchTaskStats }
+})
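Note on consuming the store above: the Settings tab renders a `cacheLabel`, and `cacheHitPercentage` in chat.ts returns null when caching was never reported. The sketch below shows one way such a label could be derived from the `cache_hit_rate` and `cache_supported` fields. The names `CacheFields` and `cacheLabelFor` are hypothetical, and the clamp to [0,1] follows the definition in docs/usage-stats.md, not code shown in this patch.

```typescript
// Hypothetical helper, not part of the patch.
interface CacheFields {
  cache_hit_rate?: number // 0-1 fraction
  cache_supported?: boolean
}

// Placeholder shown when the provider never reported cached tokens (U+2014).
const NO_CACHE_DATA = '\u2014'

function cacheLabelFor(s: CacheFields | null): string {
  // Treat "unsupported" and "missing" the same way so 0% is never shown
  // for a provider that simply does not report caching.
  if (!s || s.cache_supported === false || s.cache_hit_rate == null) return NO_CACHE_DATA
  const clamped = Math.min(1, Math.max(0, s.cache_hit_rate))
  return `${Math.round(clamped * 100)}%`
}
```

A component could call `cacheLabelFor(usage.stats)` inside a `computed`, since `UsageStats` carries both fields.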
diff --git a/web/src/types/api.ts b/web/src/types/api.ts
index ba9ab18..a5a3ae5 100644
--- a/web/src/types/api.ts
+++ b/web/src/types/api.ts
@@ -309,9 +309,17 @@ export interface ToolResultData {
 }
 
 export interface TokenUpdateData {
+  // total_tokens is current context occupancy (last call); the cumulative
+  // counters + cache_hit_rate cover the whole session.
+  total_tokens: number
   prompt_tokens: number
   completion_tokens: number
-  total_tokens: number
+  cached_tokens?: number
+  reasoning_tokens?: number
+  cache_write_tokens?: number
+  call_count?: number
+  cache_hit_rate?: number
+  cache_supported?: boolean
   model_context_limit: number
 }
 
@@ -319,6 +327,73 @@ export interface AgentDoneData {
   error?: string
 }
 
+// --- Usage statistics ---
+
+export interface UsageDayBucket {
+  date: string // YYYY-MM-DD
+  tokens: number
+  turns: number
+  calls: number
+}
+
+export interface UsageShare {
+  name: string
+  tokens: number
+  share: number // 0-1 fraction of grand total
+}
+
+export interface UsageTotals {
+  total_tokens: number
+  prompt_tokens: number
+  completion_tokens: number
+  cached_tokens: number
+  reasoning_tokens: number
+  calls: number
+  turns: number
+  sessions: number
+}
+
+export interface TaskContextBreakdown {
+  context_limit: number
+  system_prompt_tokens: number
+  system_tools_tokens: number
+  mcp_tools_tokens: number
+  skills_tokens: number
+  messages_tokens: number
+}
+
+export interface TaskStats {
+  uuid: string
+  is_active: boolean
+  context?: TaskContextBreakdown
+  cache_hit_rate: number
+  cache_supported: boolean
+  tokens: {
+    total_tokens: number
+    prompt_tokens: number
+    completion_tokens: number
+    cached_tokens: number
+    reasoning_tokens: number
+    calls: number
+    turns?: number
+  }
+}
+
+export interface UsageStats {
+  range_days: number
+  totals: UsageTotals
+  active_days: number
+  current_streak: number
+  longest_streak: number
+  most_used_model: string
+  cache_hit_rate: number // 0-1
+  cache_supported: boolean
+  heatmap: UsageDayBucket[] // fixed ~365-day window
+  daily_trend: UsageDayBucket[] // selected range
+  by_model: UsageShare[]
+  by_project: UsageShare[]
+}
+
 export interface ApprovalRequestData {
   id: string
   tool_name: string