feat(frontend): show AI chat context-size usage and fix trim gate by Guilhem-lm · Pull Request #9490 · windmill-labs/windmill

Guilhem-lm · 2026-06-09T14:55:43Z

Summary

Surfaces the AI chat's current context-window usage as a "45k / 200k" badge next to the model selector, and makes the conversation-trim gate accurate.

The provider-reported token usage was already computed end-to-end (per provider → accumulated in runChatLoop) but dropped on the floor. This PR captures the final loop iteration's usage (prompt + completion) as an accurate "current context size" anchor, persists it per chat, and displays it.
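
The difference between the summed usage and the final iteration's usage can be illustrated with a minimal sketch. The `Usage` and `LoopUsageResult` shapes and the `collectUsage` helper below are hypothetical and only mirror the field names mentioned in this PR; the real types in chatLoop.ts may differ. The sketch assumes each loop iteration re-sends the conversation so far, so the last iteration's total approximates the current context size, while the sum reflects what was billed.

```typescript
// Hypothetical shapes for illustration; not the actual types from chatLoop.ts.
interface Usage {
	prompt: number
	completion: number
	total: number
}

interface LoopUsageResult {
	// Summed over every iteration: what the provider billed in total.
	tokenUsage: Usage
	// Final iteration only: an anchor for the current context size.
	lastIterationUsage?: Usage
}

function collectUsage(perIteration: Usage[]): LoopUsageResult {
	const tokenUsage: Usage = { prompt: 0, completion: 0, total: 0 }
	let lastIterationUsage: Usage | undefined
	for (const u of perIteration) {
		tokenUsage.prompt += u.prompt
		tokenUsage.completion += u.completion
		tokenUsage.total += u.total
		// Last write wins: only the most recent iteration is kept.
		lastIterationUsage = u
	}
	return { tokenUsage, lastIterationUsage }
}
```

With two iterations of 1,200 and 1,600 total tokens, the summed `tokenUsage.total` is 2,800, whereas `lastIterationUsage.total` is 1,600, which is the figure the badge would show.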

It also fixes a real overflow bug. The existing trim gate estimated tokens as chars÷4 over message content only, ignoring the system message and tool schemas (tens of thousands of tokens in agentic mode). It therefore under-counted and could let the context overflow the model window, which surfaced as provider 400 errors. The gate now counts the system message and tool schemas, and calibrates the crude estimate against the accurate anchor.
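
As a rough sketch of the corrected crude estimate, the function below counts characters in the system message, the message contents, and the serialised tool schemas, then divides by four. The `ChatMessage` and `ToolSchema` types, the `crudeEstimate` name, and the choice of `JSON.stringify` for measuring a schema are assumptions for illustration; the PR's private `#crudeEstimate` may measure these differently.

```typescript
// Hypothetical types for illustration only.
interface ChatMessage {
	role: string
	content: string
}

interface ToolSchema {
	name: string
	description?: string
	parameters: unknown
}

// The chars÷4 heuristic described in the PR summary.
const CHARS_PER_TOKEN = 4

function crudeEstimate(
	systemMessage: string,
	messages: ChatMessage[],
	tools: ToolSchema[]
): number {
	// Start from the system message, which the old gate ignored.
	let chars = systemMessage.length
	for (const m of messages) {
		chars += m.content.length
	}
	// Assumption: a tool schema costs roughly its JSON-serialised length.
	for (const t of tools) {
		chars += JSON.stringify(t).length
	}
	return Math.ceil(chars / CHARS_PER_TOKEN)
}
```

Passing an empty system message and no tools reproduces the old messages-only count, which makes the size of the previous under-count easy to see for a given chat.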

Changes

chatLoop.ts: add lastIterationUsage to ChatLoopResult — the final iteration's usage (last-write-wins across all 4 provider branches). Distinct from the existing summed tokenUsage (= total billed, unchanged).
AIChatManager.svelte.ts: new contextTokens state (= lastIterationUsage.total), captured after runChatLoop. Rewrite the trim gate: #crudeEstimate now includes the system message + tool schemas; #estimateContextTokens scales the crude estimate by contextTokens / anchorCrude when an anchor exists, else falls back to crude. Export getTrimThreshold, MAX_TOKENS_THRESHOLD_PERCENTAGE, MAX_TOKENS_HARD_LIMIT. Reset on new chat, restore on loadPastChat.
HistoryManager.svelte.ts: persist optional contextTokens per chat in IndexedDB (no version bump — optional field, old chats read back undefined).
AIChatDisplay.svelte: 45k / 200k badge beside ProviderModelSelector, hidden until a real count exists, colour shifts neutral→amber→red near the trim threshold.
AIChatManager.test.ts: 4 new gate tests.
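
The anchor calibration described for #estimateContextTokens can be sketched as follows. The `Anchor` type and the standalone `estimateContextTokens` function are hypothetical; the sketch only assumes what the description states, namely that the crude estimate is scaled by contextTokens / anchorCrude when an anchor exists and is used unchanged otherwise. Rounding and the guard against a non-positive anchor are additions of this sketch.

```typescript
// Hypothetical anchor shape: the provider-reported context size, paired with
// the crude estimate computed for the same conversation state.
interface Anchor {
	contextTokens: number
	anchorCrude: number
}

function estimateContextTokens(crudeNow: number, anchor?: Anchor): number {
	// No usable anchor yet (new chat, or a chat saved before this feature):
	// fall back to the crude estimate.
	if (!anchor || anchor.anchorCrude <= 0 || anchor.contextTokens <= 0) {
		return crudeNow
	}
	// Scale the crude estimate by how far off it was at the anchor point.
	return Math.round(crudeNow * (anchor.contextTokens / anchor.anchorCrude))
}
```

For example, if the provider reported 45,000 tokens when the crude estimate was 30,000, a later crude estimate of 40,000 is scaled by 1.5 to 60,000.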

Test plan

npm run check:fast — clean for all changed files
npx vitest run AIChatManager.test.ts — 11/11 pass (system-message counting, tool-schema counting, paired assistant+tool eviction on trim, anchor calibration)
svelte-autofixer — zero issues on the badge code
Open the AI chat on a real model, run a multi-turn agentic conversation; confirm the badge appears after the first turn, climbs across turns, and shifts amber→red near the trim threshold
Restore a chat saved before this feature (no stored contextTokens) — badge stays hidden, chat still sends
Confirm a 1M-window model (gpt-4.1/gemini) renders the denominator as 1M
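
The badge behaviour exercised by the manual checks above can be approximated with a small sketch. Everything here is hypothetical: the function names, the rounding rules, and in particular the 75% and 90% colour cut-offs, which this PR does not state. It only shows one way to produce "45k / 200k", render a 1M window as "1M", and keep the badge hidden when no count is stored.

```typescript
// Compact token formatting: 45000 -> "45k", 1000000 -> "1M".
function formatTokens(n: number): string {
	const k = Math.round(n / 1_000)
	if (k >= 1_000) {
		// One decimal place at most, e.g. 1.5M; whole values print as "1M".
		const m = Math.round(n / 100_000) / 10
		return `${m}M`
	}
	if (k >= 1) return `${k}k`
	return String(n)
}

// Returns null when there is no real count yet, so the badge stays hidden.
function badgeLabel(contextTokens: number | undefined, windowSize: number): string | null {
	if (contextTokens === undefined || contextTokens <= 0) return null
	return `${formatTokens(contextTokens)} / ${formatTokens(windowSize)}`
}

type BadgeColour = 'neutral' | 'amber' | 'red'

// Assumed cut-offs (not from the PR): amber at 75% and red at 90% of the
// trim threshold.
function badgeColour(contextTokens: number, trimThreshold: number): BadgeColour {
	const ratio = contextTokens / trimThreshold
	if (ratio >= 0.9) return 'red'
	if (ratio >= 0.75) return 'amber'
	return 'neutral'
}
```

A chat restored without a stored contextTokens value passes `undefined` to `badgeLabel`, which returns `null`, matching the "badge stays hidden" check.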

🤖 Generated with Claude Code

Surface the AI chat's current context-window usage as a "45k / 200k" badge next to the model selector, and make the conversation-trim gate accurate.

The provider-reported token usage was already computed end-to-end but dropped. Capture the final loop iteration's usage (prompt + completion) as an accurate "current context size" anchor, persist it per chat in IndexedDB, and display it.

Re-base the trim gate on the same anchor: the crude chars-per-4 estimate now includes the system message and tool schemas (previously ignored, which let real context overflow the model window), and is calibrated against the accurate anchor when one exists. The badge colour shifts neutral->amber->red as usage approaches the threshold where older messages start being dropped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

cloudflare-workers-and-pages · 2026-06-09T15:02:26Z

Deploying windmill with Cloudflare Pages

Latest commit:	`1ccf2b7`
Status:	✅ Deploy successful!
Preview URL:	https://0974c385.windmill.pages.dev
Branch Preview URL:	https://glm-ai-chat-token.windmill.pages.dev

Guilhem-lm wants to merge 1 commit into main from glm/ai-chat-token
