feat(frontend): show AI chat context-size usage and fix trim gate #9490
Draft
Guilhem-lm wants to merge 1 commit into
Surface the AI chat's current context-window usage as a "45k / 200k" badge next to the model selector, and make the conversation-trim gate accurate.

The provider-reported token usage was already computed end-to-end but dropped. Capture the final loop iteration's usage (prompt + completion) as an accurate "current context size" anchor, persist it per chat in IndexedDB, and display it.

Re-base the trim gate on the same anchor: the crude chars-per-4 estimate now includes the system message and tool schemas (previously ignored, which let real context overflow the model window), and is calibrated against the accurate anchor when one exists.

The badge colour shifts neutral->amber->red as usage approaches the threshold where older messages start being dropped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
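The distinction between the summed usage and the final iteration's usage can be sketched as follows. This is a minimal illustration, not the PR's code: the `Usage` shape, the `collectUsage` helper, and the choice to keep the previous value when an iteration reports nothing are all assumptions made for the example.

```typescript
// Hypothetical shapes; the real types in chatLoop.ts may differ.
interface Usage {
  prompt: number;
  completion: number;
  total: number;
}

interface LoopUsageResult {
  tokenUsage: Usage; // summed across iterations (total billed)
  lastIterationUsage?: Usage; // final reported iteration only (current context size)
}

// Hypothetical helper: walks the per-iteration usage reports of one chat loop.
function collectUsage(iterations: Array<Usage | undefined>): LoopUsageResult {
  const tokenUsage: Usage = { prompt: 0, completion: 0, total: 0 };
  let lastIterationUsage: Usage | undefined;

  for (const usage of iterations) {
    if (!usage) continue; // assumption: a missing report leaves the previous anchor in place
    tokenUsage.prompt += usage.prompt;
    tokenUsage.completion += usage.completion;
    tokenUsage.total += usage.total;
    lastIterationUsage = usage; // last write wins
  }

  return { tokenUsage, lastIterationUsage };
}
```

The summed value grows with every tool-call round trip because each iteration re-sends the conversation, so only the last iteration's prompt + completion approximates what currently sits in the context window.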
Deploying windmill with
| Latest commit: | 1ccf2b7 |
|---|---|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://0974c385.windmill.pages.dev |
| Branch Preview URL: | https://glm-ai-chat-token.windmill.pages.dev |
Summary
Surfaces the AI chat's current context-window usage as a `45k / 200k` badge next to the model selector, and makes the conversation-trim gate accurate.

The provider-reported token usage was already computed end-to-end (per provider → accumulated in `runChatLoop`) but dropped on the floor. This PR captures the final loop iteration's usage (prompt + completion) as an accurate "current context size" anchor, persists it per chat, and displays it.

It also fixes a real overflow bug: the existing trim gate estimated tokens as chars÷4 over message content only, ignoring the system message and tool schemas (tens of thousands of tokens in agentic mode), so it under-counted and could let context overflow the model window, leading to provider 400s. The gate now counts those and calibrates the crude estimate against the accurate anchor.
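As a rough sketch of the calibrated estimate described above (not the PR's implementation): the crude count covers the system message, the serialized tool schemas, and the message content at 4 characters per token, and is then scaled by the ratio of the accurate anchor to the crude estimate taken at the same moment. The function names, the `Anchor` shape, and the use of `JSON.stringify` to size the schemas are assumptions for illustration.

```typescript
// Hypothetical message shape for the example.
interface Msg {
  role: string;
  content: string;
}

const CHARS_PER_TOKEN = 4; // the crude chars÷4 heuristic

// Crude estimate that includes the system message and tool schemas.
function crudeEstimate(systemMessage: string, toolSchemas: unknown[], messages: Msg[]): number {
  let chars = systemMessage.length + JSON.stringify(toolSchemas).length;
  for (const m of messages) chars += m.content.length;
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

// Accurate provider-reported count plus the crude estimate taken when it was captured.
interface Anchor {
  contextTokens: number;
  anchorCrude: number;
}

// Scale the crude estimate by contextTokens / anchorCrude, else fall back to crude.
function estimateContextTokens(crude: number, anchor?: Anchor): number {
  if (!anchor || anchor.anchorCrude <= 0) return crude;
  return Math.round(crude * (anchor.contextTokens / anchor.anchorCrude));
}
```

For example, if the provider reported 45,000 tokens when the crude estimate was 30,000, later crude estimates are multiplied by 1.5 before being compared with the trim threshold.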
Changes
- `chatLoop.ts`: add `lastIterationUsage` to `ChatLoopResult`, the final iteration's usage (last-write-wins across all 4 provider branches). Distinct from the existing summed `tokenUsage` (= total billed, unchanged).
- `AIChatManager.svelte.ts`: new `contextTokens` state (= `lastIterationUsage.total`), captured after `runChatLoop`. Rewrite the trim gate: `#crudeEstimate` now includes the system message + tool schemas; `#estimateContextTokens` scales the crude estimate by `contextTokens / anchorCrude` when an anchor exists, else falls back to crude. Export `getTrimThreshold`, `MAX_TOKENS_THRESHOLD_PERCENTAGE`, `MAX_TOKENS_HARD_LIMIT`. Reset on new chat, restore on `loadPastChat`.
- `HistoryManager.svelte.ts`: persist optional `contextTokens` per chat in IndexedDB (no version bump: the field is optional, so old chats read back `undefined`).
- `AIChatDisplay.svelte`: `45k / 200k` badge beside `ProviderModelSelector`, hidden until a real count exists, colour shifts neutral→amber→red near the trim threshold.
- `AIChatManager.test.ts`: 4 new gate tests.

Test plan
- `npm run check:fast`: clean for all changed files
- `npx vitest run AIChatManager.test.ts`: 11/11 pass (system-message counting, tool-schema counting, paired assistant+tool eviction on trim, anchor calibration)
- (… `contextTokens`): badge stays hidden, chat still sends
- … 1M

🤖 Generated with Claude Code
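The badge behaviour listed under Changes could look roughly like the sketch below. It is illustrative only: `formatTokens`, `badgeLabel`, `badgeTone`, and the 70% / 90% cut-offs are hypothetical, since the PR text does not state the exact colour boundaries or rounding rules.

```typescript
type BadgeTone = "neutral" | "amber" | "red";

// Compact token count, e.g. 45000 -> "45k", 1000000 -> "1M".
function formatTokens(n: number): string {
  if (n >= 999_500) return `${+(n / 1_000_000).toFixed(1)}M`;
  if (n >= 1_000) return `${Math.round(n / 1_000)}k`;
  return String(n);
}

// Returns undefined when no real count exists, so the badge stays hidden.
function badgeLabel(contextTokens: number | undefined, modelWindow: number): string | undefined {
  if (contextTokens === undefined) return undefined;
  return `${formatTokens(contextTokens)} / ${formatTokens(modelWindow)}`;
}

// Assumed cut-offs: amber at 70% and red at 90% of the trim threshold.
function badgeTone(contextTokens: number, trimThreshold: number): BadgeTone {
  const ratio = contextTokens / trimThreshold;
  if (ratio >= 0.9) return "red";
  if (ratio >= 0.7) return "amber";
  return "neutral";
}
```

Measuring the tone against the trim threshold rather than the full model window matches the stated intent: the colour warns that older messages are about to be dropped, which happens before the window itself is full.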