Description
The mimoModels definition in packages/types/src/providers/mimo.ts has no token count for reasoning_content tokens. When MiMo uses deep thinking (enabled via extra_body.thinking), the model's reasoning tokens are streamed as reasoning_content deltas, but they are not tracked in the usage/cost calculation.
Current Behavior
// src/api/providers/mimo.ts:130-135
if ("reasoning_content" in delta && delta.reasoning_content) {
    yield {
        type: "reasoning",
        text: (delta.reasoning_content as string) || "",
    }
}
The reasoning content is yielded as a stream event, but in the usage block:
// src/api/providers/mimo.ts:144-166
if (lastUsage) {
    const inputTokens = lastUsage?.prompt_tokens || 0
    const outputTokens = lastUsage?.completion_tokens || 0
    const cacheWriteTokens = (lastUsage?.prompt_tokens_details as any)?.cache_write_tokens || 0
    const cacheReadTokens = lastUsage?.prompt_tokens_details?.cached_tokens || 0
    // No reasoning_tokens tracked!
}
What other providers do
The DeepSeek handler (src/api/providers/deepseek.ts:189) explicitly tracks reasoningTokens from completion_tokens_details.reasoning_tokens. The MiMo handler does not.
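To make the missing step concrete, here is a minimal sketch of reading the reasoning token count from a usage object without an `as any` cast. The `UsageLike` interface and the `extractReasoningTokens` helper are hypothetical names used only for illustration. They model just the fields mentioned in this issue and assume MiMo reports usage in the same shape the DeepSeek handler reads.

```typescript
// Hypothetical shape: only the fields referenced in this issue are modelled.
interface UsageLike {
	prompt_tokens?: number
	completion_tokens?: number
	completion_tokens_details?: { reasoning_tokens?: number } | null
}

// Hypothetical helper: returns the reasoning token count, or 0 when the
// provider omits the field or sends something that is not a positive number.
function extractReasoningTokens(usage: unknown): number {
	if (typeof usage !== "object" || usage === null) {
		return 0
	}
	const details = (usage as UsageLike).completion_tokens_details
	const value: unknown = details?.reasoning_tokens
	return typeof value === "number" && Number.isFinite(value) && value > 0 ? value : 0
}
```

Falling back to 0 keeps the change additive: responses without completion_tokens_details behave exactly as they do today.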
Proposed Fix
- Add reasoningTokens field to ApiStreamUsageChunk if not already present
- Extract reasoning token count from MiMo usage response:
    // In MimoHandler.createMessage usage block:
    const reasoningTokens = (lastUsage?.completion_tokens_details as any)?.reasoning_tokens || 0
    yield {
        type: "usage",
        inputTokens,
        outputTokens,
        reasoningTokens, // NEW: track thinking cost separately
        cacheWriteTokens: cacheWriteTokens || undefined,
        cacheReadTokens: cacheReadTokens || undefined,
        totalCost,
    }
- Update cost calculation in calculateApiCostOpenAI to account for reasoning tokens if MiMo charges differently for them
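The cost side of the fix could look roughly like the sketch below. It is not the existing calculateApiCostOpenAI, whose signature is not shown in this issue. `estimateCostWithReasoning`, `Pricing`, and `CostBreakdown` are hypothetical names, the prices are placeholders rather than MiMo's real rates, and the code assumes that completion_tokens already includes the reasoning tokens (the usual OpenAI-compatible convention, which would need to be confirmed for MiMo).

```typescript
// Hypothetical pricing in currency units per million tokens.
interface Pricing {
	inputPrice: number
	outputPrice: number
	reasoningPrice?: number // assumption: falls back to outputPrice when absent
}

interface CostBreakdown {
	inputCost: number
	outputCost: number // visible (non-reasoning) output only
	reasoningCost: number
	totalCost: number
}

// Hypothetical helper: splits output cost into visible output and reasoning.
function estimateCostWithReasoning(
	pricing: Pricing,
	inputTokens: number,
	outputTokens: number,
	reasoningTokens: number,
): CostBreakdown {
	// Assumption: outputTokens includes reasoningTokens, so clamp to avoid
	// a negative visible-output count if the provider reports otherwise.
	const reasoning = Math.min(Math.max(reasoningTokens, 0), outputTokens)
	const visibleOutput = outputTokens - reasoning
	const reasoningPrice = pricing.reasoningPrice ?? pricing.outputPrice

	const inputCost = (inputTokens * pricing.inputPrice) / 1_000_000
	const outputCost = (visibleOutput * pricing.outputPrice) / 1_000_000
	const reasoningCost = (reasoning * reasoningPrice) / 1_000_000

	return { inputCost, outputCost, reasoningCost, totalCost: inputCost + outputCost + reasoningCost }
}
```

When no separate reasoning price is configured, the total matches what pricing all output tokens at outputPrice would give, so existing totals would not change. The breakdown only adds visibility into how much of the cost came from thinking.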
Impact
- Severity: Medium — Users see inaccurate cost tracking for thinking-heavy conversations
- Scope: src/api/providers/mimo.ts, potentially src/shared/cost.ts and packages/types/
- Risk of fix: Low — additive change, does not break existing behavior
Additional Context
MiMo V2.5 Pro documentation indicates thinking mode produces reasoning tokens that may be billed separately. Without tracking, users have no visibility into the cost of deep thinking vs. direct output.