feat(orchestration): token/cost budget stop condition (max_total_tokens, max_cost_usd) #475

@windoliver

Summary

The budget stop condition currently caps a session by contribution count and wall-clock seconds only. It cannot cap by tokens or dollars. Of the Stage-5 loop primitives, this is the one that is both missing and most consequential, since "the loop is now the expensive part" (the failure mode behind runaway billing). Add max_total_tokens and max_cost_usd to the budget stop condition.

The accounting half already exists — this is a wiring task, not new infrastructure.

Current state

  • Budget schema: BudgetSchema (src/core/contract.ts:89) → Budget interface (src/core/contract.ts:615) — only max_contributions / max_wall_clock_seconds.
  • Evaluator: evaluateBudget() (src/core/stop-conditions.ts:283) — checks count + wall-clock, OR-combined.
  • Usage is already reported and aggregated:
    • grove_report_usage MCP tool (src/mcp/tools/messaging.ts:155) → reportUsage() (src/core/operations/cost-tracking.ts:85) stores usage_report (input/output/cache tokens, optional cost_usd) as ephemeral discussion contributions.
    • getSessionCosts() (src/core/operations/cost-tracking.ts:142) already returns SessionCostSummary { totalInputTokens, totalOutputTokens, totalCostUsd, byAgent }.
  • The gap: evaluateBudget() never consults getSessionCosts(). The cost data exists but the loop is blind to it.

Proposed change

  1. Schema (src/core/contract.ts):
    • BudgetSchema (~:89): add max_total_tokens: z.number().int().min(1).optional() and max_cost_usd: z.number().min(0).optional(); update the .refine(...) to accept any one of the four.
    • Budget interface (~:615): add maxTotalTokens? / maxCostUsd?.
    • wireToStopConditions mapper (~:900): map the two new snake_case fields.
  2. Evaluator (src/core/stop-conditions.ts:283):
    • evaluateBudget() already takes the ContributionStore; getSessionCosts() takes the same store. Call it, compare totalInputTokens + totalOutputTokens against maxTotalTokens and totalCostUsd against maxCostUsd, OR-combine with the existing met and add reasons/details (tokens_used/limit, cost_used/limit).
  3. Recipe materializer (src/core/recipe.ts:907): thread the two new fields through materializeRecipeStopConditions.
  4. TUI: surface "budget remaining" alongside the per-agent token cost panel (#366, the per-agent token cost panel, is the display side of the same data).
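
A minimal sketch of the evaluator step (item 2), written as a standalone helper rather than as the real evaluateBudget(). The field names on the cost summary and the detail keys come from this issue. The function name checkTokenCostBudget, the TokenCostBudget and BudgetCheck shapes, and the exact reason formatting are assumptions made for illustration.

```typescript
// Subset of SessionCostSummary as described in this issue.
interface SessionCostSummary {
  totalInputTokens: number;
  totalOutputTokens: number;
  totalCostUsd: number;
}

// Hypothetical: only the two new limits, using the camelCase names proposed above.
interface TokenCostBudget {
  maxTotalTokens?: number;
  maxCostUsd?: number;
}

// Hypothetical result shape; the real evaluator's return type may differ.
interface BudgetCheck {
  met: boolean;
  reasons: string[];
  details: Record<string, number>;
}

function checkTokenCostBudget(
  budget: TokenCostBudget,
  costs: SessionCostSummary,
): BudgetCheck {
  const reasons: string[] = [];
  const details: Record<string, number> = {};
  const tokensUsed = costs.totalInputTokens + costs.totalOutputTokens;

  // Each limit is optional; an unset limit contributes nothing.
  if (budget.maxTotalTokens !== undefined) {
    details["tokens_used"] = tokensUsed;
    details["tokens_limit"] = budget.maxTotalTokens;
    if (tokensUsed >= budget.maxTotalTokens) {
      reasons.push(`tokens: ${tokensUsed} >= ${budget.maxTotalTokens}`);
    }
  }
  if (budget.maxCostUsd !== undefined) {
    details["cost_used"] = costs.totalCostUsd;
    details["cost_limit"] = budget.maxCostUsd;
    if (costs.totalCostUsd >= budget.maxCostUsd) {
      reasons.push(
        `cost: $${costs.totalCostUsd.toFixed(2)} >= $${budget.maxCostUsd.toFixed(2)}`,
      );
    }
  }

  // OR-combined: any single exceeded limit marks the budget as met.
  return { met: reasons.length > 0, reasons, details };
}
```

In the real evaluator the returned met flag would presumably be OR-ed with the existing count and wall-clock result, and the reasons and details merged into the existing ones.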

Acceptance criteria

  • A GROVE.md / contract with budget: { max_total_tokens: N } or { max_cost_usd: X } stops the session once reported usage crosses the threshold, with stop reason Budget exceeded: tokens: … >= N / cost: $… >= $X.
  • evaluateStopConditions exposes the new conditions in its conditions.budget.details.
  • Backward compatible: existing count/wall-clock budgets unchanged; all four are optional and OR-combined.
  • Unit tests in stop-conditions.test.ts for token-only, cost-only, and combined caps.
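
The backward-compatibility criterion above can be illustrated with a plain-TypeScript stand-in for the schema rule and the snake_case mapper. This is a sketch only: the real code uses BudgetSchema with a .refine(...) and wireToStopConditions. The names WireBudget, validateWireBudget, and toBudget are hypothetical, and the camelCase names maxContributions / maxWallClockSeconds are assumed from the existing snake_case fields.

```typescript
// Wire (snake_case) form as it would appear in GROVE.md / a contract.
interface WireBudget {
  max_contributions?: number;
  max_wall_clock_seconds?: number;
  max_total_tokens?: number;
  max_cost_usd?: number;
}

// Internal (camelCase) form; the first two names are assumptions.
interface Budget {
  maxContributions?: number;
  maxWallClockSeconds?: number;
  maxTotalTokens?: number;
  maxCostUsd?: number;
}

// Mirrors the proposed rules: at least one of the four limits must be set,
// max_total_tokens is an integer >= 1, max_cost_usd is >= 0.
function validateWireBudget(wire: WireBudget): string[] {
  const errors: string[] = [];
  const limits = [
    wire.max_contributions,
    wire.max_wall_clock_seconds,
    wire.max_total_tokens,
    wire.max_cost_usd,
  ];
  if (!limits.some((v) => v !== undefined)) {
    errors.push("budget must set at least one limit");
  }
  if (
    wire.max_total_tokens !== undefined &&
    (!Number.isInteger(wire.max_total_tokens) || wire.max_total_tokens < 1)
  ) {
    errors.push("max_total_tokens must be an integer >= 1");
  }
  if (wire.max_cost_usd !== undefined && !(wire.max_cost_usd >= 0)) {
    errors.push("max_cost_usd must be >= 0");
  }
  return errors;
}

// Maps the validated wire form to the internal form, throwing on invalid input.
function toBudget(wire: WireBudget): Budget {
  const errors = validateWireBudget(wire);
  if (errors.length > 0) {
    throw new Error(errors.join("; "));
  }
  return {
    maxContributions: wire.max_contributions,
    maxWallClockSeconds: wire.max_wall_clock_seconds,
    maxTotalTokens: wire.max_total_tokens,
    maxCostUsd: wire.max_cost_usd,
  };
}
```

Under these rules a budget that sets only max_contributions still validates exactly as before, while a budget that sets only max_total_tokens or only max_cost_usd becomes valid.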

Known limitations (call out in PR, do not silently ship)

  • Self-reported trust boundary. Enforcement is only as good as grove_report_usage — an agent that never reports can't be capped. Honest hardening (follow-up): source usage from the ACP runtime / acpx transcript (AcpxSupervisor) rather than trusting the agent's self-report. Note this limitation in the schema docs.
  • Between-round granularity. This is a stop condition checked between rounds, not a hard mid-iteration kill, so a single runaway iteration can overshoot before the next evaluation. A true hard ceiling belongs with the supervisor/deadline-watcher (src/core/deadline-watcher.ts) and is tracked in #476 (feat(orchestration): supervisor-enforced hard ceiling, abort runaway agent mid-iteration).
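
To make the between-round limitation concrete, here is a toy simulation, not the real loop runner. The function runUntilBudget and the per-round token figures are invented for illustration. It checks the cap only after each round completes, as a between-rounds stop condition would.

```typescript
interface RunResult {
  roundsRun: number;
  tokensUsed: number;
  overshoot: number; // tokens consumed beyond the cap before the loop stopped
}

// Hypothetical loop: each entry in roundTokens is the usage reported for one round.
function runUntilBudget(roundTokens: number[], maxTotalTokens: number): RunResult {
  let tokensUsed = 0;
  let roundsRun = 0;
  for (const tokens of roundTokens) {
    // The round runs to completion; nothing interrupts it mid-iteration.
    tokensUsed += tokens;
    roundsRun += 1;
    // The budget is evaluated only here, between rounds.
    if (tokensUsed >= maxTotalTokens) {
      break;
    }
  }
  return {
    roundsRun,
    tokensUsed,
    overshoot: Math.max(0, tokensUsed - maxTotalTokens),
  };
}

// A 5000-token cap with one runaway third round of 9000 tokens.
const result = runUntilBudget([1000, 1200, 9000, 500], 5000);
console.log(result); // { roundsRun: 3, tokensUsed: 11200, overshoot: 6200 }
```

The session stops after round three as intended, but only after that round has already spent more than double the cap, which is the gap a supervisor-enforced ceiling would close.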

Context

Completes the third of the article's "three hard stops" (max-iterations ✓ via loop-runner.ts maxIterations, no-progress ✓ via maxNoImprovementRounds/Plateau, $/token budget ✗). Related: #366 (per-agent token cost panel — display side), #376 (Run Health model + metrics API), Epic F #347 (admission + backpressure).

Metadata

    Labels

      • agent-orchestration: Multi-agent spawning, coordination, and lifecycle
      • area-foundation: Epic A: Entity + watch protocol
      • enhancement: New feature or request
