
Gateway bots lack a runtime control surface — no in-chat /model, /usage, /compress or /queue commands #2158

Description

@MervinPraison

Summary

PraisonAI gateway bots expose only five built-in slash commands (/help, /status, /new, /stop, /whoami). Users running a long-lived bot cannot switch the model mid-conversation, see their token/cost usage, compress an over-long conversation to stay within the context window, or queue follow-up messages — all common, high-value controls for a production chat bot. The underlying capabilities largely exist in core (per-call model override, context optimisation/compaction, usage tracking via hooks); they are simply not wired to an in-chat control surface.

Current behaviour

# src/praisonai/praisonai/bots/_commands.py — _initialize_builtin_commands()
self.register("help",   {"description": "Show help message", "builtin": True})
self.register("status", {"description": "Show bot status", "builtin": True})
self.register("new",    {"description": "Reset conversation session", "builtin": True})
self.register("stop",   {"description": "Cancel current agent task", "builtin": True})
self.register("whoami", {"description": "Show your user info and permissions", "builtin": True})

Custom commands can be registered programmatically via CommandRegistry.register(...), and access is already gated by CommandAccessPolicy — but there are no built-ins for model switching, usage/cost, context compression or message queuing. The enabling primitives already exist in core:

  • context compaction/optimisation: praisonaiagents/context/optimizer.py (SummarizeOptimizer, LLMSummarizeOptimizer), context/manager.py
  • token/cost tracking: BEFORE_LLM/AFTER_LLM hooks in praisonaiagents/hooks/types.py
  • per-call model override on Agent.chat

Desired behaviour

A standard set of built-in gateway commands, enabled by default but disableable via config, each gated by the existing CommandAccessPolicy:

  • /model <name> — switch the LLM for this session (session-scoped override)
  • /usage — show tokens used and estimated cost for the session
  • /compress — summarise earlier turns to free context window
  • /queue <text> — enqueue a follow-up to run after the current turn
  • (and /retry, /undo where transcript state allows)
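
To make the gating concrete, here is a minimal, self-contained sketch of how a slash command could be parsed, checked against a policy and dispatched. Everything in it is an assumption for illustration: `AccessPolicy`, `parse_command` and `dispatch` are hypothetical stand-ins and do not reproduce the real CommandAccessPolicy or CommandRegistry interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set, Tuple


@dataclass
class AccessPolicy:
    """Hypothetical stand-in for CommandAccessPolicy: command -> permitted user ids."""
    allowed: Dict[str, Set[str]] = field(default_factory=dict)

    def permits(self, user_id: str, command: str) -> bool:
        users = self.allowed.get(command)
        # In this sketch a command with no entry is open to everyone.
        return users is None or user_id in users


def parse_command(text: str) -> Tuple[str, str]:
    """Split '/model gpt-4o' into ('model', 'gpt-4o'); ('', text) for non-commands."""
    if not text.startswith("/"):
        return "", text
    name, _, arg = text[1:].partition(" ")
    return name.lower(), arg.strip()


def dispatch(
    text: str,
    user_id: str,
    handlers: Dict[str, Callable[[str], str]],
    policy: AccessPolicy,
) -> Optional[str]:
    """Return the command reply, or None when the text is an ordinary chat message."""
    name, arg = parse_command(text)
    if not name:
        return None  # not a command: let the agent handle it
    if name not in handlers:
        return "Unknown command"
    if not policy.permits(user_id, name):
        return f"Permission denied for /{name}"
    return handlers[name](arg)
```

In this sketch the policy check runs after the lookup, so an unregistered command always yields "Unknown command" regardless of who sent it; a real implementation might prefer to check access first to avoid revealing which commands exist.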

Layer placement

  • Primary layer: wrapper — these are gateway/bot control-surface commands wired into the CommandRegistry.
  • Why not core: the underlying operations (compaction, usage, model override) belong in core and already exist; the chat command surface is a wrapper concern.
  • Why not tools: these are operator controls over the session, not agent-callable task integrations.
  • Why not plugins: they are first-class, always-available bot UX, not optional cross-cutting policy.
  • Secondary touch (optional): core — expose small Session.usage_snapshot() and Session.compact() conveniences so the wrapper handlers stay thin.
  • 3-way surface (CLI + YAML + Python): yes — the commands live in chat; YAML CommandAccessPolicy gates them; Python registers/overrides handlers via CommandRegistry.
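
The two proposed conveniences, Session.usage_snapshot() and Session.compact(), might look roughly like the sketch below. It is an assumption-laden illustration, not the existing core Session class: the `Session` and `UsageSnapshot` types, the `record_llm_call` method (imagined as being called from an after-LLM hook) and the `summarise` callable (a stand-in for one of the core optimizers) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class UsageSnapshot:
    """Immutable view of the usage accumulated so far (hypothetical type)."""
    prompt: int
    completion: int
    cost: float

    @property
    def total(self) -> int:
        return self.prompt + self.completion


@dataclass
class Session:
    """Hypothetical session object; not the PraisonAI Session class."""
    messages: List[str] = field(default_factory=list)
    prompt_tokens: int = 0
    completion_tokens: int = 0
    cost: float = 0.0

    def record_llm_call(self, prompt_tokens: int, completion_tokens: int, cost: float) -> None:
        """Accumulate usage; assumed to be invoked from an after-LLM hook."""
        self.prompt_tokens += prompt_tokens
        self.completion_tokens += completion_tokens
        self.cost += cost

    def usage_snapshot(self) -> UsageSnapshot:
        return UsageSnapshot(self.prompt_tokens, self.completion_tokens, self.cost)

    def compact(self, summarise: Callable[[List[str]], str], keep_last: int = 4) -> int:
        """Replace all but the last `keep_last` messages with one summary.

        Returns the number of messages that were summarised (0 if nothing to do).
        """
        if len(self.messages) <= keep_last:
            return 0
        cut = len(self.messages) - keep_last
        older, recent = self.messages[:cut], self.messages[cut:]
        self.messages = [summarise(older)] + recent
        return len(older)
```

Returning the count from compact() lets a /compress handler report how many messages were summarised without inspecting the transcript itself, which keeps the wrapper handler thin.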

Proposed approach

  • Extension point: add built-in handlers in bots/_commands.py that call existing core APIs; keep them opt-out via config.
  • Minimal API sketch:
@bot.on_command("model")
async def _model(ctx, name):
    ctx.session.set_model(name)          # session-scoped override
    return f"Model switched to {name} for this conversation."

@bot.on_command("usage")
async def _usage(ctx):
    u = ctx.session.usage_snapshot()     # tokens + cost from hooks
    return f"Tokens: {u.total:,} · Est. cost: ${u.cost:.4f}"
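
The sketch above covers /model and /usage only. For /queue, one possible shape is a small per-session queue that runs follow-ups in order once the current turn has finished. The `TurnQueue` class and its `run_turn` callback are hypothetical names introduced for illustration; nothing here is an existing PraisonAI API.

```python
import asyncio
from collections import deque
from typing import Awaitable, Callable, Deque, List


class TurnQueue:
    """Hypothetical per-session queue: follow-ups run after the current turn."""

    def __init__(self, run_turn: Callable[[str], Awaitable[str]]) -> None:
        self._run_turn = run_turn          # coroutine that executes one agent turn
        self._pending: Deque[str] = deque()
        self._busy = False

    def enqueue(self, text: str) -> str:
        """What a /queue <text> handler would call."""
        self._pending.append(text)
        return f"Queued ({len(self._pending)} pending)."

    async def submit(self, text: str) -> List[str]:
        """Run `text` as a turn, then drain anything queued while it was running."""
        if self._busy:
            # A turn is already in flight: fall back to queueing.
            self.enqueue(text)
            return []
        self._busy = True
        replies: List[str] = []
        try:
            replies.append(await self._run_turn(text))
            while self._pending:
                replies.append(await self._run_turn(self._pending.popleft()))
        finally:
            self._busy = False
        return replies
```

Draining inside submit() keeps ordering simple (first in, first out) and means a /stop handler would only need to clear the pending deque to cancel queued follow-ups.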

Resolution sketch

# Before (today)
/model gpt-4o   -> "Unknown command"
/usage          -> "Unknown command"
/compress       -> "Unknown command"

# After (proposed)
/model gpt-4o   -> "Model switched to gpt-4o for this conversation."
/usage          -> "Tokens: 18,204 · Est. cost: $0.0431"
/compress       -> "Summarised 42 earlier messages; ~12k tokens freed."

Severity

High — for a long-running, multi-user gateway these controls are table stakes; their absence forces users to start a new session (/new) and lose context simply to change model or recover from context-window overflow.

Validation

Read bots/_commands.py (_initialize_builtin_commands registers exactly five commands; CommandAccessPolicy already present). Confirmed via search that no /model, /usage, /cost, /compress or /queue handler exists anywhere under bots/. Verified the enabling primitives already exist in core (context/optimizer.py, context/manager.py, hooks/types.py LLM token hooks, per-call model override on Agent.chat).

Metadata

Assignees

No one assigned

Labels

claude (Auto-trigger Claude analysis), enhancement (New feature or request), question (Further information is requested)
