Gateway bots lack a runtime control surface — no in-chat /model, /usage, /compress or /queue commands

## Summary
PraisonAI gateway bots expose only five built-in slash commands (`/help`, `/status`, `/new`, `/stop`, `/whoami`). Users running a long-lived bot cannot switch the model mid-conversation, see their token/cost usage, compress an over-long conversation to stay within the context window, or queue follow-up messages — all common, high-value controls for a production chat bot. The underlying capabilities largely exist in core (per-call model override, context optimisation/compaction, usage tracking via hooks); they are simply not wired to an in-chat control surface.

## Current behaviour
```python
# src/praisonai/praisonai/bots/_commands.py — _initialize_builtin_commands()
self.register("help",   {"description": "Show help message", "builtin": True})
self.register("status", {"description": "Show bot status", "builtin": True})
self.register("new",    {"description": "Reset conversation session", "builtin": True})
self.register("stop",   {"description": "Cancel current agent task", "builtin": True})
self.register("whoami", {"description": "Show your user info and permissions", "builtin": True})
```
Custom commands can be registered programmatically via `CommandRegistry.register(...)`, and access is already gated by `CommandAccessPolicy` — but there are no built-ins for model switching, usage/cost, context compression or message queuing. The enabling primitives already exist in core:
- context compaction/optimisation: `praisonaiagents/context/optimizer.py` (`SummarizeOptimizer`, `LLMSummarizeOptimizer`), `context/manager.py`
- token/cost tracking: `BEFORE_LLM`/`AFTER_LLM` hooks in `praisonaiagents/hooks/types.py`
- per-call model override on `Agent.chat`

## Desired behaviour
A standard, opt-in set of built-in gateway commands, each gated by the existing `CommandAccessPolicy`:
- `/model <name>` — switch the LLM for this session (session-scoped override)
- `/usage` — show tokens used and estimated cost for the session
- `/compress` — summarise earlier turns to free context window
- `/queue <text>` — enqueue a follow-up to run after the current turn
- (and `/retry`, `/undo` where transcript state allows)

## Layer placement
- **Primary layer:** wrapper — these are gateway/bot control-surface commands wired into the `CommandRegistry`.
- **Why not core:** the underlying operations (compaction, usage, model override) belong in core and already exist; the *chat command surface* is a wrapper concern.
- **Why not tools:** these are operator controls over the session, not agent-callable task integrations.
- **Why not plugins:** they are first-class, always-available bot UX, not optional cross-cutting policy.
- **Secondary touch (optional):** core — expose small `Session.usage_snapshot()` and `Session.compact()` conveniences so the wrapper handlers stay thin.
- **3-way surface (CLI + YAML + Python):** yes — the commands live in chat; YAML `CommandAccessPolicy` gates them; Python registers/overrides handlers via `CommandRegistry`.

## Proposed approach
- Extension point: add built-in handlers in `bots/_commands.py` that call existing core APIs; keep them opt-out via config.
- Minimal API sketch:
```python
@bot.on_command("model")
async def _model(ctx, name):
    ctx.session.set_model(name)          # session-scoped override
    return f"Model switched to {name} for this conversation."

@bot.on_command("usage")
async def _usage(ctx):
    u = ctx.session.usage_snapshot()     # tokens + cost from hooks
    return f"Tokens: {u.total} · Est. cost: ${u.cost:.4f}"
```

## Resolution sketch
```text
# Before (today)
/model gpt-4o   -> "Unknown command"
/usage          -> "Unknown command"
/compress       -> "Unknown command"

# After (proposed)
/model gpt-4o   -> "Model switched to gpt-4o for this conversation."
/usage          -> "Tokens: 18,204 · Est. cost: $0.0431"
/compress       -> "Summarised 42 earlier messages; ~12k tokens freed."
```

## Severity
**High** — for a long-running, multi-user gateway these controls are table stakes; their absence forces users to start a new session (`/new`) and lose context simply to change model or recover from context-window overflow.

## Validation
Read `bots/_commands.py` (`_initialize_builtin_commands` registers exactly five commands; `CommandAccessPolicy` already present). Confirmed via search that no `/model`, `/usage`, `/cost`, `/compress` or `/queue` handler exists anywhere under `bots/`. Verified the enabling primitives already exist in core (`context/optimizer.py`, `context/manager.py`, `hooks/types.py` LLM token hooks, per-call model override on `Agent.chat`).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Gateway bots lack a runtime control surface — no in-chat /model, /usage, /compress or /queue commands #2158

Summary

Current behaviour

Desired behaviour

Layer placement

Proposed approach

Resolution sketch

Severity

Validation

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Uh oh!

Gateway bots lack a runtime control surface — no in-chat /model, /usage, /compress or /queue commands #2158

Description

Summary

Current behaviour

Desired behaviour

Layer placement

Proposed approach

Resolution sketch

Severity

Validation

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions