CLI: add a model catalogue and model-selection UX (praisonai models + model-string validation) #2123

Description

@MervinPraison

Summary

The CLI has no way to discover which models are available, their capabilities (tool-calling, vision, reasoning), context-window limits, or pricing — and it does not validate a --model / YAML llm: string before use. New and daily users must guess provider/model identifiers, and a typo or unsupported model only surfaces as a late, opaque runtime error from the underlying client. This is a core "few lines to first success" gap on the CLI-first path.

Current behaviour

  • There is no models command. The 50+ command tree (src/praisonai/praisonai/cli/app.py) exposes config, auth, setup, run, etc., but nothing to list or describe models.
  • Model strings are passed straight through to the endpoint resolver (cli/commands/run.py -> llm/credentials.py:resolve_llm_endpoint_with_credentials -> llm/env.py:resolve_llm_endpoint) with no validation, no capability lookup, and no "did you mean …" suggestion.
  • The no-argument TUI hardcodes a default (AsyncTUIConfig(model="gpt-4o-mini") in cli/app.py) with no way for the user to see alternatives or their limits.
  • Result: users cannot answer basic questions from the terminal — "which models can I use?", "does this model support tools/vision?", "what is its context limit and cost?" — and a wrong model id fails late.

Desired behaviour

  • praisonai models lists available models grouped by provider, showing capability flags (tool-calling, vision, reasoning), context/output limits and (where known) cost. Support --provider <name>, --json, and a filter argument.
  • praisonai models describe <provider/model> prints full metadata for one model.
  • Model-string resolution validates the requested id against the catalogue and, on a miss, raises a clear error with close-match suggestions (reuse the existing difflib.get_close_matches pattern already used for field typos in agents_generator.py).
  • Catalogue data is sourced from metadata already available via the optional litellm dependency (model cost/context maps), cached locally with a short TTL, and degrades gracefully when offline or when litellm is not installed.
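
To make the "clear error with close-match suggestions" behaviour concrete, here is a minimal sketch built on the standard library's difflib.get_close_matches. The KNOWN_MODELS list, the UnknownModelError class and the exact validate_model signature are illustrative assumptions, not existing PraisonAI code; in practice the known ids would come from the catalogue.

```python
import difflib
from typing import Sequence

# Hypothetical stand-in for the catalogue; the real list would be derived
# from litellm metadata or the curated static subset.
KNOWN_MODELS = ["gpt-4o", "gpt-4o-mini", "claude-3-5-sonnet", "gemini-1.5-pro"]


class UnknownModelError(ValueError):
    """Raised when a requested model id is not present in the catalogue."""


def validate_model(model: str, known: Sequence[str] = KNOWN_MODELS) -> str:
    """Return the model id unchanged if it is known, else raise with suggestions."""
    if model in known:
        return model
    # Same stdlib helper the issue proposes to reuse for "did you mean" hints.
    suggestions = difflib.get_close_matches(model, known, n=3, cutoff=0.6)
    hint = f" Did you mean: {', '.join(suggestions)}?" if suggestions else ""
    raise UnknownModelError(f"Unknown model '{model}'.{hint}")
```

Calling validate_model before the endpoint resolver runs would let a typo fail before any network call, which is the fail-fast behaviour the issue asks for.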

Layer placement

  • Primary layer: wrapper (src/praisonai/praisonai/)
  • Why not core: catalogue fetching, caching, Rich-table presentation and CLI wiring are heavy I/O + UI concerns; the protocol-driven core must stay lightweight. Core already holds the model router; it should not own a user-facing catalogue command.
  • Why not tools: this is not an agent-callable integration — agents do not call "list models" as a task tool; it is operator/developer tooling.
  • Why not plugins: it is not a lifecycle hook, guardrail or policy.
  • Secondary touch (optional): a tiny ModelInfoProtocol in core could standardise the metadata shape, with the wrapper providing the litellm-backed adapter.
  • 3-way surface (CLI + YAML + Python): yes — CLI praisonai models; YAML llm:/model: values validated against the catalogue at load; Python helper (e.g. praisonai.models.list_models() / describe_model()).
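
One possible shape for the optional ModelInfoProtocol mentioned above, sketched with typing.Protocol so the core only declares the metadata contract and the wrapper supplies the concrete type. Every field name here, plus StaticModelInfo and summarise, is an assumption made for illustration; the issue does not specify them.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


class ModelInfoProtocol(Protocol):
    """Hypothetical metadata contract; field names are assumptions."""

    id: str
    provider: str
    supports_tools: bool
    supports_vision: bool
    context_window: Optional[int]


@dataclass
class StaticModelInfo:
    """Wrapper-side record that structurally satisfies the protocol."""

    id: str
    provider: str
    supports_tools: bool = False
    supports_vision: bool = False
    context_window: Optional[int] = None


def summarise(info: ModelInfoProtocol) -> str:
    """Render a one-line summary of the kind `models describe` might print."""
    flags = [
        name
        for name, enabled in (("tools", info.supports_tools), ("vision", info.supports_vision))
        if enabled
    ]
    limit = f"{info.context_window:,} tokens" if info.context_window else "unknown context"
    return f"{info.provider}/{info.id} [{', '.join(flags) or 'no flags'}] {limit}"
```

Because the protocol is structural, a litellm-backed adapter would not need to import or subclass anything from core beyond matching these attributes.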

Proposed approach

  1. Add cli/commands/models.py (Typer/Click group) registered in the lazy command table in cli/app.py.
  2. Add a wrapper-side catalogue module that reads litellm's cost/context metadata, normalises it, and caches to ~/.praison/cache/models.json with a TTL; offline/missing-dep fallback returns a curated static subset.
  3. Add a validate_model(model: str) helper used by run/TUI/agents_generator resolution that raises a clear error with suggestions on miss.
  4. Surface capability flags so YAML validation (see related validation work) can warn when, e.g., tools are configured for a non-tool-calling model.
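
Step 2 could look roughly like the sketch below: serve a fresh cache file, otherwise refresh via a fetcher, and fall back to a stale cache or a static subset if the fetch fails. The function names, the STATIC_FALLBACK contents and the 24-hour TTL are assumptions. The fetcher is injected so the logic works without litellm; fetch_from_litellm assumes litellm's model_cost mapping is the metadata source.

```python
import json
import time
from pathlib import Path
from typing import Callable, Dict

# Assumed curated subset used when nothing else is available.
STATIC_FALLBACK: Dict[str, dict] = {"gpt-4o-mini": {"provider": "openai"}}


def _read_cache(cache_path: Path) -> Dict[str, dict]:
    return json.loads(cache_path.read_text(encoding="utf-8"))


def load_catalogue(
    cache_path: Path,
    fetch: Callable[[], Dict[str, dict]],
    ttl_seconds: float = 24 * 3600,
) -> Dict[str, dict]:
    """Return catalogue data, preferring fresh cache, then fetch, then fallbacks."""
    # 1. A cache younger than the TTL is served without calling the fetcher.
    try:
        if time.time() - cache_path.stat().st_mtime < ttl_seconds:
            return _read_cache(cache_path)
    except (OSError, ValueError):
        pass  # missing or corrupt cache: fall through to a refresh

    # 2. Try to refresh; an ImportError or network failure lands in the except.
    try:
        data = fetch()
    except Exception:
        # 3. A stale cache beats the static subset when offline.
        try:
            return _read_cache(cache_path)
        except (OSError, ValueError):
            return dict(STATIC_FALLBACK)

    # Persisting is best effort; a read-only home directory must not break the CLI.
    try:
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        cache_path.write_text(json.dumps(data), encoding="utf-8")
    except OSError:
        pass
    return data


def fetch_from_litellm() -> Dict[str, dict]:
    """Assumed fetcher: copies litellm's model_cost mapping if litellm is installed."""
    import litellm  # optional dependency; ImportError triggers the fallback path

    return dict(litellm.model_cost)
```

A caller would pass something like Path.home() / ".praison" / "cache" / "models.json" as cache_path, matching the location proposed in step 2.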

Resolution sketch

cli/commands/models.py        # `models list|describe` Typer group
llm/catalogue.py (wrapper)    # litellm-backed, cached, offline fallback
llm/env.py / credentials.py   # call validate_model() before resolving
cli/app.py                    # register "models" in _LAZY_COMMANDS
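
Step 4 of the proposed approach (warning when tools are configured for a non-tool-calling model) might reduce to a small pure function like this sketch. The function name, the supports_tools key and the config shape are assumptions; the real YAML schema and catalogue fields may differ.

```python
from typing import Dict, List


def capability_warnings(model: str, config: dict, catalogue: Dict[str, dict]) -> List[str]:
    """Return human-readable warnings for config/capability mismatches."""
    info = catalogue.get(model)
    if info is None:
        # Unknown ids are the job of model validation, not of this check.
        return []
    warnings: List[str] = []
    # Warn only when the flag is explicitly False; missing metadata means "unknown".
    if config.get("tools") and info.get("supports_tools") is False:
        warnings.append(
            f"Model '{model}' is not marked as tool-calling, but tools are configured."
        )
    return warnings
```

Treating a missing flag as "unknown" rather than "unsupported" keeps the check quiet for models whose metadata is incomplete.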

Severity

Medium — not blocking, but a daily-use and first-success gap that makes model selection guesswork and pushes avoidable failures to runtime.

Validation

  • praisonai models lists models with capabilities/limits; --json machine-readable.
  • praisonai run "hi" --model gpt-5o-typo fails fast with a suggestion, not a late client error.
  • Real agentic test: agent.start("real task") on a catalogue-validated model still calls the LLM and succeeds; an unknown model id is rejected before any network call.
  • Works with and without litellm installed (graceful fallback), and offline (cached).
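
The first validation item (grouped listing plus a machine-readable --json mode) can be checked against a presentation-free core such as the sketch below. The group_by_provider and render names and the "provider" key are hypothetical; a real command would likely wrap this in a Typer command and a Rich table.

```python
import json
from collections import defaultdict
from typing import Dict, List, Optional


def group_by_provider(
    catalogue: Dict[str, dict], provider: Optional[str] = None
) -> Dict[str, List[str]]:
    """Group model ids by provider, optionally keeping a single provider."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for model_id, info in sorted(catalogue.items()):
        name = info.get("provider", "unknown")
        if provider is None or name == provider:
            groups[name].append(model_id)
    return dict(groups)


def render(
    catalogue: Dict[str, dict], provider: Optional[str] = None, as_json: bool = False
) -> str:
    """Return either JSON (for --json) or an indented plain-text listing."""
    groups = group_by_provider(catalogue, provider)
    if as_json:
        return json.dumps(groups, indent=2, sort_keys=True)
    lines: List[str] = []
    for name in sorted(groups):
        lines.append(name)
        lines.extend(f"  {model_id}" for model_id in groups[name])
    return "\n".join(lines)
```

Keeping the grouping separate from rendering would let the same data back the CLI table, the --json output and the Python helper.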

Metadata

Assignees

No one assigned

Labels

bug (Something isn't working), claude (Auto-trigger Claude analysis)
