feat: add agent-friendly Modly CLI#144
Conversation
|
I think the idea is useful: Modly does need an agent-friendly, JSON-first way to automate common operations without driving the UI manually. After reviewing this against the current dev branch and the existing external MCP/CLI direction, my main concern is not the usefulness of the tool, but the contract it creates. Right now this PR adds a new Python CLI surface that talks mostly to existing REST endpoints such as:
That is practical, but it also risks becoming a third parallel automation path next to:
I would suggest keeping the useful CLI ergonomics, but aligning the contract before treating this as an official automation interface. Concretely, I think the CLI should be shaped around canonical primitives: health
model
workflow-run
process-run
capabilityAnd keep the current /generate/* flow as legacy compatibility rather than the primary path. 1. Prefer workflow runs for generationInstead of making generate primarily wrap: POST /generate/from-image
GET /generate/status/{job_id}prefer: POST /workflow-runs/from-image
GET /workflow-runs/{run_id}
POST /workflow-runs/{run_id}/cancelThe JSON output should include recovery metadata so agents can resume/poll safely: {
"ok": true,
"run": {
"kind": "workflowRun",
"id": "..."
},
"meta": {
"status_command": "workflow-run status ...",
"legacy": false
}
}generate can still exist as a friendly alias, but it should be clearly documented as either canonical-through-workflow-run or legacy if it falls back to /generate/*. 2. Use process/capability discovery instead of heuristicsThe current approach appears to infer some behavior from model names / labels, especially around texture/refine-style flows. For agents, that is fragile. The CLI should avoid guessing hidden capabilities from strings and instead use explicit discovery when available, e.g.: GET /automation/capabilities
POST /process-runs
GET /process-runs/{run_id}If a process is not available through a canonical contract, it should fail closed: {
"ok": false,
"code": "UNSUPPORTED_PROCESS",
"message": "This process is not available through the canonical process-run contract."
}3. Validate model IDs canonicallyAny model_id should be validated against: GET /model/allIf --model auto remains, it should use verified defaults/capability metadata, not string matching or first available model behavior. 4. Clarify readiness boundariesGET /health should happen before business operations. Also, starting FastAPI through serve / ensure-server should be clearly marked as developer-only. A FastAPI-only process does not imply Electron/Desktop bridge readiness, scene operations, extension process execution, or full workflow support. I would suggest moving these commands under something like: dev serve-api
dev ensure-serveror documenting them explicitly as non-canonical dev helpers. 5. Keep ComfyUI out of the canonical Modly agent contractThe ComfyUI integration may be useful, but it feels like an external orchestration helper rather than part of the core Modly automation contract. I would suggest moving it under: experimental comfy-imageor splitting it from the main supported CLI surface. 6. Align with existing MCP directionSettings > Agent already points users toward modly-cli-mcp / modly-mcp for external agents. To avoid confusing users and future maintainers, this CLI should either:
The important part is avoiding a second canonical-looking automation contract with different semantics. 7. PR hygieneSince the feature is Python-only, the package-lock.json changes look unrelated unless there is a specific reason for them. I would recommend removing that noise from this PR. Overall: I like the goal and I think the CLI can be valuable. I would just prefer to land it with a clearer contract boundary:
That would preserve the useful agent ergonomics while keeping Modly automation aligned with Agent Chat, MCP, and future workflow/process execution. |
Adds a simple CLI tool so AI agents can have an easier time interacting with modly. Includes a SKILL.md file as well.