What task are you trying to do?
Manage existing PawWork Automations from inside the conversation. Today the model can create an Automation (automate tool, routing fixed in #1244) but cannot see, pause, resume, or delete one. When a user says "show my scheduled tasks", "pause that reminder", or "delete the daily report job", the model has no tool to act with — the user must leave the conversation and operate the Automations panel manually.
Which area would this change affect?
Model harness, prompts, tools, or session mechanics
What do you do today?
The model either tells the user to open the Automations panel or, worse, a weak model reaches for OS-level cleanup such as crontab -r. Every management action after creation is manual UI work, which breaks PawWork's product boundary: conversation is the core surface for non-technical users.
What would a good result look like?
A new automate_manage tool with a flat, weak-model-friendly schema:
automate_manage {
  action: "list" | "pause" | "resume" | "delete",
  id?: string  // not required for list
}
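One way to read the flat schema above is as a small validator that rejects unknown verbs and enforces the "id is required except for list" rule. This is a minimal sketch, not PawWork's code: the names `ManageArgs`, `ParseResult`, and `parseManageArgs` are hypothetical, and the real tool may well use a schema library instead.

```typescript
// Hypothetical types and validator for illustration; not PawWork's actual code.
type ManageAction = "list" | "pause" | "resume" | "delete";

interface ManageArgs {
  action: ManageAction;
  id?: string; // required for every action except "list"
}

type ParseResult =
  | { ok: true; args: ManageArgs }
  | { ok: false; error: string };

const ACTIONS: readonly ManageAction[] = ["list", "pause", "resume", "delete"];

function parseManageArgs(input: unknown): ParseResult {
  if (typeof input !== "object" || input === null) {
    return { ok: false, error: "arguments must be an object" };
  }
  const raw = input as Record<string, unknown>;

  // Only the four ratified verbs are accepted; "edit" is out of scope in v1.
  const action = ACTIONS.find((a) => a === raw.action);
  if (action === undefined) {
    return { ok: false, error: `action must be one of: ${ACTIONS.join(", ")}` };
  }

  // "list" takes no id; any id passed alongside it is ignored.
  if (action === "list") {
    return { ok: true, args: { action } };
  }

  // pause / resume / delete need an exact, non-empty id.
  if (typeof raw.id !== "string" || raw.id.trim() === "") {
    return {
      ok: false,
      error: `action "${action}" requires an exact id taken from list output`,
    };
  }
  return { ok: true, args: { action, id: raw.id } };
}
```

Returning an error string rather than throwing keeps the failure in a form the model can relay directly to the user.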
Ratified design decisions:
Verb set: list | pause | resume | delete. No edit in v1 — changing cron/prompt is delete-and-recreate (the model can do it) or a panel action. All-scalar fields, zero nesting, consistent with the flat-schema rule in automate.ts.
delete goes to the model, gated by confirmation: delete must pass through the existing ctx.ask permission pipeline (user-visible approve/deny). No silent deletion. pause/resume are reversible and need no gate.
Precise addressing: delete/pause/resume require an exact id taken from list output or the automate creation result — no fuzzy matching, no bulk delete.
Deferred tool: register automate_manage behind tool_info (name + card resident, schema on activation). Creation intent stays routed by the resident automate tool; management is lower-frequency and discoverable via the deferred card. This mirrors Claude Code (CronCreate/Delete/List deferred) and Perplexity (schedule tooling behind a skill load).
Handle the existing Automation.remove(id) 409 path (active run in another process) with a clear tool error the model can relay.
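The gating and addressing rules above can be sketched as a single dispatch function. Everything here is a stand-in under stated assumptions: the in-memory `Map` replaces whatever storage `Automation` really uses, and the synchronous `confirm` callback is a simplification of the `ctx.ask` approve/deny prompt, whose actual signature this issue does not show.

```typescript
// Stand-ins for illustration only; PawWork's real Automation and ctx.ask APIs
// are not shown in this issue.
interface AutomationRecord {
  id: string;
  title: string;
  paused: boolean;
}

type Outcome =
  | { status: "ok"; message: string }
  | { status: "denied"; message: string }
  | { status: "error"; message: string };

function manage(
  store: Map<string, AutomationRecord>,
  action: "pause" | "resume" | "delete",
  id: string,
  confirm: (question: string) => boolean, // simplified stand-in for ctx.ask
): Outcome {
  // Exact-id lookup only: no fuzzy matching on titles, no bulk targets.
  const target = store.get(id);
  if (target === undefined) {
    return {
      status: "error",
      message: `No automation with id "${id}". Call list to get current ids.`,
    };
  }

  if (action === "delete") {
    // Only the irreversible verb goes through the confirmation gate.
    if (!confirm(`Delete automation "${target.title}" (${id})?`)) {
      return { status: "denied", message: "User denied the deletion; nothing was removed." };
    }
    store.delete(id);
    return { status: "ok", message: `Deleted ${id}.` };
  }

  // pause / resume are reversible, so they run without a prompt.
  target.paused = action === "pause";
  return { status: "ok", message: `${action === "pause" ? "Paused" : "Resumed"} ${id}.` };
}
```

The distinct `denied` status lets the model tell a refusal apart from a failure, which matters for the "report the denial and do not retry" behavior.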
What would count as done?
In a conversation, "what scheduled tasks do I have?" → model calls automate_manage {action:"list"} and reports the automations with their schedules and ids.
"Pause the daily report" → model lists if needed, then pause with the exact id; the Automations panel reflects the paused state.
"Delete that reminder" → a permission prompt appears via ctx.ask; the automation is removed only after the user approves; on deny, the model reports the denial and does not retry or fall back to OS tools.
Deleting with a stale/unknown id returns a clean error, not a crash; a 409 (run in flight) is surfaced as a readable message.
Routing check: a weak model asked to remove or pause a scheduled task reaches automate_manage (via the deferred card), and never touches crontab, launchd, or schtasks.
… (test/tool/automate.test.ts).
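For the error-handling criterion, a small translation layer could turn a failed removal into a sentence the model can relay. This is a sketch under a loud assumption: the issue says only that `Automation.remove(id)` has a 409 path, so the idea that the failure arrives as a thrown value with a numeric `status` field is hypothetical, as is the name `toToolError`.

```typescript
// Assumption: the failure carries a numeric `status` property. The issue does
// not show how Automation.remove(id) actually reports its 409 path.
function toToolError(err: unknown, id: string): string {
  const status =
    typeof err === "object" && err !== null && "status" in err
      ? (err as { status: unknown }).status
      : undefined;

  if (status === 409) {
    // A run is in flight in another process; the automation still exists.
    return `Automation ${id} has a run in progress in another process. Try again once it finishes.`;
  }

  // Anything else: keep the underlying detail but wrap it in a readable sentence.
  const detail = err instanceof Error ? err.message : String(err);
  return `Could not delete automation ${id}: ${detail}`;
}
```

A caller would wrap the removal in `try`/`catch` and return the resulting string as the tool error, so nothing escapes as an uncaught exception.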
What should stay out of scope?
edit/update verb (change cron, prompt, title in place): a v2 candidate if delete-and-recreate proves clumsy in practice.
Any Automations panel UI changes.
Bulk operations (delete all, pause all).
Cross-session/global automation administration beyond what Automation.list already exposes.
Which audience does this matter to most?
Both
Extra context
Decision is backed by an adversarially-verified survey of 10 products' model-facing scheduling surfaces (8 researched via source code / leaked prompts / docs, plus firsthand introspection of Claude Code and Codex Desktop):
Giving the model delete is the majority position (7 of 9 decided products): Claude Code, Codex Desktop, OpenClaw, Gemini, Devin, Perplexity Computer, Grok Build. Only ChatGPT Tasks and M365 Copilot withhold it.
Nobody ships delete bare: Claude Code gates it behind harness permissions, Codex Desktop forces suggested_* user-confirmation modes in risky contexts, Perplexity forces a confirm_action before create/update, Grok forbids autonomous cancellation, Gemini requires exact-id addressing via a prior list call. ctx.ask is PawWork's existing equivalent — no new mechanism needed.
Pause is generally not a standalone verb in the industry (toggled via update, or left in UI); PawWork keeps pause/resume as explicit verbs only because there is no edit in v1 and they are the cheapest reversible actions.
ChatGPT is the cautionary tale for withholding delete: its system prompt literally tells the model to send users to the UI ("you'll need to delete one"), creating a capability cliff mid-conversation.
Related: #1244 (automate routing fix, merged as ae981f0).