    User                              Archon              DB
    ────                              ──────              ──
    runs /workflow run X "A" ───────▶ findResumable... → null
                                      dispatch fresh ───▶ run-A row → status='failed'
                                                          (parse-args fails on input "A")
    runs /workflow run X "B" ───────▶ findResumable... → run-A
                                      [X] auto-resume in run-A's worktree
                                          with run-A's persisted state
                                      executeWorkflow(
                                        working_path=thread-run-A,
                                        userMessage="B"
                                          ↑ scripts ignore: they read
                                            $ARTIFACTS_DIR/.X from run-A
                                      ) ────────────────▶ run-A re-executed,
                                                          still on task A
    sees positive report ◀─────────── task-A success report
    "I asked for B"  (still no idea task B was hijacked)
{"level":30,"module":"command-handler","workflow":"ztech-marimo-edit",
  "args":"fortigapminder.marimo.py Remove redundant local tomllib re-imports
          from cells 4, 7, 12 and 13",          ← user's correct args
  "msg":"cmd.workflow_starting"}

{"level":30,"module":"orchestrator-agent",
  "workflowName":"ztech-marimo-edit",
  "resumableRunId":"92d86ea89fd6808c5f6534b4ef34acbc",          ← prior failed run
  "workingPath":"/.archon/.../worktrees/archon/thread-85a590f9",
  "msg":"orchestrator.foreground_resume_detected"}

{"level":30,"module":"workflow.dag-executor",
  "priorCompletedCount":5,
  "msg":"dag.workflow_resume_prepopulated"}          ← old state restored

{"level":50,"module":"workflow.dag-executor","exitCode":1,
  "stderrTail":"ERROR: First argument must be a notebook path ending in .py
                [...] INPUT (arg $1)='Edit the notebook at fortigapminder...'",
                ↑ THE OLD reformulated user_message,
                  not the new args
  "msg":"dag_node_failed"}
Summary
/workflow run X "task B" silently auto-resumes a prior failed run of X in the same chat, executing in the failed run's sub-worktree with the failed run's persisted user_message ("task A"). The new prompt is discarded with no UI or log indication. The user sees a positive completion report on task A and is left wondering why task B never happened. Compounding the problem, /workflow abandon rejects failed runs as "already terminal", so users hit by this cannot easily escape.
[...] (fix: foreground resume for interactive workflows + chat auto-resume), which added findResumableRunByParentConversation with status IN ('failed', 'paused'). The 'failed' clause was scoped to support manual /workflow resume <id>; using it for automatic resume on a fresh /workflow run produces the silent-hijack behavior.
Severity: major (silent data loss / silent intent loss; trust-corroding)
Steps to Reproduce
[...] $ARTIFACTS_DIR/.X files (most non-trivial workflows do this, e.g. parse-args style scripts).
[...] failed in remote_agent_workflow_runs, working_path = .../worktrees/archon/thread-<old-id>/.
[...] user_message (preserved in $ARTIFACTS_DIR/.X files). Step 4's input is never used.
Expected vs Actual
Expected: /workflow run with new args dispatches a fresh run in a fresh worktree with the new args. The prior failed run remains as an audit-trail row but does not steer execution. If the user wants to continue the failed run from where it stopped, they explicitly type /workflow resume <id>.
Actual: Archon finds a failed | paused resumable run for the same (workflow_name, parent_conversation_id), calls executeWorkflow with the failed run's working_path, and the workflow re-reads stale state from disk. The new args travel through the call as userMessage but are discarded by parse-args/script-style early nodes.
User Flow
The [X] marker in the flow diagram is where intent silently disappears.
Environment
Logs
The fresh /workflow run carried the correct path-prefixed args, but the resumed run read the old natural-language reformulation from the .edit-description artifact persisted by run-A.
Impact
[...] $ARTIFACTS_DIR/.X files (most non-trivial DAG workflows).
[...] archon-fix-issue, archon-feature-development, custom user workflows, etc.
[...] /workflow abandon rejected failed runs as terminal; /workflow resume <id> re-ran the same stale state; the only way out was direct DB manipulation (UPDATE remote_agent_workflow_runs SET status='cancelled' WHERE id=...).
Scope
core
core:orchestrator (dispatch logic), core:db (findResumableRunByParentConversation), core:operations (abandonWorkflow)
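
The behavior described under Expected vs Actual can be sketched as a small dispatch policy. This is a minimal illustration, not Archon's code: the types WorkflowRun, Command and Dispatch and the function decideDispatch are hypothetical names invented for the example. It also assumes that paused runs should still auto-resume on a fresh /workflow run, which the report does not state explicitly.

```typescript
// Hypothetical sketch: these types and helpers are illustrative, not Archon's actual code.
type RunStatus = "running" | "paused" | "failed" | "completed" | "cancelled";

interface WorkflowRun {
  id: string;
  status: RunStatus;
  workingPath: string;
  userMessage: string;
}

// How the user asked for the workflow: a fresh `/workflow run`,
// or an explicit `/workflow resume <id>`.
type Command =
  | { kind: "run"; userMessage: string }
  | { kind: "resume"; runId: string };

type Dispatch =
  | { kind: "fresh"; userMessage: string }
  | { kind: "resume"; run: WorkflowRun };

function decideDispatch(command: Command, priorRuns: WorkflowRun[]): Dispatch {
  if (command.kind === "resume") {
    // Explicit resume: the user named the run, so failed and paused runs are both eligible.
    const runId = command.runId;
    const run = priorRuns.find(
      (r) => r.id === runId && (r.status === "failed" || r.status === "paused"),
    );
    if (run === undefined) {
      throw new Error(`run ${runId} is not resumable`);
    }
    return { kind: "resume", run };
  }

  // Fresh run: never adopt a failed run, so the new prompt cannot be swallowed.
  // Assumption: paused runs still auto-resume, as interactive workflows rely on that.
  const paused = priorRuns.find((r) => r.status === "paused");
  if (paused !== undefined) {
    return { kind: "resume", run: paused };
  }
  return { kind: "fresh", userMessage: command.userMessage };
}

const runA: WorkflowRun = {
  id: "run-A",
  status: "failed",
  workingPath: "thread-run-A",
  userMessage: "task A",
};

// A fresh run for task B ignores the failed run-A and keeps the new prompt.
console.log(decideDispatch({ kind: "run", userMessage: "task B" }, [runA]));
// Only an explicit resume picks run-A back up.
console.log(decideDispatch({ kind: "resume", runId: "run-A" }, [runA]));
```

Keeping the status filter next to the command kind makes the intent explicit: the 'failed' status is only ever matched when the user supplies a run id, so the failed run stays in the table as an audit row without steering later runs.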