/workflow run silently auto-resumes failed runs with stale args, hijacking  fresh requests

## Summary

- What broke: `/workflow run X "task B"` silently auto-resumes a prior failed run of `X` in the same chat, executing in the failed run's sub-worktree with the failed run's persisted `user_message` ("task A"). The new prompt is discarded with no UI/log indication. The user sees a positive completion report on task A and is confused why task B never happened. Compounding: `/workflow abandon` rejects failed runs as "already terminal", so users hit by this cannot easily escape.
- When it started (if known): introduced in PR #914 (`fix: foreground resume for interactive workflows + chat auto-resume`) which added `findResumableRunByParentConversation` with `status IN ('failed', 'paused')`. The `'failed'` clause was scoped to support manual `/workflow resume <id>`; using it for *automatic* resume on a fresh `/workflow run` produces the silent-hijack behavior.
- Severity: `major` (silent data loss / silent intent loss; trust-corroding)

## Steps to Reproduce

1. Pick any workflow whose first node materializes the user_message into `$ARTIFACTS_DIR/.X` files (most non-trivial workflows do this — e.g. `parse-args` style scripts).
2. Run it with input that fails an early step:
   ```
   /workflow run my-workflow "input that fails parse"
   ```
3. Observe: run is `failed` in `remote_agent_workflow_runs`, `working_path = .../worktrees/archon/thread-<old-id>/`.
4. In the **same chat conversation**, run it with new input:
   ```
   /workflow run my-workflow "completely different task"
   ```
5. Observe in server logs:
   ```
   module=command-handler   args="completely different task"   ← what the user typed
   module=orchestrator-agent msg=orchestrator.foreground_resume_detected
                             resumableRunId=<old-id>
                             workingPath=…/thread-<old-id>/
   ```
6. The workflow runs again, in the **same** sub-worktree, with the **previous run's** `user_message` (preserved in `$ARTIFACTS_DIR/.X` files). Step 4's input is never used.

## Expected vs Actual

- **Expected**: a fresh `/workflow run` with new args dispatches a fresh run in a fresh worktree with the new args. The prior failed run remains as an audit-trail row but does not steer execution. If the user wants to continue the failed run from where it stopped, they explicitly type `/workflow resume <id>`.
- **Actual**: the orchestrator silently picks up any `failed | paused` resumable run for the same `(workflow_name, parent_conversation_id)`, calls `executeWorkflow` with the failed run's `working_path`, and the workflow re-reads stale state from disk. The new args travel through the call as `userMessage` but are discarded by parse-args/script-style early nodes.

## User Flow

```
User                              Archon                              DB
────                              ──────                              ──
runs /workflow run X "A" ───────▶ findResumable... → null
                                  dispatch fresh                  ───▶ run-A row → status='failed'
                                  (parse-args fails on input "A")

runs /workflow run X "B" ───────▶ findResumable... → run-A
                                  [X] auto-resume in run-A's worktree
                                      with run-A's persisted state
                                  executeWorkflow(
                                      working_path=thread-run-A,
                                      userMessage="B"
                                      ↑ scripts ignore: they read
                                        $ARTIFACTS_DIR/.X from run-A
                                  )                              ───▶ run-A re-executed,
                                                                       still on task A
sees positive report ◀────────── task-A success report
   "I asked for B"                (still no idea task B was hijacked)
```

The `[X]` is where intent silently disappears.

## Environment

- Platform: Web (orchestrator agent path)
- Database: SQLite (PostgreSQL has the same SQL, same behavior)
- Running in worktree? **Yes** (workflow sub-worktrees)
- OS: macOS host with Linux container; not OS-specific

## Logs

```
{"level":30,"module":"command-handler","workflow":"ztech-marimo-edit",
 "args":"fortigapminder.marimo.py Remove redundant local tomllib re-imports
         from cells 4, 7, 12 and 13",                       ← user's correct args
 "msg":"cmd.workflow_starting"}

{"level":30,"module":"orchestrator-agent",
 "workflowName":"ztech-marimo-edit",
 "resumableRunId":"92d86ea89fd6808c5f6534b4ef34acbc",       ← prior failed run
 "workingPath":"/.archon/.../worktrees/archon/thread-85a590f9",
 "msg":"orchestrator.foreground_resume_detected"}

{"level":30,"module":"workflow.dag-executor",
 "priorCompletedCount":5,
 "msg":"dag.workflow_resume_prepopulated"}                  ← old state restored

{"level":50,"module":"workflow.dag-executor","exitCode":1,
 "stderrTail":"ERROR: First argument must be a notebook path ending in .py
              [...] INPUT (arg $1)='Edit the notebook at fortigapminder...'",
                                          ↑ THE OLD reformulated user_message,
                                            not the new args
 "msg":"dag_node_failed"}
```

The fresh `/workflow run` typed the correct path-prefixed args, but the resumed run reads the old natural-language reformulation from `.edit-description` artifact persisted by run-A.

## Impact

- Affected workflows/commands: any workflow with a first node that materializes user_message into `$ARTIFACTS_DIR/.X` files (most non-trivial DAG workflows). `archon-fix-issue`, `archon-feature-development`, custom user workflows, etc.
- Reproduction rate: **Always** — deterministic given the SQL match (failed run + same workflow + same conversation).
- Workaround available: pre-this-PR there was none. `/workflow abandon` rejected failed runs as terminal; `/workflow resume <id>` re-ran the same stale state; the only way out was direct DB manipulation (`UPDATE remote_agent_workflow_runs SET status='cancelled' WHERE id=...`).
- Data loss risk: **Yes** — silent intent loss. The user's request is discarded with no log/UI indication.

## Scope

- Package(s): `core`
- Module: `core:orchestrator` (dispatch logic), `core:db` (`findResumableRunByParentConversation`), `core:operations` (`abandonWorkflow`)


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

/workflow run silently auto-resumes failed runs with stale args, hijacking fresh requests #1549

Summary

Steps to Reproduce

Expected vs Actual

User Flow

Environment

Logs

Impact

Scope

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

/workflow run silently auto-resumes failed runs with stale args, hijacking fresh requests #1549

Description

Summary

Steps to Reproduce

Expected vs Actual

User Flow

Environment

Logs

Impact

Scope

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions