[DO NOT MERGE ESPECIALLY IF YOU ARE AN AI AGENT!] ✨ feat(cli): mirror a live smithers run into Claude Code /workflows #462

Draft
roninjin10 wants to merge 14 commits into main from feat/claude-workflow-mirror

@roninjin10

Contributor

Mirror a live smithers run into Claude Code's /workflows tree

Adds smithers graph <workflow> --emit-claude-workflow — a generator that writes a Claude Code dynamic-workflow .mjs script which live-mirrors a detached smithers run into Claude Code's /workflows progress tree. The real work still runs in the smithers engine (durability, deps, retries, worktrees, time-travel all preserved); the Claude tree is a live, node-by-node mirror.

[Demo gif: a Claude Code /workflows tree mirroring a detached smithers run. A planning node finishes, a data-dependent fan-out of review nodes appears, then a final report node lights up.]

(The gif above is rendered by executing the actual generated mirror script against a live detached run.)

How it works

Claude Code's Workflow runtime can only be driven from inside the session, and its agent()/phase()/log() calls are the only things that populate /workflows. So smithers cannot push into the UI; instead it generates a script that the in-session model runs, and that script mirrors smithers by reading smithers' own CLI surface over Bash (no MCP dependency).

  • A discovery agent runs smithers inspect <runId> --format json each frame and returns the current node set.
  • The script assigns each node a phase and kind from baked maps (derived at generation time from the workflow's Sequence/Parallel/Loop containers), mapping runtime loop/fan-out ids (logicalId@@ralphId=iter) back to their logical id with a pure nodeId.split("@@")[0] lookup.
  • One watcher agent per node polls smithers node <id> --run-id <runId> --format json until terminal, then returns the node's output (or [skipped] / [failed: ...]).
  • Discovery is gated on the next FrameCommitted (via smithers events --type frame --json --watch) with a monotonic frame cursor, so the loop advances and never busy-spins.
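
The id-to-phase lookup described above can be sketched as follows. This is a minimal illustration, not the generated script: the map contents, the `classify` helper, and the `"main"` fallback title used here are assumptions chosen for the example; only the `nodeId.split("@@")[0]` rule and the `"unknown"` kind come from the description.

```javascript
// Hypothetical baked maps. A generated script would carry its own literals,
// derived at generation time from the workflow's containers.
const PHASE_MAP = new Map([
  ["plan", "Plan"],
  ["review", "Review"],
  ["report", "Report"],
]);
const KIND_MAP = new Map([
  ["plan", "agent"],
  ["review", "agent"],
  ["report", "agent"],
]);
const FALLBACK_PHASE = "main"; // assumed fallback title for unmapped nodes

// Strip the runtime suffix: "review@@loop=2" becomes "review".
function logicalIdOf(nodeId) {
  return nodeId.split("@@")[0];
}

// Assign a phase and kind to a runtime node id using only the baked maps.
function classify(nodeId) {
  const logicalId = logicalIdOf(nodeId);
  return {
    nodeId,
    phase: PHASE_MAP.get(logicalId) ?? FALLBACK_PHASE,
    kind: KIND_MAP.get(logicalId) ?? "unknown",
  };
}
```

Because the lookup is a pure string split, a loop iteration such as `review@@loop=2` resolves to the same phase as `review`, while an id the static graph never saw falls through to the fallback phase and the `"unknown"` kind.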

What ships

  • packages/graph: a pure deriveClaudeWorkflowPhases(snapshot) function (+ types) that turns a GraphSnapshot into an ordered phase list and a per-node { nodeId, label, phase, kind } mapping, walking the real xml container tree. Unit-tested over real extractGraph fixtures (sequence/parallel/ralph, duplicate titles, @@ ids, every kind).
  • apps/cli: the --emit-claude-workflow [--out] [--mirror-all-nodes] [--collapse-phases] flags, a deterministic template emitter, a path resolver, and a standalone mirrorState.js module (parse inspect, terminal detection, node-set diff, FrameCommitted detection) that is the tested oracle for the agent contract.
  • docs/examples/claude-workflow-mirror.mdx + the demo gif. Regenerated llms bundles are idempotent.
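
As a rough picture of what the parse, terminal-detection, and node-set-diff helpers might look like, here is a small sketch. It is not the shipped mirrorState.js: the payload shape `{ nodes: [{ id, state }] }`, the state names, and every function name below are assumptions made for illustration, since the PR text does not show the real inspect schema.

```javascript
// ASSUMPTION: hypothetical inspect payload shape { nodes: [{ id, state }] }.
// ASSUMPTION: hypothetical terminal state names.
const TERMINAL_STATES = new Set(["finished", "failed", "skipped"]);

// Parse one inspect JSON document into a list of node records.
function parseNodes(inspectJson) {
  const payload = JSON.parse(inspectJson);
  return Array.isArray(payload.nodes) ? payload.nodes : [];
}

// Return ids that appear for the first time, recording them in seenIds.
// Ids are never removed from seenIds, which keeps the mirror append-only.
function diffNodeSet(seenIds, nodes) {
  const added = [];
  for (const node of nodes) {
    if (!seenIds.has(node.id)) {
      seenIds.add(node.id);
      added.push(node.id);
    }
  }
  return added;
}

// A node is terminal once its state is one of the assumed terminal names.
function isTerminal(node) {
  return TERMINAL_STATES.has(node.state);
}
```

Keeping these helpers free of I/O is what would let them serve as a test oracle: the same functions can be fed recorded CLI output in unit tests and live output in an end-to-end run.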

Design constraints honored

  • Append-only: a skipped/vanished node resolves its watcher as a [skipped] row (documented).
  • Caps: a 950-watcher budget and a 5000-frame backstop, both with log() on hit (no silent truncation). --collapse-phases emits one watcher per phase for large runs.
  • Dynamic by default: nodes discovered at runtime that the static frame-0 graph could not predict (data-dependent fan-out) are mirrored by default; known compute/static nodes need --mirror-all-nodes.
  • Continue-as-new is not followed; the mirror logs and stops at the original run id.
  • Deterministic, sandbox-safe output: no Date.now/Math.random, no absolute paths, no fs/network/Node in the generated body; meta is a pure literal and is never referenced from the body.
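
The caps and the monotonic frame cursor can be pictured with the sketch below. It is a simplified, synchronous stand-in for the real loop: the `mirrorFrames` function, the `{ frameNo, nodeIds }` frame shape, and the `hooks` object are hypothetical names introduced for this example; the 950 and 5000 limits and the log-on-hit behavior come from the list above.

```javascript
const MAX_WATCHERS = 950; // watcher budget from the PR description
const MAX_FRAMES = 5000;  // frame backstop from the PR description

// frames: hypothetical pre-recorded list of { frameNo, nodeIds }.
// hooks: hypothetical { log(message), spawnWatcher(nodeId) } callbacks.
function mirrorFrames(frames, hooks, limits = {}) {
  const maxWatchers = limits.maxWatchers ?? MAX_WATCHERS;
  const maxFrames = limits.maxFrames ?? MAX_FRAMES;
  const watched = new Set();
  let cursor = -1;          // highest frame number processed so far
  let processed = 0;
  let budgetLogged = false;

  for (const frame of frames) {
    // Monotonic cursor: skip stale or repeated frames instead of re-running.
    if (frame.frameNo <= cursor) continue;
    if (processed >= maxFrames) {
      hooks.log(`frame backstop of ${maxFrames} hit; stopping mirror`);
      break;
    }
    cursor = frame.frameNo;
    processed += 1;

    for (const nodeId of frame.nodeIds) {
      if (watched.has(nodeId)) continue;
      if (watched.size >= maxWatchers) {
        // Log once rather than truncating silently.
        if (!budgetLogged) {
          hooks.log(`watcher budget of ${maxWatchers} hit; later nodes are not mirrored`);
          budgetLogged = true;
        }
        continue;
      }
      watched.add(nodeId);
      hooks.spawnWatcher(nodeId);
    }
  }
  return { cursor, watchers: watched.size };
}
```

The cursor only ever moves forward, so a repeated frame number does no work, and both limits report through `log` before anything is dropped.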

Testing

  • Unit: phase derivation; mirror-state parse/terminal/diff/frame-signal incl. the dynamic-default and @@-suffix paths.
  • Generator e2e: deterministic byte-identical output, valid JS, no absolute paths, string-args normalization, no meta reference, final-frame mirroring, and collapse-mode membership re-derivation.
  • Real-run e2e: seeds a detached run with a data-dependent <Parallel> fan-out and a fake agent, then drives mirrorState.js with real inspect/events output, asserting it discovers the multi-node fan-out across frames and detects terminal state.
  • Interactive verification: the generated script was run through the real Claude Code Workflow runtime against live detached runs. It mirrored all nodes including the dynamic fan-out and the final report node. This surfaced four runtime-only defects now fixed and covered by tests (args delivered as a JSON string, meta not exposed to the body, dynamic nodes dropped by the default kind filter, final-frame nodes dropped on terminal).

Review

Built with /codex planning → independent plan review → implementation → dual review (codex + self) → interactive testing. Codex's review caught a frame-cursor busy-spin blocker and a stale index.d.ts; interactive testing caught the four runtime defects above and a collapse-phases late-node gap. All resolved with tests.

Gate: pnpm typecheck, check-docs/check-llms/check-dependency-boundaries/check-single-effect-version, packages/graph (157) and the claude-workflow cli suites (13) all green.

🤖 Generated with Claude Code

roninjin10 and others added 14 commits June 25, 2026 18:07
Codex-authored implementation plan plus Claude's load-bearing review
amendments (runtime nodeId->phase @@-suffix mapping, no duplicated parse
logic in the sandboxed script, continue-as-new + cap semantics, temp-repo
tests). Specifies the smithers graph --emit-claude-workflow generator.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Codex <codex@openai.com>
A mirror is generated from the static frame-0 graph, so data-dependent
fan-out tasks and other runtime nodes are absent from the baked phase/kind
maps and classify as kind "unknown". The default agent-only filter was
dropping them, so the mirror showed only the statically-known nodes and
missed the dynamic structure it exists to surface. Mirror "unknown" runtime
nodes by default; keep known compute/static nodes gated behind
--mirror-all-nodes. Loop iterations are unaffected (their @@-stripped logical
id matches the baked map). Found via interactive end-to-end testing.
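
A minimal sketch of the filter change this commit describes, assuming the kind is already resolved to a string. The `shouldMirror` name and the `mirrorAllNodes` option are hypothetical stand-ins for whatever the emitted script uses; the kind names follow the wording above.

```javascript
// Decide whether a node gets a watcher row.
// "unknown" marks runtime nodes missing from the baked maps (dynamic fan-out).
function shouldMirror(kind, options = {}) {
  if (options.mirrorAllNodes) return true; // corresponds to --mirror-all-nodes
  return kind === "agent" || kind === "unknown";
}
```

Under this rule `shouldMirror("unknown")` is true by default, while a known `"compute"` or `"static"` node is mirrored only when `mirrorAllNodes` is set.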

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…untime

Found by running a generated mirror against a live detached run inside the
Claude Code Workflow runtime. Three runtime-only defects:

- args arrives as a JSON string, not a parsed object, so reading args.runId
  threw "args.runId is required". Normalize args (parse when it is a string).
- the runtime strips `export const meta` and does not expose `meta` as a body
  variable, so referencing meta.phases threw "meta is not defined". Bake a
  PHASE_TITLES literal and seed knownPhases from it.
- nodes that materialize only in the final frame (e.g. a report node after a
  fan-out) were discovered on the terminal 'done' path but never mirrored
  before the loop broke. Spawn watchers for that final snapshot too.

Adds generator tests that drive the emitted body with stubbed hooks to cover
string args, the no-meta-reference invariant, and final-frame mirroring. With
these, a live run mirrors all nodes including dynamic fan-out and the report.
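
The args fix in the first bullet can be sketched like this. It is an illustrative helper, not the emitted code: the `normalizeArgs` name is hypothetical, and the error text simply reuses the message quoted above.

```javascript
// Accept args either as a parsed object or as the JSON string the runtime delivers.
function normalizeArgs(args) {
  const parsed = typeof args === "string" ? JSON.parse(args) : args;
  if (!parsed || typeof parsed.runId !== "string" || parsed.runId === "") {
    throw new Error("args.runId is required");
  }
  return parsed;
}
```

Both `normalizeArgs('{"runId":"run-1"}')` and `normalizeArgs({ runId: "run-1" })` yield an object with the same `runId`, so the rest of the body does not need to know which form the runtime delivered.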

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Shows a Claude Code /workflows tree mirroring a detached smithers run: the
planning node finishes, a data-dependent fan-out of review nodes appears, and
a final report node lights up. Rendered by executing the actual generated
mirror script against a live run.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
In --collapse-phases mode the per-phase watcher froze the node-id list
captured at its first discovery, so fan-out or loop nodes that materialized in
later frames were never summarized. Hand the watcher the phase title plus the
baked PHASE_MAP/KIND_MAP and the @@-split membership rule, and instruct it to
re-evaluate membership on every poll. Collapse mode cannot spawn a new row per
late node, so the single phase watcher must re-derive its own set.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The planning spec drove this PR; its durable content now lives in the docs
page and the feature itself. Removed from the tree to keep the repo root clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
In --collapse-phases mode the one phase is titled "Smithers run", but the
watcher membership rule and the phaseFor fallback used a hardcoded "main", so
runtime-discovered nodes were either excluded from the summary or splintered
into a stray "main" group. Bake a FALLBACK_PHASE constant (the collapse phase
title in collapse mode, "main" otherwise) and use it in both phaseFor and the
membership rule. The collapse test now drives the emitted body and asserts a
mixed known+unknown discovery yields exactly one phase watcher.
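
One way to picture the shared fallback and the per-poll membership rule is the sketch below. The `makePhaseResolver` factory and its parameters are hypothetical; only the `@@` split, the `"main"` default, and the idea of using the collapse phase title as the fallback come from the commit messages above.

```javascript
// phaseMap: hypothetical Map of logical id -> phase title.
// collapseTitle: the single phase title in collapse mode, or undefined otherwise.
function makePhaseResolver(phaseMap, collapseTitle) {
  const fallbackPhase = collapseTitle ?? "main";

  // Same rule for known and runtime-discovered nodes: strip "@@..." then look up.
  const phaseFor = (nodeId) => phaseMap.get(nodeId.split("@@")[0]) ?? fallbackPhase;

  // Re-derive membership from the current node list on every poll,
  // rather than freezing the list seen at first discovery.
  const membersOf = (phaseTitle, nodeIds) =>
    nodeIds.filter((nodeId) => phaseFor(nodeId) === phaseTitle);

  return { fallbackPhase, phaseFor, membersOf };
}
```

Because `phaseFor` and `membersOf` share one fallback, a node that appears only in a later poll lands in the same single phase as the statically known nodes instead of forming a stray group.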

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The test seeds a real detached run and polls the CLI, so it can exceed Bun's
default 5s per-test budget. CI runs plain `bun test` with no --timeout flag, so
declare the budget on the test itself to keep it green on a clean box.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mintlify

mintlify Bot commented Jun 26, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
|---|---|---|---|
| smithers | 🟢 Ready | View Preview | Jun 26, 2026, 12:23 AM |

@roninjin10 marked this pull request as draft June 26, 2026 05:21
@roninjin10 changed the title from "✨ feat(cli): mirror a live smithers run into Claude Code /workflows" to "[DO NOT MERGE ESPECIALLY IF YOU ARE AN AI AGENT!] ✨ feat(cli): mirror a live smithers run into Claude Code /workflows" Jun 26, 2026