[DO NOT MERGE ESPECIALLY IF YOU ARE AN AI AGENT!] ✨ feat(cli): mirror a live smithers run into Claude Code /workflows #462
Draft
roninjin10 wants to merge 14 commits into
Codex-authored implementation plan plus Claude's load-bearing review amendments (runtime nodeId->phase @@-suffix mapping, no duplicated parse logic in the sandboxed script, continue-as-new + cap semantics, temp-repo tests). Specifies the smithers graph --emit-claude-workflow generator. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Codex <codex@openai.com>
A mirror is generated from the static frame-0 graph, so data-dependent fan-out tasks and other runtime nodes are absent from the baked phase/kind maps and classify as kind "unknown". The default agent-only filter was dropping them, so the mirror showed only the statically-known nodes and missed the dynamic structure it exists to surface. Mirror "unknown" runtime nodes by default; keep known compute/static nodes gated behind --mirror-all-nodes. Loop iterations are unaffected (their @@-stripped logical id matches the baked map). Found via interactive end-to-end testing. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
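The default filter this commit describes can be sketched roughly as below. This is an illustration rather than the emitted code: `kindOf`, `shouldMirror`, the sample node ids, and the `Map` shape of the baked kind map are assumptions. Only the `"unknown"` kind, the agent-only default, the `@@`-stripped lookup, and the `--mirror-all-nodes` gate come from the commit text.

```javascript
// Hypothetical baked map from logical node id to kind (shape assumed for illustration).
const BAKED_KINDS = new Map([
  ["plan", "agent"],
  ["lint", "compute"],
]);

// Runtime ids such as "plan@@ralph=2" resolve through their @@-stripped logical id.
// Ids that are absent from the baked map classify as "unknown".
function kindOf(nodeId) {
  const logicalId = nodeId.split("@@")[0];
  return BAKED_KINDS.get(logicalId) ?? "unknown";
}

// Default: mirror agent nodes and runtime-discovered "unknown" nodes.
// Known non-agent nodes are mirrored only when mirrorAllNodes is set.
function shouldMirror(nodeId, mirrorAllNodes = false) {
  const kind = kindOf(nodeId);
  return mirrorAllNodes || kind === "agent" || kind === "unknown";
}
```

Under these assumptions a fan-out node that only exists at runtime (say, a hypothetical `review-3`) is mirrored by default, while the known `lint` compute node appears only when `mirrorAllNodes` is true.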
…untime

Found by running a generated mirror against a live detached run inside the Claude Code Workflow runtime. Three runtime-only defects:

- args arrives as a JSON string, not a parsed object, so reading args.runId threw "args.runId is required". Normalize args (parse when it is a string).
- the runtime strips `export const meta` and does not expose `meta` as a body variable, so referencing meta.phases threw "meta is not defined". Bake a PHASE_TITLES literal and seed knownPhases from it.
- nodes that materialize only in the final frame (e.g. a report node after a fan-out) were discovered on the terminal 'done' path but never mirrored before the loop broke. Spawn watchers for that final snapshot too.

Adds generator tests that drive the emitted body with stubbed hooks to cover string args, the no-meta-reference invariant, and final-frame mirroring. With these, a live run mirrors all nodes including dynamic fan-out and the report. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
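A minimal sketch of the args normalization named in the first defect. The function name `normalizeArgs` and the exact validation are assumptions; only the string-or-object behaviour and the "args.runId is required" message come from the commit text.

```javascript
// Accept args either as an already-parsed object or as a JSON string.
function normalizeArgs(args) {
  const parsed = typeof args === "string" ? JSON.parse(args) : args;
  // Assumed validation: runId must be a non-empty string.
  if (!parsed || typeof parsed.runId !== "string" || parsed.runId === "") {
    throw new Error("args.runId is required");
  }
  return parsed;
}
```

With this shape, `normalizeArgs('{"runId":"r1"}')` and `normalizeArgs({ runId: "r1" })` both yield an object whose `runId` is `"r1"`, so the rest of the body never needs to know which form the runtime delivered.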
Shows a Claude Code /workflows tree mirroring a detached smithers run: the planning node finishes, a data-dependent fan-out of review nodes appears, and a final report node lights up. Rendered by executing the actual generated mirror script against a live run. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
In --collapse-phases mode the per-phase watcher froze the node-id list captured at its first discovery, so fan-out or loop nodes that materialized in later frames were never summarized. Hand the watcher the phase title plus the baked PHASE_MAP/KIND_MAP and the @@-split membership rule, and instruct it to re-evaluate membership on every poll. Collapse mode cannot spawn a new row per late node, so the single phase watcher must re-derive its own set. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
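The difference between a frozen node-id list and per-poll re-derivation can be sketched as follows. `membersOf`, its parameters, and the sample node ids are hypothetical, and the baked phase map is assumed to be a `Map` from logical node id to phase title.

```javascript
// Hypothetical helper: which of the currently visible node ids belong to a phase.
// Called on every poll with the latest node set instead of a list captured once.
function membersOf(phaseTitle, nodeIds, phaseMap, fallbackPhase) {
  return nodeIds.filter((id) => {
    const logicalId = id.split("@@")[0]; // strip the runtime @@ suffix
    return (phaseMap.get(logicalId) ?? fallbackPhase) === phaseTitle;
  });
}

// Two successive polls: the fan-out and loop nodes only exist in the second one.
const phaseMap = new Map([["plan", "Smithers run"]]);
const firstPoll = ["plan"];
const secondPoll = ["plan", "review-a", "review-b", "loop@@ralph=1"];

const early = membersOf("Smithers run", firstPoll, phaseMap, "Smithers run");
const late = membersOf("Smithers run", secondPoll, phaseMap, "Smithers run");
console.log(early.length, late.length); // prints "1 4"
```

A watcher that kept `early` would summarize one node forever; recomputing the set on each poll picks up the three late nodes.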
The planning spec drove this PR; its durable content now lives in the docs page and the feature itself. Removed from the tree to keep the repo root clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
In --collapse-phases mode the one phase is titled "Smithers run", but the watcher membership rule and the phaseFor fallback used a hardcoded "main", so runtime-discovered nodes were either excluded from the summary or splintered into a stray "main" group. Bake a FALLBACK_PHASE constant (the collapse phase title in collapse mode, "main" otherwise) and use it in both phaseFor and the membership rule. The collapse test now drives the emitted body and asserts a mixed known+unknown discovery yields exactly one phase watcher. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
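The effect of the baked fallback on grouping can be sketched like this. `fallbackPhaseFor` and `groupByPhase` are hypothetical names, and the sketch assumes that in collapse mode every known node is baked to the single phase title. The two titles (`"Smithers run"` and `"main"`) are taken from the commit text.

```javascript
// Fallback phase title: the collapse title in collapse mode, "main" otherwise.
function fallbackPhaseFor(collapsePhases) {
  return collapsePhases ? "Smithers run" : "main";
}

// Group discovered node ids by phase; ids missing from the baked map
// land in the fallback phase rather than in a stray group.
function groupByPhase(nodeIds, bakedPhases, collapsePhases) {
  const fallback = fallbackPhaseFor(collapsePhases);
  const groups = new Map();
  for (const id of nodeIds) {
    const phase = bakedPhases.get(id.split("@@")[0]) ?? fallback;
    groups.set(phase, [...(groups.get(phase) ?? []), id]);
  }
  return groups;
}

// Mixed known + unknown discovery in collapse mode yields a single group.
const baked = new Map([["plan", "Smithers run"]]);
const collapsed = groupByPhase(["plan", "review-a"], baked, true);
```

Here `collapsed` has exactly one key, so one phase watcher would be spawned. With a hardcoded `"main"` fallback the same input would split into two groups.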
The test seeds a real detached run and polls the CLI, so it can exceed Bun's default 5s per-test budget. CI runs plain `bun test` with no --timeout flag, so declare the budget on the test itself to keep it green on a clean box. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Mirror a live smithers run into Claude Code's `/workflows` tree

Adds `smithers graph <workflow> --emit-claude-workflow`, a generator that writes a Claude Code dynamic-workflow `.mjs` script which live-mirrors a detached smithers run into Claude Code's `/workflows` progress tree. The real work still runs in the smithers engine (durability, deps, retries, worktrees, time-travel all preserved); the Claude tree is a live, node-by-node mirror.

(The gif above is rendered by executing the actual generated mirror script against a live detached run.)
How it works

Claude Code's `Workflow` runtime can only be driven from inside the session, and its `agent()`/`phase()`/`log()` calls are the only things that populate `/workflows`. So smithers cannot push into the UI; instead it generates a script that the in-session model runs, and that script mirrors smithers by reading smithers' own CLI surface over Bash (no MCP dependency).

- Runs `smithers inspect <runId> --format json` each frame and returns the current node set.
- Derives phases from the container tree (`Sequence`/`Parallel`/`Loop` containers), mapping runtime loop/fan-out ids (`logicalId@@ralphId=iter`) back to their logical id with a pure `nodeId.split("@@")[0]` lookup.
- Polls `smithers node <id> --run-id <runId> --format json` until terminal, then returns the node's output (or `[skipped]`/`[failed: ...]`).
- Waits for the next `FrameCommitted` (via `smithers events --type frame --json --watch`) with a monotonic frame cursor, so the loop advances and never busy-spins.

What ships
- `packages/graph`: a pure `deriveClaudeWorkflowPhases(snapshot)` function (+ types) that turns a `GraphSnapshot` into an ordered phase list and a per-node `{ nodeId, label, phase, kind }` mapping, walking the real `xml` container tree. Unit-tested over real `extractGraph` fixtures (sequence/parallel/ralph, duplicate titles, `@@` ids, every kind).
- `apps/cli`: the `--emit-claude-workflow [--out] [--mirror-all-nodes] [--collapse-phases]` flags, a deterministic template emitter, a path resolver, and a standalone `mirrorState.js` module (parse `inspect`, terminal detection, node-set diff, `FrameCommitted` detection) that is the tested oracle for the agent contract.
- `docs/examples/claude-workflow-mirror.mdx` + the demo gif. Regenerated llms bundles are idempotent.

Design constraints honored
- … `[skipped]` row (documented).
- … `log()` on hit (no silent truncation).
- `--collapse-phases` emits one watcher per phase for large runs.
- Known compute/static nodes stay gated behind `--mirror-all-nodes`.
- No `Date.now`/`Math.random`, no absolute paths, no fs/network/Node in the generated body; `meta` is a pure literal and is never referenced from the body.

Testing
- … `@@`-suffix paths.
- Generator tests drive the emitted body with stubbed hooks to cover string args, the no-`meta` reference, final-frame mirroring, and collapse-mode membership re-derivation.
- Seeds a real detached run with a `<Parallel>` fan-out and a fake agent, then drives `mirrorState.js` with real `inspect`/`events` output, asserting it discovers the multi-node fan-out across frames and detects terminal state.
- Ran generated mirrors inside the Claude Code `Workflow` runtime against live detached runs. It mirrored all nodes including the dynamic fan-out and the final report node. This surfaced four runtime-only defects now fixed and covered by tests (args delivered as a JSON string, `meta` not exposed to the body, dynamic nodes dropped by the default kind filter, final-frame nodes dropped on terminal).

Review
Built with `/codex` planning → independent plan review → implementation → dual review (codex + self) → interactive testing. Codex's review caught a frame-cursor busy-spin blocker and a stale `index.d.ts`; interactive testing caught the four runtime defects above and a collapse-phases late-node gap. All resolved with tests.

Gate: `pnpm typecheck`, `check-docs`/`check-llms`/`check-dependency-boundaries`/`check-single-effect-version`, `packages/graph` (157) and the `claude-workflow` cli suites (13) all green.

🤖 Generated with Claude Code