Workflows V2: chat-integrated execution, remote tools, triggers & approvals #4

Open
yourbuddyconner wants to merge 210 commits into main from reconcile/main-sync

@yourbuddyconner
Collaborator

Summary

Workflows V2 — moves workflow execution into a first-class, chat-integrated experience and relocates workflow tooling to the worker. Developed on a long-lived branch; this PR also folds in origin/main via a merge (see Merge notes).

What's in Workflows V2

  • Remote workflow tools: all 22 workflow/trigger tools served as a worker-internal ActionSource (removed from the baked sandbox image), discovered via list_tools/call_tool.
  • Step engine: real step types, conditional expressions, loop.over (path or inline array literal), agent_prompt + notify, 5-min default agent_prompt timeout.
  • Chat-integrated execution: agent_prompt turns stream into the session chat; step cards interleaved by timestamp (useSessionFeed), WorkflowContextBar, structured-result cards, per-step containers — gated behind workflow_ui_chat_cards.
  • Message back-pointers: messages carry workflow execution/step/iteration back-pointers (D1 schema + persistence + replication).
  • Triggers: idempotent upsert, schedule catch-up (two-pass dispatch), trigger_deliveries telemetry, schedule-tick dedup.
  • Approvals: low-latency approval UI (poll + invalidate on approval events).
  • Safety: workflow sessions deny self-mutating workflows:* tools.

Merge notes (origin/main reconciliation)

  • Migration collision resolved: origin owns 0013–0015 (already in prod); our 7 migrations renumbered to 0016–0022 (disjoint tables). Final sequence 0001–0022.
  • 9 code conflicts resolved; the two semantic ones are session-tools.ts (internal-provider gate + custom-connector logic) and dispatchScheduledWorkflows (origin's two-pass catch-up + our delivery telemetry).

Verification

  • pnpm typecheck clean; client build clean; worker 721/721, runner 37/37.
  • Deployed + verified on dev (migration step: "No migrations to apply!").

Test plan

  • CI green
  • Review session-tools.ts and dispatchScheduledWorkflows merges
  • On prod deploy, confirm migrations 0016–0022 apply in order
  • Smoke-test workflow execution, chat cards, approvals, scheduled dispatch

The workflow-*.test.ts files use bun:test and exercise Bun.spawn code
paths, so they don't run under vitest in Node. They had been silently
excluded from CI since the runner package adopted vitest.

- Fix stale workflow-compiler test fixture (tool steps now require a
  tool field per commit 9a70a880's validation tightening).
- Run them via a new test:workflows script that pnpm test invokes after
  vitest, so the full runner test suite stays a single command.
- Drop the stale "no workflow definition editor in the client" claim
  from the workflows spec; the editor lives in
  packages/client/src/components/workflows/edit-workflow-step-dialog.tsx.

Captures the design decisions from the brainstorm:

- Rename /automation/triggers to /automation/schedules-and-hooks; same
  data, clearer cards (humanized cron, type badges, plain target line,
  gray-out-in-place for disabled).
- /automation/workflows/new: chat-to-create. User intent → LLM-drafted
  workflow JSON → React Flow diagram. Both whole-workflow and per-step
  refine modes. Trigger config required before save. JSON behind a
  side panel toggle.
- /automation/workflows/$workflowId becomes view + entry point; edit
  opens the same create-style flow with the workflow as the draft.
  Strips down the 2151-line page.
- /automation/executions/$executionId: live execution view. React Flow
  diagram with per-step status overlay. Step trace, variables, cancel
  and approve/deny actions.
- Real-time step events as MVP scope: runner emits per-step events
  over the existing runner-link WebSocket; SessionAgentDO upserts to
  D1 and publishes to EventBus; client subscribes — no polling.
- Shared <WorkflowDiagram> component (React Flow + dagre) in three
  modes: edit / view / runtime.

Also ignores .superpowers/ (brainstorming companion artifacts).

Adds POST /api/workflows/draft which calls the workflow-draft service
with the user prompt and validates the result via validateWorkflowDefinition,
retrying up to 3 times on invalid output. Widens draftWorkflow's baseDraft
param to Record<string, unknown> so zod-validated payloads can flow through
without a cast (the LLM output is validated rigorously after generation).

Adds POST /api/workflows/draft/step which constructs a step-scoped prompt
instructing the LLM to edit only the named step and preserve all others.
Reuses the same validate/retry loop as /draft.
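
The validate/retry loop shared by /draft and /draft/step could be sketched roughly as below. This is an illustration, not the code in this PR: `draftWithRetry`, `DraftFn`, and the `Validation` shape are hypothetical names, the real `validateWorkflowDefinition` signature is not shown here, and the sketch assumes "up to 3 times" means three attempts in total, with validation errors fed back into the next attempt.

```typescript
// Hypothetical shapes: the real draft service and validator signatures
// are not shown in this PR.
type Validation = { ok: true } | { ok: false; errors: string[] };
type DraftFn = (prompt: string, feedback: string[]) => Promise<unknown>;

// Assumption: "retrying up to 3 times" is read as 3 attempts in total.
const MAX_ATTEMPTS = 3;

async function draftWithRetry(
  prompt: string,
  generate: DraftFn,
  validate: (candidate: unknown) => Validation,
): Promise<{ draft: unknown; attempts: number }> {
  let feedback: string[] = [];
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    // Pass the previous attempt's validation errors back to the generator.
    const candidate = await generate(prompt, feedback);
    const result = validate(candidate);
    if (result.ok) return { draft: candidate, attempts: attempt };
    feedback = result.errors;
  }
  throw new Error(
    `draft still invalid after ${MAX_ATTEMPTS} attempts: ${feedback.join("; ")}`,
  );
}
```

Keeping the loop independent of the prompt construction is what would let both endpoints reuse it: only the `prompt` argument differs between the whole-workflow and step-scoped cases.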
Multi-iteration loops now open on the "all" tab so every iteration's
steps are visible at once, consistent with the execution page's
expand-by-default behavior. Single-iteration loops still show that one
iteration. Users can still click an individual iter tab to focus.
… use

Bumps the agent_prompt await timeout default from 120s to 300s (5 min,
still capped at the 15-min ceiling). Short default was cutting off
open-ended, tool-using prompts ("invoke this skill and follow
instructions", "investigate and fix the failing test"), which are now a
first-class use case.

Documents that agent_prompt has full tool access (read/edit/bash/grep/
MCP/skills, operating on the repo checkout when provided) in both the
workflows skill and the draft-LLM system prompt, plus the one exception
(the interactive question tool fails in workflow context). Notes the
new default + when to raise the timeout.
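
As a rough illustration of the default and the ceiling described above, a timeout resolver might look like the following. The constant and function names are hypothetical, and falling back to the default for missing or non-positive values is an assumption rather than documented behavior.

```typescript
// Values taken from the commit message; names are illustrative only.
const DEFAULT_AGENT_PROMPT_TIMEOUT_MS = 300_000; // 5 min (previously 120s)
const MAX_AGENT_PROMPT_TIMEOUT_MS = 900_000; // 15-min ceiling

function resolveAgentPromptTimeoutMs(requestedMs?: number): number {
  // Assumption: missing, non-finite, or non-positive values use the default.
  if (
    requestedMs === undefined ||
    !Number.isFinite(requestedMs) ||
    requestedMs <= 0
  ) {
    return DEFAULT_AGENT_PROMPT_TIMEOUT_MS;
  }
  // Explicit values are honored but never exceed the ceiling.
  return Math.min(requestedMs, MAX_AGENT_PROMPT_TIMEOUT_MS);
}
```

Under this sketch, a step that omits its timeout gets 5 minutes, and a step that asks for 30 minutes is clamped to 15.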
Syncs the inlined skill content (loop docs, inline-array over, agent_prompt
tool-use + 5min timeout, conditional/notify corrections) into the D1
content registry. The skill .md was committed earlier; this is the
generated copy that gets synced to sandboxes.

agent_prompt steps now stream a live assistant turn (text + tool calls)
into the session chat, attributed to their step, instead of only emitting
one workflow-chat-message at the end.

How: workflow agent sessions run on the ephemeral-session path, whose
events early-returned before reaching the streaming handler. When the
session is workflow-attributed (channel.workflowStepContext set by
executeWorkflowAgentStep, which also sets activeMessageId), the ephemeral
branch routes message.part.updated/delta to handlePartUpdated/
handlePartDelta. The turn carries executionId/stepId/iterationPath via
message.create so the client can group it under a per-step container.
The separate assistant workflow-chat-message send is dropped (the
streamed turn is the response); the user prompt message is kept.

Turn lifecycle (addresses codex review):
- Intermediate attempts (structured-output fixup, model failover) finalize
  as 'canceled' before their retry's resetPromptState, so the client can
  collapse them as prior attempts rather than showing invalid JSON as a
  completed turn.
- The kept attempt finalizes 'end_turn' (or 'error') in the outer finally.
- finalizeWorkflowTurn flushes any pending/running tools to completed so a
  closed turn never shows a forever-running tool.
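
A minimal sketch of the finalize behavior listed above, assuming a simplified turn shape. `Turn`, `ToolCall`, and `finalizeTurnSketch` are hypothetical; the real finalizeWorkflowTurn operates on runner state that is not shown in this PR.

```typescript
type ToolStatus = "pending" | "running" | "completed" | "error";
type FinishReason = "end_turn" | "error" | "canceled";

interface ToolCall {
  id: string;
  status: ToolStatus;
}

interface Turn {
  id: string;
  tools: ToolCall[];
  finishReason?: FinishReason;
}

function finalizeTurnSketch(turn: Turn, reason: FinishReason): Turn {
  // A turn that already has a finish reason is left untouched.
  if (turn.finishReason !== undefined) return turn;
  return {
    ...turn,
    finishReason: reason,
    // Flush anything still in flight so a closed turn never shows a
    // forever-running tool.
    tools: turn.tools.map((t): ToolCall =>
      t.status === "pending" || t.status === "running"
        ? { ...t, status: "completed" }
        : t,
    ),
  };
}
```

An intermediate attempt would be finalized with `"canceled"` before its retry, and the kept attempt with `"end_turn"` or `"error"`.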

Known pre-existing gap (not addressed here): @new ephemeral threads
bypass the main message.updated/question.asked handlers, so usage/model
attribution and the question-not-supported guard are degraded for those.
Default/named threads that adopt the main session are unaffected. The
step card sources model/tokens from step output, so the UI is covered.

The streamed workflow agent_prompt turn arrives via message.create. The
DO handler now validates the runner-supplied executionId against the
session's user (extracted into a shared resolveWorkflowBackpointers
helper, reused by the workflow-chat-message handler) and threads the
back-pointers through createTurn → the turn's message row, plus the
client broadcast.

Also fixes the message READ path (getSessionMessages / getThreadMessages)
which dropped the workflow columns in its row mapper — so grouping works
on refresh/initial-load, not just live over the WS. Extracted a shared
mapMessageRow.
…iner in chat

The session chat now wraps an agent_prompt step's prompt + streamed
assistant turn(s) in a bordered per-step container with a header (step
name, persona, iteration, model/tokens, live status), instead of loose
messages. Non-agent steps (bash/notify/loop/…) still render as step
cards; plain chat renders as normal turns.

- buildChatRenderPlan (pure, tested) groups the merged feed: workflow-
  attributed messages + their step row → one step-container keyed by
  (executionId, stepId, iterationPath), emitted at the earliest item's
  position; steps with no attributed messages → step card; plain
  messages → turn runs. Loop iterations get distinct containers.
- WorkflowStepContainer renders the header + delegates message rendering
  back to MessageList via a callback (avoids a circular import), so
  prompts/assistant turns render identically inside and outside the
  container, including streamed tool-call parts.
- The agent_prompt step card is no longer shown standalone in chat (the
  container represents it); the execution page still shows the card
  summary.
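
The grouping rules above can be approximated by the sketch below. It is not the real buildChatRenderPlan: the `FeedItem` and `PlanEntry` shapes, the `|`-joined key, and the function name are assumptions made for illustration.

```typescript
interface Message {
  kind: "message";
  id: string;
  ts: number;
  executionId?: string;
  stepId?: string;
  iterationPath?: string;
}
interface StepRow {
  kind: "step";
  ts: number;
  executionId: string;
  stepId: string;
  iterationPath?: string;
}
type FeedItem = Message | StepRow;

interface StepContainer {
  type: "step-container";
  key: string;
  step?: StepRow;
  messages: Message[];
}
type PlanEntry =
  | StepContainer
  | { type: "step-card"; step: StepRow }
  | { type: "turn-run"; messages: Message[] };

// Key on (executionId, stepId, iterationPath); null means "plain chat".
function groupKey(item: FeedItem): string | null {
  if (!item.executionId || !item.stepId) return null;
  return [item.executionId, item.stepId, item.iterationPath ?? ""].join("|");
}

function buildChatRenderPlanSketch(feed: FeedItem[]): PlanEntry[] {
  const ordered = [...feed].sort((a, b) => a.ts - b.ts);

  // Only steps that have at least one attributed message get a container.
  const keysWithMessages = new Set<string>();
  for (const item of ordered) {
    const key = groupKey(item);
    if (item.kind === "message" && key) keysWithMessages.add(key);
  }

  const plan: PlanEntry[] = [];
  const containers = new Map<string, StepContainer>();
  for (const item of ordered) {
    const key = groupKey(item);
    if (key && keysWithMessages.has(key)) {
      let container = containers.get(key);
      if (!container) {
        // Emitted at the position of the earliest item in the group.
        container = { type: "step-container", key, messages: [] };
        containers.set(key, container);
        plan.push(container);
      }
      if (item.kind === "step") container.step = item;
      else container.messages.push(item);
    } else if (item.kind === "step") {
      plan.push({ type: "step-card", step: item });
    } else {
      // Consecutive plain messages collapse into one run of turns.
      const last = plan[plan.length - 1];
      if (last && last.type === "turn-run") last.messages.push(item);
      else plan.push({ type: "turn-run", messages: [item] });
    }
  }
  return plan;
}
```

Because the function is pure, it can be unit-tested on plain arrays without rendering anything, which matches the "pure, tested" note above.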

Follow-up polish (not blocking): render structured-output as a parsed
card inside the container rather than raw streamed JSON; collapse
canceled prior attempts behind an affordance (the runner already
finalizes them 'canceled').
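
For the second follow-up item, a client-side split of prior attempts from the latest one might look like this sketch. It assumes a canceled turn can be recognized by a `finish` part whose reason is `'canceled'`; the `Part` and `AssistantTurn` shapes and the function names are hypothetical.

```typescript
type Part =
  | { type: "text"; text: string }
  | { type: "finish"; reason: string };

interface AssistantTurn {
  id: string;
  parts: Part[];
}

// Assumption: the runner marks superseded attempts with a finish part
// whose reason is "canceled".
function isSupersededAttempt(turn: AssistantTurn): boolean {
  return turn.parts.some((p) => p.type === "finish" && p.reason === "canceled");
}

function splitAttempts(turns: AssistantTurn[]): {
  visible: AssistantTurn[];
  previous: AssistantTurn[];
  label: string | null;
} {
  const previous = turns.filter(isSupersededAttempt);
  const visible = turns.filter((t) => !isSupersededAttempt(t));
  const label =
    previous.length === 0
      ? null
      : `${previous.length} previous attempt${previous.length === 1 ? "" : "s"}`;
  return { visible, previous, label };
}
```

The `label` would back the collapsible affordance, while `visible` is what the container renders by default.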
…ep container

Two polish items on the workflow step container (codex UX recs #2/#3):

- Structured-output steps now render the parsed result as a kv-table
  card as the primary view; the raw streamed JSON turn is tucked into a
  collapsible "raw output" panel. Non-structured steps still render the
  assistant turn directly. During streaming (before output exists) it
  falls through to the streamed turn, then swaps to the card on completion.

- Superseded attempts (model failover / structured-output fixup) — the
  assistant turns the runner finalized 'canceled', detected via their
  {type:'finish', reason:'canceled'} part — collapse behind a
  "N previous attempts" affordance so the latest attempt reads cleanly.
…flow threads

The garbled/interleaved chat text (e.g. "ThereThere's no...of's no...of
any kind", JSON tripled) was the same SSE event being appended to
streamedContent twice.

Workflow agent threads that adopt the MAIN session (eventSessionId ===
this.sessionId — the common case, which is why usage/model attribution
worked) hit the ephemeral branch (session is registered in
ephemeralContent for response capture) AND, because the
`eventSessionId !== this.sessionId` early-return doesn't fire for them,
ALSO fall through to the main switch. My ephemeral-branch streaming
addition therefore ran handlePartUpdated/handlePartDelta once, then the
main switch ran them again — double (delta+snapshot → triple) append.

Fix: gate the ephemeral-branch streaming on eventSessionId !==
this.sessionId. Genuinely-ephemeral (@new) threads stream via the
ephemeral branch (the main switch is skipped by the early-return);
main-session threads stream via the main switch only. ensureTurnCreated
reads workflowStepContext on either path, so the turn carries its step
back-pointers either way. finalizeWorkflowTurn no-ops when the main
handler already finalized (turnCreated reset), avoiding double-finalize.
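
The double-append and its fix can be modeled with a small counting function. This is a simplified model of the routing described above, not the runner code: `RouteInput` and `appendCount` are hypothetical, and each handler invocation is reduced to a single "append".

```typescript
interface RouteInput {
  eventSessionId: string;
  ownSessionId: string;
  registeredEphemeral: boolean; // session is tracked in ephemeralContent
  workflowAttributed: boolean; // channel.workflowStepContext is set
}

// Returns how many times one streaming event would be appended.
function appendCount(input: RouteInput, gateOnForeignSession: boolean): number {
  let appends = 0;
  const foreign = input.eventSessionId !== input.ownSessionId;

  // Ephemeral branch: streams workflow-attributed turns. With the fix,
  // it only streams when the event belongs to a different session.
  if (input.registeredEphemeral && input.workflowAttributed) {
    if (!gateOnForeignSession || foreign) appends++;
  }

  // Early return: events from another session never reach the main switch.
  if (foreign) return appends;

  // Main switch: handles part updates/deltas for the main session.
  appends++;
  return appends;
}
```

In this model a main-session workflow thread yields 2 appends without the gate and 1 with it, while a genuinely ephemeral thread yields 1 either way.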
The sync_workflow OpenCode tool carried a stale step-type allowlist (agent,
agent_message, subworkflow) that rejected valid agent_prompt/notify steps the
worker had since migrated to. Remove the tool's duplicate type taxonomy and let
the worker be the authoritative gate; add a positive allowlist to the worker
validator so unknown/typo'd types fail at save time with a clear message instead
of mid-execution.
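
A positive allowlist of this kind might be sketched as follows. The list of step types here is illustrative only: it collects type names mentioned elsewhere in this PR, and the worker's actual allowlist and error wording are not shown.

```typescript
// Illustrative list only; the worker's real allowlist may differ.
const KNOWN_STEP_TYPES = [
  "agent_prompt",
  "notify",
  "bash",
  "tool",
  "conditional",
  "loop",
] as const;

type KnownStepType = (typeof KNOWN_STEP_TYPES)[number];

function isKnownStepType(value: string): value is KnownStepType {
  return (KNOWN_STEP_TYPES as readonly string[]).includes(value);
}

// Returns an error message for an unknown type, or null when valid.
function validateStepType(step: { id: string; type: string }): string | null {
  if (isKnownStepType(step.type)) return null;
  return (
    `step "${step.id}" has unknown type "${step.type}"; ` +
    `expected one of: ${KNOWN_STEP_TYPES.join(", ")}`
  );
}
```

Rejecting at save time means a typo such as `agent_promt` surfaces with the list of valid types instead of failing partway through an execution.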
…tions

Serve the 22 baked workflow/trigger/execution/proposal OpenCode tools as
worker-side actions via the existing list_tools/call_tool path, introducing an
"internal provider" concept (credential-less, worker-DB access) in the action
framework. Eliminates the image-rebuild tax and validator drift.
…viders

Computes an internalHandle when provider.internal is truthy and threads it
into both execute() call sites in executeAction so internal providers receive
worker-side db/env access without credential resolution.

Inject internal providers (where provider.internal === true) into the
serviceSourceMap after the autoServices loop so they appear in listTools
without requiring a connected integration or credentials.
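
The injection step could be pictured like this. `ProviderSketch`, `SourceKind`, and `buildServiceSourceMapSketch` are hypothetical stand-ins; the real serviceSourceMap holds richer values than the two labels used here, and the `github`/`slack` service names in the usage note are examples only.

```typescript
interface ProviderSketch {
  service: string;
  internal?: boolean;
}
type SourceKind = "connected" | "internal";

function buildServiceSourceMapSketch(
  autoServices: string[],
  providers: ProviderSketch[],
): Map<string, SourceKind> {
  const map = new Map<string, SourceKind>();

  // Services backed by a connected integration come first.
  for (const service of autoServices) map.set(service, "connected");

  // Internal providers are added afterwards, with no integration row or
  // credentials required. Non-internal providers are never injected.
  for (const provider of providers) {
    if (provider.internal === true && !map.has(provider.service)) {
      map.set(provider.service, "internal");
    }
  }
  return map;
}
```

With `autoServices = ["github"]` and providers for `workflows` (internal) and `slack` (not internal, not connected), the resulting map would list `github` and `workflows` but not `slack`.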
…gger tools

Adds a credential-less internal IntegrationProvider + ActionSource under
packages/worker/src/integrations/internal/workflows/ exposing all 22
workflow/trigger/execution/proposal actions. Each action delegates to the
same service/db functions the route handlers call, so no logic is reimplemented.

Adds the workflows package to internalIntegrations so IntegrationRegistry
resolves the 'workflows' service with its 22 actions. Uses lazy dynamic
imports in actions.ts to break the pre-existing circular dependency chain
(lib/db → workflow-runtime → orchestrator → env-assembly → credentials →
registry → internal/index) that would cause internalIntegrations to be
undefined during module initialization.
…o break import cycle

Move registration of worker-internal integration packages (workflows) from
IntegrationRegistry.init() to the composition root (index.ts), eliminating the
circular dependency: registry → internal/workflows/actions → lib/db → services →
credentials → registry. Adds IntegrationRegistry.registerPackage() for runtime
registration, restores static imports in actions.ts (removes lazy import() band-aid),
and updates the test to call registerInternalIntegrations() explicitly.

Re-enforces the denyInWorkflowSession guard from the baked opencode tool
files inside handleCallTool. When IS_WORKFLOW_SESSION=true is present in
the session spawnRequest envVars, any call_tool invocation targeting the
workflows service is rejected early with an explanatory error before
policy resolution runs.

Workflow tools are now worker-side actions under the workflows service
rather than always-loaded named tools. The skill now instructs the agent
to call list_tools(service=workflows) first, then invoke tools via
call_tool with namespaced ids (workflows:<action>). All bare tool-name
references updated to use the workflows: prefix throughout, and a note
added that destructive operations (delete_workflow, delete_trigger,
rollback_workflow) route through the action risk policy and may require
human approval.
…worker

The 22 workflow/trigger/execution/proposal tools are now served as
worker-side actions via call_tool. Remove the dead baked copies from
docker/opencode/tools/ along with the _workflow_session_guard.ts helper
(only used by those tools). Bump IMAGE_BUILD_VERSION to bust the sandbox
image cache and drop the removed files from the built image.
…date orchestrator spec

The blanket denial of all workflows:* tools in IS_WORKFLOW_SESSION sandboxes is
replaced by a selective guard that only rejects the 8 self-mutation actions
(sync_workflow, update_workflow, delete_workflow, rollback_workflow, sync_trigger,
delete_trigger, review_workflow_proposal, apply_workflow_proposal). Read, list,
run, and execution-control tools are now fully available in workflow sessions.

Adds WORKFLOW_SESSION_DENIED_ACTIONS constant at module scope, updates the guard
condition in SessionAgentDO.handleCallTool, adds a test for the allowed
list_workflows path, and rewrites the Workflow Session Tool Policy section in
docs/specs/orchestrator.md to reflect the DO-layer enforcement.
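
A sketch of the selective guard, using the eight actions listed above. The constant name comes from the commit message; representing it as a `Set`, the `workflowSessionDenial` helper, and the error wording are assumptions.

```typescript
// The eight self-mutation actions named in this commit.
const WORKFLOW_SESSION_DENIED_ACTIONS: ReadonlySet<string> = new Set([
  "sync_workflow",
  "update_workflow",
  "delete_workflow",
  "rollback_workflow",
  "sync_trigger",
  "delete_trigger",
  "review_workflow_proposal",
  "apply_workflow_proposal",
]);

// Returns an error message when the call must be rejected, else null.
function workflowSessionDenial(
  envVars: Record<string, string | undefined>,
  toolId: string,
): string | null {
  if (envVars.IS_WORKFLOW_SESSION !== "true") return null;

  // Tool ids are namespaced as "<service>:<action>".
  const sep = toolId.indexOf(":");
  if (sep < 0) return null;
  const service = toolId.slice(0, sep);
  const action = toolId.slice(sep + 1);

  if (service !== "workflows") return null;
  if (!WORKFLOW_SESSION_DENIED_ACTIONS.has(action)) return null;
  return `workflows:${action} cannot be called from inside a workflow session`;
}
```

Running the check before policy resolution keeps read, list, run, and execution-control tools on the normal path while self-mutating calls fail fast.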
…e trigger fields; namespace debug hints

- resolveActionPolicy: hoist getProvider lookup and skip the integration-activation gate for internal providers (e.g. workflows) which have no integration row and are never in autoServices
- sync_trigger: validate required fields (webhook_path, schedule_cron, schedule_prompt for orchestrator target) before calling handleTriggerAction, mirroring the route's Zod schema
- buildDebugDiagnosis: update remediation hints to use namespaced tool IDs (workflows:approve_execution, workflows:get_execution, workflows:cancel_execution, workflows:get_execution_steps)
- Add TDD regression test in session-tools.internal.test.ts covering the activation-gate bypass for internal providers

Resolves migration numbering collision with origin/main. Origin owns
0013-0015 (user_action_policy_overrides, action_policy_target_shape_triggers,
custom_mcp_connectors) which are already applied in production and cannot be
renumbered. Our 7 local migrations touch disjoint tables (triggers,
trigger_deliveries, workflow_*, messages) and are safely moved to 0016-0022
to apply after origin's.

Reconciles 83 local commits with 207 commits from origin/main after a long
divergence. Resolved a migration numbering collision (origin owns 0013-0015,
already in production; local migrations renumbered to 0016-0022 in a prior
commit) and 9 code conflicts:

- backend/images/base.py: bumped IMAGE_BUILD_VERSION to supersede both sides
- sdk integrations: kept both `internal` and `isCustomConnector` provider flags
- runner/bin.ts, agent-client.ts: kept both feature additions
- chat-container.tsx: combined workflow-step cards with thread-scoped thinking
- lib/db/triggers.ts: kept local schedule-trigger pagination
- services/session-tools.ts: merged internal-provider gate skip with origin's
  custom-connector activation logic; preserved try/catch + telemetry + orgId
- durable-objects/session-agent.ts: kept workflow-session guard + orgId resolve
- worker/index.ts: merged retention crons; rebuilt dispatchScheduledWorkflows to
  combine origin's two-pass catch-up with local delivery telemetry + globalActive

Updated internal-provider tests to mock loadCustomMcpConnectorContext.
Verified: pnpm typecheck, client build, worker tests (721), runner tests (37).

wrangler pages deploy infers --branch from the local git branch when not
specified. Deploying from a non-production branch (e.g. a reconcile/feature
branch) produces a preview deployment on a per-branch subdomain, which breaks
OAuth callbacks and authorized origins pinned to the stable Pages domain.

Default PAGES_BRANCH to main so the client always lands on the production
domain regardless of the working branch; override per-env if needed.
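
One way to express the default, assuming the deploy script assembles its own wrangler arguments. `pagesDeployArgs` is a hypothetical helper and the directory and project name in the usage note are placeholders; `--project-name` and `--branch` are existing `wrangler pages deploy` flags.

```typescript
function pagesDeployArgs(
  outDir: string,
  projectName: string,
  env: Record<string, string | undefined>,
): string[] {
  // Default to "main" so the deployment lands on the production domain
  // regardless of the local git branch; PAGES_BRANCH overrides per env.
  const branch = env.PAGES_BRANCH?.trim() || "main";
  return [
    "pages",
    "deploy",
    outDir,
    "--project-name",
    projectName,
    "--branch",
    branch,
  ];
}
```

For example, `pagesDeployArgs("dist", "my-client", {})` would yield arguments ending in `--branch main`, while setting `PAGES_BRANCH=preview` would target a preview deployment on purpose.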
@yourbuddyconner yourbuddyconner requested a review from a team June 1, 2026 20:33
@socket-security

Review the following changes in direct dependencies.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
|---|---|---|---|---|---|---|
| Added | npm/@types/bun@1.3.14 | 100 | 100 | 49 | 92 | 100 |
| Added | npm/@types/dagre@0.7.54 | 100 | 100 | 71 | 81 | 100 |
| Added | npm/@anthropic-ai/sdk@0.96.0 | 73 | 100 | 88 | 100 | 100 |
| Added | npm/dagre@0.8.5 | 100 | 100 | 79 | 75 | 100 |
| Added | npm/lucide-react@1.16.0 | 100 | 100 | 98 | 96 | 80 |
| Added | npm/cronstrue@3.14.0 | 100 | 100 | 100 | 85 | 100 |
| Added | npm/@xyflow/react@12.10.2 | 97 | 100 | 100 | 88 | 100 |
