Workflows V2: chat-integrated execution, remote tools, triggers & approvals#4
Open
yourbuddyconner wants to merge 210 commits into
The workflow-*.test.ts files use bun:test and exercise Bun.spawn code paths, so they don't run under vitest in Node. They had been silently excluded from CI since the runner package adopted vitest.
- Fix stale workflow-compiler test fixture (tool steps now require a tool field per commit 9a70a880's validation tightening).
- Run them via a new test:workflows script that pnpm test invokes after vitest, so the full runner test suite stays a single command.
- Drop the stale "no workflow definition editor in the client" claim from the workflows spec; the editor lives in packages/client/src/components/workflows/edit-workflow-step-dialog.tsx.
Captures the design decisions from the brainstorm:
- Rename /automation/triggers to /automation/schedules-and-hooks; same data, clearer cards (humanized cron, type badges, plain target line, gray-out-in-place for disabled).
- /automation/workflows/new: chat-to-create. User intent → LLM-drafted workflow JSON → React Flow diagram. Both whole-workflow and per-step refine modes. Trigger config required before save. JSON behind a side panel toggle.
- /automation/workflows/$workflowId becomes view + entry point; edit opens the same create-style flow with the workflow as the draft. Strips down the 2151-line page.
- /automation/executions/$executionId: live execution view. React Flow diagram with per-step status overlay. Step trace, variables, cancel and approve/deny actions.
- Real-time step events as MVP scope: runner emits per-step events over the existing runner-link WebSocket; SessionAgentDO upserts to D1 and publishes to EventBus; client subscribes, so there is no polling.
- Shared <WorkflowDiagram> component (React Flow + dagre) in three modes: edit / view / runtime.
Also ignores .superpowers/ (brainstorming companion artifacts).
…erts step + publishes
Adds POST /api/workflows/draft which calls the workflow-draft service with the user prompt and validates the result via validateWorkflowDefinition, retrying up to 3 times on invalid output. Widens draftWorkflow's baseDraft param to Record<string, unknown> so zod-validated payloads can flow through without a cast (the LLM output is validated rigorously after generation).
Adds POST /api/workflows/draft/step which constructs a step-scoped prompt instructing the LLM to edit only the named step and preserve all others. Reuses the same validate/retry loop as /draft.
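The validate/retry loop shared by /draft and /draft/step can be sketched roughly as follows. This is a minimal illustration, not the worker's code: draftWithRetry, the generate/validate callback shapes, and feeding the previous validation errors back into generation are all assumptions, and "up to 3 times" is read here as three attempts in total.

```typescript
// Hypothetical shapes: the real draft service and validateWorkflowDefinition
// signatures are not shown in this PR.
type Draft = Record<string, unknown>;
type ValidationResult = { valid: true } | { valid: false; errors: string[] };

async function draftWithRetry(
  generate: (prompt: string, previousErrors: string[]) => Promise<Draft>,
  validate: (draft: Draft) => ValidationResult,
  prompt: string,
  maxAttempts = 3, // assumption: three attempts in total
): Promise<Draft> {
  let lastErrors: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Assumption: earlier validation errors are passed back so the next
    // generation can correct them.
    const draft = await generate(prompt, lastErrors);
    const result = validate(draft);
    if (result.valid) return draft;
    lastErrors = result.errors;
  }
  throw new Error(
    `Draft still invalid after ${maxAttempts} attempts: ${lastErrors.join("; ")}`,
  );
}
```

Because validation sits inside the loop, a caller only ever receives a draft that passed the validator, which is what lets the wider Record<string, unknown> parameter type stay safe in practice.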
Multi-iteration loops now open on the "all" tab so every iteration's steps are visible at once, consistent with the execution page's expand-by-default behavior. Single-iteration loops still show that one iteration. Users can still click an individual iter tab to focus.
… use
Bumps the agent_prompt await timeout default from 120s to 300s (5 min,
still capped at the 15-min ceiling). Short default was cutting off
open-ended, tool-using prompts ("invoke this skill and follow
instructions", "investigate and fix the failing test"), which are now a
first-class use case.
Documents that agent_prompt has full tool access (read/edit/bash/grep/
MCP/skills, operating on the repo checkout when provided) in both the
workflows skill and the draft-LLM system prompt, plus the one exception
(the interactive question tool fails in workflow context). Notes the
new default + when to raise the timeout.
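As a rough sketch of how the new default and the ceiling interact, the helper below resolves a step's requested timeout. The constant and function names are hypothetical, and falling back to the default for missing or non-positive values is an assumption; only the 300s default and the 15-minute cap come from this commit.

```typescript
const DEFAULT_AGENT_PROMPT_TIMEOUT_MS = 300_000; // 5 min default (from this commit)
const MAX_AGENT_PROMPT_TIMEOUT_MS = 900_000; // 15 min ceiling (from this commit)

// Hypothetical helper: turn an optional per-step timeout into the effective one.
function resolveAgentPromptTimeoutMs(requestedMs?: number): number {
  if (
    requestedMs === undefined ||
    !Number.isFinite(requestedMs) ||
    requestedMs <= 0
  ) {
    // Assumption: invalid or absent values fall back to the default.
    return DEFAULT_AGENT_PROMPT_TIMEOUT_MS;
  }
  // Larger requests are clamped to the ceiling rather than rejected.
  return Math.min(requestedMs, MAX_AGENT_PROMPT_TIMEOUT_MS);
}
```

A step that sets 20 minutes would therefore still wait at most 15, while a step that sets nothing waits 5.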
Syncs the inlined skill content (loop docs, inline-array over, agent_prompt tool-use + 5min timeout, conditional/notify corrections) into the D1 content registry. The skill .md was committed earlier; this is the generated copy that gets synced to sandboxes.
agent_prompt steps now stream a live assistant turn (text + tool calls) into the session chat, attributed to their step, instead of only emitting one workflow-chat-message at the end.

How: workflow agent sessions run on the ephemeral-session path, whose events early-returned before reaching the streaming handler. When the session is workflow-attributed (channel.workflowStepContext set by executeWorkflowAgentStep, which also sets activeMessageId), the ephemeral branch routes message.part.updated/delta to handlePartUpdated/handlePartDelta. The turn carries executionId/stepId/iterationPath via message.create so the client can group it under a per-step container. The separate assistant workflow-chat-message send is dropped (the streamed turn is the response); the user prompt message is kept.

Turn lifecycle (addresses codex review):
- Intermediate attempts (structured-output fixup, model failover) finalize as 'canceled' before their retry's resetPromptState, so the client can collapse them as prior attempts rather than showing invalid JSON as a completed turn.
- The kept attempt finalizes 'end_turn' (or 'error') in the outer finally.
- finalizeWorkflowTurn flushes any pending/running tools to completed so a closed turn never shows a forever-running tool.

Known pre-existing gap (not addressed here): @new ephemeral threads bypass the main message.updated/question.asked handlers, so usage/model attribution and the question-not-supported guard are degraded for those. Default/named threads that adopt the main session are unaffected. The step card sources model/tokens from step output, so the UI is covered.
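The finalize behavior in the turn lifecycle could look something like the sketch below. The part and turn shapes are invented for illustration (only the finish reasons and the flush-to-completed rule come from the commit message), and treating a second finalize as a no-op is an assumption.

```typescript
type ToolStatus = "pending" | "running" | "completed" | "error";
type FinishReason = "end_turn" | "error" | "canceled";

// Hypothetical part shapes; the runner's real types are not shown in this PR.
interface ToolCallPart { type: "tool-call"; callId: string; status: ToolStatus }
interface TextPart { type: "text"; text: string }
interface FinishPart { type: "finish"; reason: FinishReason }
type TurnPart = ToolCallPart | TextPart | FinishPart;

interface TurnSketch { parts: TurnPart[]; finalized: boolean }

function finalizeTurnSketch(turn: TurnSketch, reason: FinishReason): TurnSketch {
  // Assumption: finalizing an already-closed turn changes nothing.
  if (turn.finalized) return turn;
  // Flush in-flight tools so a closed turn never shows a forever-running tool.
  const parts: TurnPart[] = turn.parts.map((p): TurnPart =>
    p.type === "tool-call" && (p.status === "pending" || p.status === "running")
      ? { ...p, status: "completed" }
      : p,
  );
  parts.push({ type: "finish", reason });
  return { parts, finalized: true };
}
```

An intermediate attempt would be closed with 'canceled' before its retry starts, and the kept attempt with 'end_turn' or 'error'.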
The streamed workflow agent_prompt turn arrives via message.create. The DO handler now validates the runner-supplied executionId against the session's user (extracted into a shared resolveWorkflowBackpointers helper, reused by the workflow-chat-message handler) and threads the back-pointers through createTurn → the turn's message row, plus the client broadcast. Also fixes the message READ path (getSessionMessages / getThreadMessages) which dropped the workflow columns in its row mapper — so grouping works on refresh/initial-load, not just live over the WS. Extracted a shared mapMessageRow.
…iner in chat
The session chat now wraps an agent_prompt step's prompt + streamed assistant turn(s) in a bordered per-step container with a header (step name, persona, iteration, model/tokens, live status), instead of loose messages. Non-agent steps (bash/notify/loop/…) still render as step cards; plain chat renders as normal turns.
- buildChatRenderPlan (pure, tested) groups the merged feed: workflow-attributed messages + their step row → one step-container keyed by (executionId, stepId, iterationPath), emitted at the earliest item's position; steps with no attributed messages → step card; plain messages → turn runs. Loop iterations get distinct containers.
- WorkflowStepContainer renders the header + delegates message rendering back to MessageList via a callback (avoids a circular import), so prompts/assistant turns render identically inside and outside the container, including streamed tool-call parts.
- The agent_prompt step card is no longer shown standalone in chat (the container represents it); the execution page still shows the card summary.
Follow-up polish (not blocking): render structured-output as a parsed card inside the container rather than raw streamed JSON; collapse canceled prior attempts behind an affordance (the runner already finalizes them 'canceled').
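The grouping rule described for buildChatRenderPlan can be illustrated with a small pure function. This is a sketch under assumptions, not the client's implementation: the FeedItem and RenderItem shapes, the "|"-joined key, and sorting by a numeric ts field are all hypothetical.

```typescript
// Hypothetical feed shapes; the real client types are not shown in this PR.
type FeedItem =
  | { kind: "message"; id: string; ts: number; executionId?: string; stepId?: string; iterationPath?: string }
  | { kind: "step"; id: string; ts: number; executionId: string; stepId: string; iterationPath?: string };

type StepContainer = { type: "step-container"; key: string; itemIds: string[] };
type RenderItem =
  | { type: "turn-run"; messageIds: string[] }
  | StepContainer
  | { type: "step-card"; stepRowId: string };

function stepKey(executionId: string, stepId: string, iterationPath?: string): string {
  return [executionId, stepId, iterationPath ?? ""].join("|");
}

function buildRenderPlanSketch(feed: FeedItem[]): RenderItem[] {
  const sorted = [...feed].sort((a, b) => a.ts - b.ts);

  // Pass 1: find which steps have at least one attributed message.
  const keysWithMessages = new Set<string>();
  for (const item of sorted) {
    if (item.kind === "message" && item.executionId && item.stepId) {
      keysWithMessages.add(stepKey(item.executionId, item.stepId, item.iterationPath));
    }
  }

  // Pass 2: emit containers at their earliest item, cards for bare steps,
  // and runs of plain messages.
  const plan: RenderItem[] = [];
  const containers = new Map<string, StepContainer>();
  for (const item of sorted) {
    const key =
      item.executionId && item.stepId
        ? stepKey(item.executionId, item.stepId, item.iterationPath)
        : undefined;
    if (key !== undefined && keysWithMessages.has(key)) {
      let container = containers.get(key);
      if (!container) {
        container = { type: "step-container", key, itemIds: [] };
        containers.set(key, container);
        plan.push(container);
      }
      container.itemIds.push(item.id);
    } else if (item.kind === "step") {
      plan.push({ type: "step-card", stepRowId: item.id });
    } else {
      const last = plan[plan.length - 1];
      if (last && last.type === "turn-run") last.messageIds.push(item.id);
      else plan.push({ type: "turn-run", messageIds: [item.id] });
    }
  }
  return plan;
}
```

Including iterationPath in the key is what gives each loop iteration its own container even though the stepId repeats.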
…ep container
Two polish items on the workflow step container (codex UX recs #2/#3):
- Structured-output steps now render the parsed result as a kv-table card as the primary view; the raw streamed JSON turn is tucked into a collapsible "raw output" panel. Non-structured steps still render the assistant turn directly. During streaming (before output exists) it falls through to the streamed turn, then swaps to the card on completion.
- Superseded attempts (model failover / structured-output fixup), i.e. the assistant turns the runner finalized 'canceled', detected via their {type:'finish', reason:'canceled'} part, collapse behind a "N previous attempts" affordance so the latest attempt reads cleanly.
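Detecting superseded attempts from the finish part might be sketched like this. The {type:'finish', reason:'canceled'} marker is taken from the commit message; the turn shape and the helper names are hypothetical.

```typescript
// Hypothetical minimal shapes for an assistant turn and its parts.
type PartSketch = { type: string; reason?: string };
type AssistantTurnSketch = { id: string; parts: PartSketch[] };

// A turn is superseded if the runner closed it with a canceled finish part.
function isSupersededAttempt(turn: AssistantTurnSketch): boolean {
  return turn.parts.some((p) => p.type === "finish" && p.reason === "canceled");
}

// Split a step's assistant turns into the collapsed group and the visible ones.
function splitAttempts(turns: AssistantTurnSketch[]): {
  previous: AssistantTurnSketch[];
  current: AssistantTurnSketch[];
} {
  return {
    previous: turns.filter(isSupersededAttempt),
    current: turns.filter((t) => !isSupersededAttempt(t)),
  };
}
```

The length of previous would then drive the "N previous attempts" label, and the affordance can be hidden when it is zero.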
…flow threads
The garbled/interleaved chat text (e.g. "ThereThere's no...of's no...of any kind", JSON tripled) was the same SSE event being appended to streamedContent twice. Workflow agent threads that adopt the MAIN session (eventSessionId === this.sessionId; this is the common case, which is why usage/model attribution worked) hit the ephemeral branch (session is registered in ephemeralContent for response capture) AND, because the `eventSessionId !== this.sessionId` early-return doesn't fire for them, ALSO fall through to the main switch. My ephemeral-branch streaming addition therefore ran handlePartUpdated/handlePartDelta once, then the main switch ran them again: a double (delta+snapshot → triple) append.

Fix: gate the ephemeral-branch streaming on eventSessionId !== this.sessionId. Genuinely-ephemeral (@new) threads stream via the ephemeral branch (the main switch is skipped by the early-return); main-session threads stream via the main switch only. ensureTurnCreated reads workflowStepContext on either path, so the turn carries its step back-pointers either way. finalizeWorkflowTurn no-ops when the main handler already finalized (turnCreated reset), avoiding double-finalize.
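A toy model of the two code paths makes the double append and the gate easier to see. Everything here is a simplification for illustration: createStreamRouterSketch, its options, and the single streamedContent string are hypothetical stand-ins for the runner's real handlers.

```typescript
interface PartDeltaEvent { eventSessionId: string; delta: string }

// Hypothetical model of the ephemeral branch followed by the main switch.
function createStreamRouterSketch(opts: {
  sessionId: string;
  ephemeralSessionIds: Set<string>;
  workflowAttributed: boolean;
  gateEphemeralStreaming: boolean; // false models the bug, true models the fix
}) {
  let streamedContent = "";
  return {
    handle(event: PartDeltaEvent): void {
      const isMainSession = event.eventSessionId === opts.sessionId;

      // Ephemeral branch: streams for workflow-attributed sessions.
      if (opts.ephemeralSessionIds.has(event.eventSessionId) && opts.workflowAttributed) {
        if (!opts.gateEphemeralStreaming || !isMainSession) {
          streamedContent += event.delta;
        }
      }

      // Early return: events from other sessions never reach the main switch.
      if (!isMainSession) return;

      // Main switch: streams for the main session.
      streamedContent += event.delta;
    },
    content(): string {
      return streamedContent;
    },
  };
}
```

With the gate off, a main-session workflow thread appends each delta in both places; with it on, every event is appended by exactly one path.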
The sync_workflow OpenCode tool carried a stale step-type allowlist (agent, agent_message, subworkflow) that rejected valid agent_prompt/notify steps the worker had since migrated to. Remove the tool's duplicate type taxonomy and let the worker be the authoritative gate; add a positive allowlist to the worker validator so unknown/typo'd types fail at save time with a clear message instead of mid-execution.
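A positive allowlist of this kind might look like the sketch below. The list of step types is an assumption assembled from types mentioned elsewhere in this PR (it may be incomplete), and the function names and error wording are hypothetical.

```typescript
// Assumption: illustrative list only; the worker's real allowlist may differ.
const ALLOWED_STEP_TYPES = [
  "agent_prompt",
  "notify",
  "bash",
  "loop",
  "conditional",
  "tool",
] as const;
type StepType = (typeof ALLOWED_STEP_TYPES)[number];

function isAllowedStepType(value: string): value is StepType {
  return (ALLOWED_STEP_TYPES as readonly string[]).includes(value);
}

// Returns one message per offending step so the save can fail with a clear
// reason instead of the workflow failing mid-execution.
function validateStepTypes(steps: Array<{ id: string; type: string }>): string[] {
  return steps
    .filter((step) => !isAllowedStepType(step.type))
    .map(
      (step) =>
        `Step "${step.id}" has unknown type "${step.type}"; ` +
        `allowed types: ${ALLOWED_STEP_TYPES.join(", ")}`,
    );
}
```

Keeping the list in one place on the worker is the point of the change: a second copy in the tool is what drifted out of date.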
…tions
Serve the 22 baked workflow/trigger/execution/proposal OpenCode tools as worker-side actions via the existing list_tools/call_tool path, introducing an "internal provider" concept (credential-less, worker-DB access) in the action framework. Eliminates the image-rebuild tax and validator drift.
…viders
Computes an internalHandle when provider.internal is truthy and threads it into both execute() call sites in executeAction so internal providers receive worker-side db/env access without credential resolution.
Inject internal providers (where provider.internal === true) into the serviceSourceMap after the autoServices loop so they appear in listTools without requiring a connected integration or credentials.
…gger tools
Adds a credential-less internal IntegrationProvider + ActionSource under packages/worker/src/integrations/internal/workflows/ exposing all 22 workflow/trigger/execution/proposal actions. Each action delegates to the same service/db functions the route handlers call, so no logic is reimplemented.
…ger duplicate handling
Adds the workflows package to internalIntegrations so IntegrationRegistry resolves the 'workflows' service with its 22 actions. Uses lazy dynamic imports in actions.ts to break the pre-existing circular dependency chain (lib/db → workflow-runtime → orchestrator → env-assembly → credentials → registry → internal/index) that would cause internalIntegrations to be undefined during module initialization.
…o break import cycle
Move registration of worker-internal integration packages (workflows) from IntegrationRegistry.init() to the composition root (index.ts), eliminating the circular dependency: registry → internal/workflows/actions → lib/db → services → credentials → registry. Adds IntegrationRegistry.registerPackage() for runtime registration, restores static imports in actions.ts (removes lazy import() band-aid), and updates the test to call registerInternalIntegrations() explicitly.
Re-enforces the denyInWorkflowSession guard from the baked opencode tool files inside handleCallTool. When IS_WORKFLOW_SESSION=true is present in the session spawnRequest envVars, any call_tool invocation targeting the workflows service is rejected early with an explanatory error before policy resolution runs.
Workflow tools are now worker-side actions under the workflows service rather than always-loaded named tools. The skill now instructs the agent to call list_tools(service=workflows) first, then invoke tools via call_tool with namespaced ids (workflows:<action>). All bare tool-name references updated to use the workflows: prefix throughout, and a note added that destructive operations (delete_workflow, delete_trigger, rollback_workflow) route through the action risk policy and may require human approval.
…worker
The 22 workflow/trigger/execution/proposal tools are now served as worker-side actions via call_tool. Remove the dead baked copies from docker/opencode/tools/ along with the _workflow_session_guard.ts helper (only used by those tools). Bump IMAGE_BUILD_VERSION to bust the sandbox image cache and drop the removed files from the built image.
…date orchestrator spec
The blanket denial of all workflows:* tools in IS_WORKFLOW_SESSION sandboxes is replaced by a selective guard that only rejects the 8 self-mutation actions (sync_workflow, update_workflow, delete_workflow, rollback_workflow, sync_trigger, delete_trigger, review_workflow_proposal, apply_workflow_proposal). Read, list, run, and execution-control tools are now fully available in workflow sessions.

Adds WORKFLOW_SESSION_DENIED_ACTIONS constant at module scope, updates the guard condition in SessionAgentDO.handleCallTool, adds a test for the allowed list_workflows path, and rewrites the Workflow Session Tool Policy section in docs/specs/orchestrator.md to reflect the DO-layer enforcement.
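The selective guard could be sketched as below. The eight denied actions, the constant name, and the IS_WORKFLOW_SESSION=true check come from the commit messages; the workflowSessionDenial helper, its parameters, and the error text are hypothetical, and the tool ID is assumed to use the namespaced workflows:<action> form.

```typescript
const WORKFLOW_SESSION_DENIED_ACTIONS: ReadonlySet<string> = new Set([
  "sync_workflow",
  "update_workflow",
  "delete_workflow",
  "rollback_workflow",
  "sync_trigger",
  "delete_trigger",
  "review_workflow_proposal",
  "apply_workflow_proposal",
]);

// Hypothetical helper: returns an error message when the call must be
// rejected, or null when it may proceed to policy resolution.
function workflowSessionDenial(
  toolId: string,
  envVars: Record<string, string | undefined>,
): string | null {
  if (envVars["IS_WORKFLOW_SESSION"] !== "true") return null;

  // Assumption: tool IDs are namespaced as "<service>:<action>".
  const separator = toolId.indexOf(":");
  if (separator === -1) return null;
  const service = toolId.slice(0, separator);
  const action = toolId.slice(separator + 1);

  if (service !== "workflows" || !WORKFLOW_SESSION_DENIED_ACTIONS.has(action)) {
    return null;
  }
  return `${toolId} is not available inside a workflow session: workflows cannot modify workflow or trigger definitions while running.`;
}
```

Running the check before policy resolution means a denied call never reaches the approval flow, while read, list, run, and execution-control actions pass through unchanged.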
…e trigger fields; namespace debug hints
- resolveActionPolicy: hoist getProvider lookup and skip the integration-activation gate for internal providers (e.g. workflows) which have no integration row and are never in autoServices
- sync_trigger: validate required fields (webhook_path, schedule_cron, schedule_prompt for orchestrator target) before calling handleTriggerAction, mirroring the route's Zod schema
- buildDebugDiagnosis: update remediation hints to use namespaced tool IDs (workflows:approve_execution, workflows:get_execution, workflows:cancel_execution, workflows:get_execution_steps)
- Add TDD regression test in session-tools.internal.test.ts covering the activation-gate bypass for internal providers
Resolves migration numbering collision with origin/main. Origin owns 0013-0015 (user_action_policy_overrides, action_policy_target_shape_triggers, custom_mcp_connectors) which are already applied in production and cannot be renumbered. Our 7 local migrations touch disjoint tables (triggers, trigger_deliveries, workflow_* , messages) and are safely moved to 0016-0022 to apply after origin's.
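A collision like this can be caught mechanically before merging. The check below is a sketch only: it assumes migration files are named with a four-digit prefix followed by an underscore (for example 0016_something.sql), which this PR does not confirm, and the function names are hypothetical.

```typescript
// Assumption: filenames look like "NNNN_description.sql".
function migrationNumber(filename: string): number {
  const match = /^(\d{4})_/.exec(filename);
  if (!match) throw new Error(`Not a numbered migration: ${filename}`);
  return Number(match[1]);
}

// Reports duplicate numbers and the first gap in a sequence expected to run
// 0001..N without holes.
function checkMigrationSequence(filenames: string[]): string[] {
  const problems: string[] = [];
  const seen = new Map<number, string>();
  for (const name of filenames) {
    const n = migrationNumber(name);
    const prior = seen.get(n);
    if (prior !== undefined) {
      problems.push(`collision at ${n}: ${prior} vs ${name}`);
    } else {
      seen.set(n, name);
    }
  }
  const numbers = Array.from(seen.keys()).sort((a, b) => a - b);
  for (let i = 0; i < numbers.length; i++) {
    if (numbers[i] !== i + 1) {
      problems.push(`gap: expected ${i + 1}, found ${numbers[i]}`);
      break;
    }
  }
  return problems;
}
```

Run against the merged migrations directory, an empty result would indicate a contiguous, collision-free sequence.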
Reconciles 83 local commits with 207 commits from origin/main after a long divergence. Resolved a migration numbering collision (origin owns 0013-0015, already in production; local migrations renumbered to 0016-0022 in a prior commit) and 9 code conflicts:
- backend/images/base.py: bumped IMAGE_BUILD_VERSION to supersede both sides
- sdk integrations: kept both `internal` and `isCustomConnector` provider flags
- runner/bin.ts, agent-client.ts: kept both feature additions
- chat-container.tsx: combined workflow-step cards with thread-scoped thinking
- lib/db/triggers.ts: kept local schedule-trigger pagination
- services/session-tools.ts: merged internal-provider gate skip with origin's custom-connector activation logic; preserved try/catch + telemetry + orgId
- durable-objects/session-agent.ts: kept workflow-session guard + orgId resolve
- worker/index.ts: merged retention crons; rebuilt dispatchScheduledWorkflows to combine origin's two-pass catch-up with local delivery telemetry + globalActive
Updated internal-provider tests to mock loadCustomMcpConnectorContext. Verified: pnpm typecheck, client build, worker tests (721), runner tests (37).
wrangler pages deploy infers --branch from the local git branch when not specified. Deploying from a non-production branch (e.g. a reconcile/feature branch) produces a preview deployment on a per-branch subdomain, which breaks OAuth callbacks and authorized origins pinned to the stable Pages domain. Default PAGES_BRANCH to main so the client always lands on the production domain regardless of the working branch; override per-env if needed.
Summary
Workflows V2 moves workflow execution into a first-class, chat-integrated experience and relocates workflow tooling to the worker. Developed on a long-lived branch; this PR also folds in origin/main via a merge (see Merge notes).

What's in Workflows V2
- list_tools/call_tool.
- loop.over (path or inline array literal), agent_prompt + notify, 5-min default agent_prompt timeout.
- agent_prompt turns stream into the session chat; step cards interleaved by timestamp (useSessionFeed), WorkflowContextBar, structured-result cards, per-step containers; gated behind workflow_ui_chat_cards.
- trigger_deliveries telemetry, schedule-tick dedup.
- workflows:* tools.

Merge notes (origin/main reconciliation)
- 0013–0015 (already in prod); our 7 migrations renumbered to 0016–0022 (disjoint tables). Final sequence 0001–0022.
- session-tools.ts (internal-provider gate + custom-connector logic) and dispatchScheduledWorkflows (origin's two-pass catch-up + our delivery telemetry).

Verification
pnpm typecheck clean; client build clean; worker 721/721, runner 37/37.

Test plan
- session-tools.ts and dispatchScheduledWorkflows merges
- 0016–0022 apply in order