Summary
cortex-class pipelines (turn a GitHub issue into a merged PR) run today as imperative python-func code living in the external dap-cortex package. A portable DAP bundle can only reference that code (runtime_id: python-func, callable_path: cortex.nodes.coder:run); it can't contain the behaviour, so the JSON does nothing without the package installed in the engine venv.
python-func is DAP's deliberate escape hatch for imperative logic, and that's fine. But a large slice of what cortex does is generic (git, GitHub, guards, retry, sandboxed exec) and could be native, declaratively-configured DAP runtimes. If we add them, cortex-class pipelines become near-pure JSON (graph + agents that reference built-in runtimes + backend_profiles), with python-func left only for truly bespoke bits.
This epic captures (A) the runtimes/primitives that would make cortex expressible declaratively, (B) the keystone "tool-calling agent" runtime, and (C) a brainstorm of new runtimes beyond what cortex even has.
Background
DAP runtimes today: api-call, bash, http, claude-code, gemini-cli, codex, aider, python-func (adapters in packages/runtimes/src/dap_runtimes/adapters/, contract in base.py; dashboard schemas in apps/dashboard/src/components/agents/runtime-config-schemas.ts).
Per-node LLM selection already exists: backend_profiles (available + agent_assignments), resolved at run time in apps/engine/src/dap_engine/execution/backend_profiles.py.
Conditional routing already exists (edge comparison/logical conditions reading state). What's missing for cortex-class flows is the node-body capabilities that write that state.
Goals
Add generic, reusable runtimes so cortex-class behaviour is authorable as JSON config, not external code.
Each runtime: a BaseAdapter implementation + registry entry + dashboard runtime-config schema + TDD tests + docs.
Keep the engine generic — these are domain-agnostic primitives (git/GitHub/exec/retry), not cortex-specific.
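A minimal sketch of what "a BaseAdapter implementation + registry entry" could look like for a domain-agnostic runtime. The real contract lives in base.py and is not reproduced here, so the `BaseAdapter` shape, `RuntimeResult`, `REGISTRY`, `register`, and the toy `echo` runtime below are all hypothetical stand-ins.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any


@dataclass
class RuntimeResult:
    """Hypothetical result type: what a node writes back into pipeline state."""
    ok: bool
    state_updates: dict[str, Any] = field(default_factory=dict)


class BaseAdapter(ABC):
    """Stand-in for the contract in base.py (the real signature may differ)."""

    runtime_id: str

    @abstractmethod
    def run(self, config: dict[str, Any], state: dict[str, Any]) -> RuntimeResult:
        ...


# Hypothetical registry: runtime_id -> adapter class.
REGISTRY: dict[str, type[BaseAdapter]] = {}


def register(cls: type[BaseAdapter]) -> type[BaseAdapter]:
    """Class decorator that adds an adapter to the registry under its runtime_id."""
    REGISTRY[cls.runtime_id] = cls
    return cls


@register
class EchoAdapter(BaseAdapter):
    """Toy runtime: copies one configured state field into another."""

    runtime_id = "echo"

    def run(self, config: dict[str, Any], state: dict[str, Any]) -> RuntimeResult:
        source, target = config["from"], config["to"]
        if source not in state:
            return RuntimeResult(ok=False)
        return RuntimeResult(ok=True, state_updates={target: state[source]})


# The engine would look the adapter up by runtime_id and pass it JSON config.
result = REGISTRY["echo"]().run(
    {"from": "issue_title", "to": "pr_title"},
    {"issue_title": "Fix login"},
)
```

The point of the shape is that everything node-specific arrives as `config`, so the same adapter class can back many differently configured nodes in a bundle.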
Non-goals
Deleting python-func — it stays the escape hatch for genuinely bespoke logic (e.g. dispatcher issue-decomposition).
Re-implementing cortex inside this repo. We provide primitives; a thin bundle composes them.
Absorbing cortex's exact prompts/policies — those stay data (prompt_template, backend_profiles).
Part A — Runtimes/primitives to make cortex declarative
Each is an independently shippable slice.
github runtime — declarative GitHub ops: op: read_issue | comment | update_issue_section | create_branch | open_pr | merge_pr | read_pr, params templated from state; token via DAP env layering (role-separated tokens map to instance/project env vars). Replaces the GitHub half of every cortex node.
git/workspace-aware code runtime — either a dedicated git runtime or config flags on the existing code runtimes (claude-code/codex/aider): { branch, base, push, workspace } so the branch/checkout/push lifecycle is declarative instead of hand-coded in execution.py.
Declarative guards (flags on the code/git runtime): require_nonempty_diff (silent-zero-output), append_only (reflog rewrite guard), ancestry_guard (no force-push over operator commits). Lifted 1:1 from cortex/nodes/execution.py.
Backend fallback chain — extend backend_profiles with fallback: [...] (cortex already has this in agents.yaml). Engine tries the chain on BackendError.
Retry-policy primitive — declarative loop: { max_retries, on_failure: <route>, feedback_into: <state field> } so the tester→retry→coder loop (and its *_status short-circuits, see cortex _route_after_tester) is config, not Python. Needs a small condition/expression story for multi-field guards.
Workspace introspection into state — runtime exposes git diff/changed-files/commits to state so downstream prompt_template (Jinja) and edge conditions can use them declaratively (cortex's _collect_commits, retry-context).
Per-task budgets — surface max_turns/timeout per node/task in config (partially exists).
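The retry-policy primitive above can be sketched as a pure routing function. The fields `max_retries`, `on_failure`, and `feedback_into` come from the proposal; `RetryPolicy`, `next_route`, `retry_to`, `on_success`, and the state keys `tester_status`, `tester_output`, and `retry_count` are assumptions made only for this illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class RetryPolicy:
    max_retries: int
    on_failure: str            # route taken once retries are exhausted
    feedback_into: str         # state field that receives the failure feedback
    retry_to: str = "coder"    # hypothetical: node to loop back to
    on_success: str = "done"   # hypothetical: route when the check passes


def next_route(policy: RetryPolicy, state: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Return (route, new_state) without mutating the input state."""
    if state.get("tester_status") == "passed":
        return policy.on_success, dict(state)
    attempts = state.get("retry_count", 0)
    if attempts >= policy.max_retries:
        return policy.on_failure, dict(state)
    new_state = dict(state)
    new_state["retry_count"] = attempts + 1
    # Feed the failure output back so the next attempt can see it in its prompt.
    new_state[policy.feedback_into] = state.get("tester_output", "")
    return policy.retry_to, new_state


policy = RetryPolicy(max_retries=2, on_failure="escalate", feedback_into="retry_context")
state: dict[str, Any] = {"tester_status": "failed", "tester_output": "test_login failed"}
routes = []
for _ in range(3):
    route, state = next_route(policy, state)
    routes.append(route)
```

With a tester that keeps failing, the loop goes back to the coder twice and then takes the `on_failure` route. Multi-field guards such as the `*_status` short-circuits would need the condition/expression story the item mentions and are not modelled here.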
Part B — Keystone: tool-calling agent runtime
agent/tool-loop runtime — an LLM node that runs an agentic loop with a declarative toolset composed of other DAP runtimes (e.g. tools: [github, git, bash-sandboxed]). This is the real unlock: "an agent that reads the issue, edits code, runs tests, opens a PR" becomes JSON that wires built-in tools, instead of bespoke python-func. Bounded by max_turns/budget; every tool call audited.
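A rough sketch of the bounded, audited loop such a runtime would run. The LLM is replaced by a scripted stand-in so the example is runnable. `run_tool_loop`, the action dict shape, and the lambda tools are hypothetical; only `max_turns`, the audit requirement, and the tool names `github` and `bash-sandboxed` come from the proposal.

```python
from typing import Any, Callable

Tool = Callable[[dict[str, Any]], Any]
# A "model" here is any callable that maps the transcript to the next action.
Model = Callable[[list[dict[str, Any]]], dict[str, Any]]


def run_tool_loop(model: Model, tools: dict[str, Tool], max_turns: int):
    """Run until the model returns a final answer or max_turns is exhausted."""
    transcript: list[dict[str, Any]] = []
    audit: list[dict[str, Any]] = []
    for turn in range(max_turns):
        action = model(transcript)
        if action["type"] == "final":
            return action["output"], audit
        name, args = action["tool"], action.get("args", {})
        if name not in tools:
            result: Any = {"error": f"unknown tool: {name}"}
        else:
            result = tools[name](args)
        # Every tool call is recorded, whether it succeeded or not.
        audit.append({"turn": turn, "tool": name, "args": args, "result": result})
        transcript.append({"tool": name, "result": result})
    return None, audit  # budget exhausted without a final answer


def scripted_model(transcript: list[dict[str, Any]]) -> dict[str, Any]:
    """Stand-in for an LLM: read the issue, run the tests, then finish."""
    script = [
        {"type": "tool", "tool": "github", "args": {"op": "read_issue", "number": 1}},
        {"type": "tool", "tool": "bash-sandboxed", "args": {"cmd": "pytest -q"}},
        {"type": "final", "output": "tests pass"},
    ]
    return script[len(transcript)]


# In DAP these would be other runtimes exposed as tools; here they are fakes.
tools: dict[str, Tool] = {
    "github": lambda args: {"title": "Fix login"},
    "bash-sandboxed": lambda args: {"exit_code": 0},
}
output, audit = run_tool_loop(scripted_model, tools, max_turns=5)
```

Keeping the toolset as a plain name-to-callable mapping is what would let a JSON `tools: [...]` list select built-in runtimes without any bespoke code.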
Part C — New runtimes worth exploring (beyond what cortex has)
Group as exploration; spin promising ones into their own issues.
Execution & safety
Sandboxed exec runtime — run code in an isolated container (not in-process like cortex's bash, which runs with engine privileges). A genuine security upgrade cortex lacks.
Cost/budget guard primitive — halt/reroute when cumulative spend exceeds a cap.
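One possible shape for the cost/budget guard, assuming each node reports its own cost after it runs. `BudgetGuard`, `record`, the USD unit, and the route names are illustrative assumptions, not part of any existing DAP API.

```python
from dataclasses import dataclass


@dataclass
class BudgetGuard:
    cap_usd: float
    on_exceeded: str = "halt"   # hypothetical route name
    spent_usd: float = 0.0

    def record(self, cost_usd: float) -> str:
        """Add a node's cost and return the route: 'continue' or on_exceeded."""
        if cost_usd < 0:
            raise ValueError("cost must be non-negative")
        self.spent_usd += cost_usd
        return self.on_exceeded if self.spent_usd > self.cap_usd else "continue"


guard = BudgetGuard(cap_usd=1.00)
# Three nodes report their spend; the third pushes the total past the cap.
decisions = [guard.record(cost) for cost in (0.40, 0.50, 0.25)]
```

Returning a route name rather than raising keeps the guard compatible with both behaviours the item names: halting the run or rerouting to a cheaper path.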
Composition & scale
call runtime: a node that invokes another pipeline (reuse/modularity). Huge for building libraries of flows.
Data & retrieval
Human & integration
kubectl/terraform/docker ops with guards.
Quality
Suggested phasing
github runtime + git/push/guard flags on a code runtime → ~80% of cortex's node bodies become declarative.
Slice 3 (keystone): tool-calling agent runtime composing the above as tools.
Exploration: pick 1-2 Part C runtimes by demand (sandboxed exec + sub-pipeline are strong candidates).
Acceptance criteria (per slice)
New adapter implements the BaseAdapter contract; registered in the runtime registry.
Dashboard runtime-config schema added (runtime-config-schemas.ts) so the agent editor renders the config form.
TDD: unit + integration tests (real behaviour, not file-existence checks, per .claude/rules/infrastructure-quality.md); no clear-text secret logging.
Docs + a minimal example bundle showing the runtime in a pipeline.
ruff/mypy/pytest + dashboard typecheck/build green; Council review passes.
Open questions
Retry/guard multi-field conditions: extend edge conditions with a small expression language, or a dedicated retry-policy node?
Do git/GitHub belong as standalone runtimes or as tools consumed only by the Part B agent runtime? (Probably both: standalone for simple nodes, tools for the agent loop.)
Sandboxing model for the exec runtime (container image, network policy, filesystem scope).
Strategic note
This is a product decision as much as engineering: do we want the engine to grow an opinionated SDLC/DevOps toolkit (git/GitHub/guards), or stay generic and keep that domain in packages via python-func? Recommendation: build the generic, domain-agnostic primitives (github/git/exec/retry/sub-pipeline), which serve far more than cortex, and let domain policy stay in JSON bundles.
Related: #739 (managed/bundled cortex agents), area:cortex-integration.