diff --git a/docs/adr/0002-automated-ai-engineer-plugin-boundary.md b/docs/adr/0002-automated-ai-engineer-plugin-boundary.md new file mode 100644 index 0000000..67196ae --- /dev/null +++ b/docs/adr/0002-automated-ai-engineer-plugin-boundary.md @@ -0,0 +1,185 @@ +# ADR 0002 — The "Automated AI Engineer" plugin boundary (generic role vs deployment config) + +- **Status:** Proposed +- **Date:** 2026-07-04 +- **Deciders:** devantler-tech maintainers +- **Issue:** [#54](https://github.com/devantler-tech/agent-plugins/issues/54) (Part of [#51](https://github.com/devantler-tech/agent-plugins/issues/51), child 1; relates to roadmap epic [#38](https://github.com/devantler-tech/agent-plugins/issues/38), Theme 1 — *realize "not skills-only" as an agents+skills bundle*) + +> 🤖 Generated by the Daily AI Assistant + +## Context + +The Daily AI Engineer — the autonomous agent that operates *and* advances every devantler-tech product — +is defined **entirely inside the monorepo**: the `AGENTS.md` shared engineering contract plus the +`.claude/` agents and skills (`daily-maintainer` agent, `portfolio-surveyor` agent, and the +`portfolio-maintenance` / `product-engineering` / `self-improvement` skills). The maintainer prompt that +opened this thread (2026-07-03): + +> *"Consider if parts of the constitution stored in `AGENTS.md` files in the monorepo and submodules +> should be made into a dedicated 'Automated AI Engineer' plugin."* + +[#51](https://github.com/devantler-tech/agent-plugins/issues/51) assessed that a large part of the +constitution is **genuinely generic** — reusable by any org that wants an autonomous engineer over a +portfolio — but is not consumable outside the monorepo, and proposed an ADR-first, three-child path +(design → extract → consume). This ADR is **child 1**. It is documentation-only and changes no manifest +or CI (mirroring ADR-0001's split of design from its implementing child), because the load-bearing risk +is not the mechanics of extraction but the **boundary**: a bad split leaves half-generic text drifting in +two places. So the ADR's job is to draw that line and **name every section's home explicitly**, giving +children 2 and 3 a contract to implement. + +This also advances the marketplace's own roadmap: [#38](https://github.com/devantler-tech/agent-plugins/issues/38) +Theme 1 wants a flagship bundle that proves plugins are *not skills-only* — an agent+skills unit at real +scale. The Automated AI Engineer is exactly that (two agents + three skills). + +**Name collision to avoid.** An existing `agentic-engineering` plugin bundles SDK / instruction-authoring +/ skill-discovery skills for **building** agentic applications; `engineering-practices` bundles practice +skills (review, testing, design). This plugin is a **third, distinct concern** — the autonomous engineer +**role/actor** itself (the run loop + the agents that run it), not tools for building agents and not a bag +of practice skills. It is named `automated-ai-engineer` to keep the three audiences unblurred. + +## The dividing line: genericity and volatility agree + +Two independent axes sort the constitution, and they **point the same way**, which is what makes a clean +cut possible: + +- **Genericity** — is this reusable by *any* org running an autonomous engineer, or is it a fact about + *this* portfolio? +- **Volatility** — how often does it change? The *role* (how to survey, select, act, keep PRs healthy, + improve safely) is **slow-moving**; the *portfolio configuration* (which repos exist, which logins are + trusted, the cadence numbers) is **fast-moving** — the constitution's hottest churn. + +The generic parts are also the slow-moving parts, and the deployment-specific parts are also the +fast-moving parts. So a single line separates **the role (→ plugin)** from **the configuration (→ the +consuming repo's `AGENTS.md`)**, and — crucially for the self-improvement loop (D3) — keeps the churn +local. + +## Decision + +### D1 — Home of every section: role → plugin, configuration → consumer `AGENTS.md` + +**Generic — the role (→ plugin: the two agents + three skills):** + +| Constitution element | Plugin home | +|---|---| +| Run loop: survey → select → act → report; the operate-before-advance ladder; the "ship ≥1 concrete artifact, aim higher" floor | `portfolio-maintenance` skill | +| Issue-driven engineering: capture-before-build, drain-oldest-actionable-first, decompose-and-start; strategy/roadmap & triage; coverage/perf/refactor/docs levers | `product-engineering` skill | +| Self-improvement procedure: evidence-from-own-runs, guard-railed, one-concern-per-PR, **never weaken a guardrail** | `self-improvement` skill | +| Draft-PR checkpoint model (autonomy up to promotion; promotion = the human gate); "stop starting, start finishing" WIP discipline | across the skills (shared engineering contract) | +| PR hygiene triad — CI **+** review threads **and review-body findings** **+** conflicts; bot-reviewer engagement | `portfolio-maintenance` skill | +| Untrusted-input discipline & the trust-gate **pattern** (exact-login matching; external-PR static-review-only; never run untrusted branch code) — the *pattern*, not the concrete logins | shared contract | +| Per-run worktree execution model + git safety | shared contract | +| The actor and the read-only surveyor | `daily-maintainer`-equivalent + `portfolio-surveyor` **agent** definitions | + +**Deployment-specific — the configuration (→ stays in the consuming repo's `AGENTS.md`):** + +| Configuration element | Why it stays local | +|---|---| +| The **portfolio map** + per-product cards + each submodule's `AGENTS.md ## Maintenance` (validate commands, protected files, labels) | Repo-specific by definition; the fastest-churning surface | +| Concrete **trust-gate logins**, merge-queue / per-repo merge mechanics, prod-access facts | Facts about *this* org's accounts and infra | +| **Memory** location & schema; **cadence** numbers; **maintainer channels** (the ask tool, draft-PR steering) | Deployment wiring, not role logic | + +The rule for a disputed section: **a section is configuration (stays local) only if it is a +*deployment-owned fact* — one that identifies *this* org's repos, accounts, infrastructure, or wiring and +so must change when the role is installed on another portfolio. Everything that describes *how to decide or +act* is role and moves to the plugin — including decision thresholds (the `ship ≥1 artifact` floor, the +`≤2 consecutive easy-tick` gate), which are portable rule logic even though they embed a number. A bare +number is not the discriminator; *deployment-owned volatility* is: cadence frequencies and per-repo merge +mechanics stay local because they parameterize this deployment, not because they contain digits.** + +### D2 — Parameterization contract: skills cite named, consumer-supplied sections + +The plugin's skills are authored against a small, fixed set of **named contract sections** that the +consuming repo's `AGENTS.md` **must** define. A skill says *"consult the **Portfolio map**"*, *"apply the +**Trust gate**"*, *"at the **Cadence**"* — abstract references the consumer fills in. The required +contract sections a consumer must supply: + +- **Portfolio map** (+ per-product `## Maintenance` cards) +- **Trust gate** — concrete trusted logins **and** per-repo merge mechanics +- **Cadence** — run frequency + per-product review/docs rotation numbers +- **Memory** — the durable-store location & schema +- **Maintainer channels** — how to reach a human decision (e.g. an ask-tool prompt, draft-PR steering) + +Two consequences for child 2: + +1. **Skills' canonical home is [agent-skills](https://github.com/devantler-tech/agent-skills)** — this + marketplace's own rule is that agent-skills is the *single source of skills* and plugins bundle them via + `gh skill install`, **never hand-copied**. So child 2 first **generalizes** `portfolio-maintenance` / + `product-engineering` / `self-improvement` into agent-skills (rewritten to the contract-section + references above, with every devantler-tech specific removed), *then* the plugin bundles them. +2. **The agents bundle in the plugin's `agents/` directory** — realizing the *not-skills-only* capability + ADR-0001 is proving out, now as a genuine **agents+skills** unit (#38 T1 at real scale). + +### D3 — Self-improvement-loop implication (the explicit, accepted trade) + +Today the agent improves its **whole** constitution through **same-repo draft PRs** in the monorepo — one +repo, one promotion gate, fast. After extraction, an improvement's path depends on its home: + +- **Generic-core** change (run loop, hygiene triad, the self-improvement procedure itself) → an + **agent-plugins** PR (and an **agent-skills** PR for a skill body) **plus** a **consumer version-bump** + PR. More indirection (release + bump) than a single monorepo PR. +- **Deployment-config** change (portfolio map, a cadence number, a product card) → an unchanged + **same-repo** monorepo PR — fast. + +The friction lands **only** on the slow-moving role; the constitution's hot path (the ~weekly portfolio +config) stays local and fast. That is the whole point of splitting on volatility (D1), and it is the +reason the trade is **accepted** rather than a regression — but it is a real trade and is recorded here so +it is a deliberate choice, not a surprise. **Guardrail unchanged:** self-improvement still may never +weaken a safety control, in whichever repo the change lands; and all three repos are inside the trust +gate, so the loop stays fully devantler-tech-autonomous (draft-PR checkpoint, maintainer promotes). + +### D4 — Sequencing and the agent-skills dependency (scope of the follow-up children) + +- **Child 2 — extract:** generalize the three skills into **agent-skills** (contract-section references, + no devantler-tech specifics) and author the engineer + surveyor **agent** definitions; assemble + `plugins/automated-ai-engineer/` — a `plugin.json` carrying only `name` / `description` / `version` / + `author` / `keywords` (component paths **omitted**, auto-discovered, per ADR-0001 D1), with the skills + bundled and the agents under `agents/`. Validate with `scripts/validate-manifests.sh` + both manifests + + the README in parity, as the gate requires. +- **Child 3 — consume:** the monorepo installs the plugin; its `AGENTS.md` shrinks to the **deployment + configuration** + the named contract sections + pointers, and per-product cards stay local. + +**Named dependency:** the generalized engineering skills need a canonical home in agent-skills *before* +the plugin can bundle them — child 2 opens the agent-skills issue for that extraction as its first step. +This ADR itself makes **no** code/manifest change (documentation-only). + +## Considered alternatives + +- **Status quo — keep the whole constitution in the monorepo.** Rejected: no portability, and #38 T1's + flagship bundle stays unrealized. The maintainer prompt explicitly asks to reconsider this. +- **Parameterize *everything*, including the portfolio configuration, into the plugin.** Rejected: it + would force a plugin release + consumer bump on **every** portfolio change — friction exactly where the + constitution churns hardest — and it inverts the volatility split that makes D3's trade acceptable. +- **Collapse the three skills into one "ai-engineer" mega-skill.** Rejected: `portfolio-maintenance` + (run loop), `product-engineering` (advance), and `self-improvement` are already the natural, separately + reusable units; merging them loses that separation and their independent installability. +- **Put the generalized skills directly in the plugin dir, skipping agent-skills.** Rejected: violates + this marketplace's *single-source-of-skills* rule — every bundled skill must have its canonical upstream + in agent-skills and is installed via `gh skill install`, never hand-copied. +- **Fold the role into the existing `agentic-engineering` (or `engineering-practices`) plugin.** Rejected: + those bundle tools for *building* agentic apps / engineering *practices*; the autonomous-engineer role is + a distinct concern and audience, and co-bundling would blur all three (see *Context*). + +## Consequences + +**Positive** + +- Realizes #38 Theme 1's flagship **agents+skills** bundle at real scale, and the portability principle + (tool-neutral; Copilot CLI reads the same plugin schema — ADR-0001). +- Reuse beyond this org: any portfolio can install the role and supply its own contract sections. +- Forces the constitution to separate **role** from **configuration** cleanly — a clarity win even for the + monorepo, independent of the extraction landing. + +**Negative / risks** + +- **Iteration friction on the role** (release + consumer bump) — mitigated by the volatility split: only + the slow-moving role pays it; the hot portfolio config stays local (D3). +- **Drift if the split is sloppy** — mitigated by D1's per-section home table and the disputed-section + rule; children 2/3 must keep to it. +- **A dual-/tri-repo self-improvement loop** — more moving parts than one monorepo PR; bounded, still + autonomous, and never permitted to weaken a guardrail. + +## Follow-up + +- **Child 2 (extract)** and **Child 3 (consume)** of #51 — filed when this ADR is accepted. +- The agent-skills home for the three generalized skills — an agent-skills issue opened at the start of + child 2 (the D4 dependency).