185 changes: 185 additions & 0 deletions docs/adr/0002-automated-ai-engineer-plugin-boundary.md
@@ -0,0 +1,185 @@
# ADR 0002: The "Automated AI Engineer" plugin boundary (generic role vs deployment config)

- **Status:** Proposed
- **Date:** 2026-07-04
- **Deciders:** devantler-tech maintainers
- **Issue:** [#54](https://github.com/devantler-tech/agent-plugins/issues/54) (Part of [#51](https://github.com/devantler-tech/agent-plugins/issues/51), child 1; relates to roadmap epic [#38](https://github.com/devantler-tech/agent-plugins/issues/38), Theme 1: *realize "not skills-only" as an agents+skills bundle*)

> 🤖 Generated by the Daily AI Assistant

## Context

The Daily AI Engineer, the autonomous agent that operates *and* advances every devantler-tech product,
is defined **entirely inside the monorepo**: the `AGENTS.md` shared engineering contract plus the
`.claude/` agents and skills (`daily-maintainer` agent, `portfolio-surveyor` agent, and the
`portfolio-maintenance` / `product-engineering` / `self-improvement` skills). The maintainer prompt that
opened this thread (2026-07-03):

> *"Consider if parts of the constitution stored in `AGENTS.md` files in the monorepo and submodules
> should be made into a dedicated 'Automated AI Engineer' plugin."*

[#51](https://github.com/devantler-tech/agent-plugins/issues/51) assessed that a large part of the
constitution is **genuinely generic** (reusable by any org that wants an autonomous engineer over a
portfolio) but is not consumable outside the monorepo, and proposed an ADR-first, three-child path
(design → extract → consume). This ADR is **child 1**. It is documentation-only and changes no manifest
or CI (mirroring ADR-0001's split of design from its implementing child), because the load-bearing risk
is not the mechanics of extraction but the **boundary**: a bad split leaves half-generic text drifting in
two places. So the ADR's job is to draw that line and **name every section's home explicitly**, giving
children 2 and 3 a contract to implement.

This also advances the marketplace's own roadmap: [#38](https://github.com/devantler-tech/agent-plugins/issues/38)
Theme 1 wants a flagship bundle that proves plugins are *not skills-only*: an agent+skills unit at real
scale. The Automated AI Engineer is exactly that (two agents + three skills).

**Name collision to avoid.** An existing `agentic-engineering` plugin bundles SDK / instruction-authoring
/ skill-discovery skills for **building** agentic applications; `engineering-practices` bundles practice
skills (review, testing, design). This plugin is a **third, distinct concern**: the autonomous engineer
**role/actor** itself (the run loop + the agents that run it), not tools for building agents and not a bag
of practice skills. It is named `automated-ai-engineer` to keep the three audiences unblurred.

## The dividing line: genericity and volatility agree

Two independent axes sort the constitution, and they **point the same way**, which is what makes a clean
cut possible:

- **Genericity**: is this reusable by *any* org running an autonomous engineer, or is it a fact about
*this* portfolio?
- **Volatility**: how often does it change? The *role* (how to survey, select, act, keep PRs healthy,
improve safely) is **slow-moving**; the *portfolio configuration* (which repos exist, which logins are
trusted, the cadence numbers) is **fast-moving** and is the constitution's hottest churn.

The generic parts are also the slow-moving parts, and the deployment-specific parts are also the
fast-moving parts. So a single line separates **the role (→ plugin)** from **the configuration (→ the
consuming repo's `AGENTS.md`)** and, crucially for the self-improvement loop (D3), keeps the churn
local.

## Decision

### D1: Home of every section (role → plugin, configuration → consumer `AGENTS.md`)

**Generic, the role (→ plugin: the two agents + three skills):**

| Constitution element | Plugin home |
|---|---|
| Run loop: survey → select → act → report; the operate-before-advance ladder; the "ship ≥1 concrete artifact, aim higher" floor | `portfolio-maintenance` skill |
| Issue-driven engineering: capture-before-build, drain-oldest-actionable-first, decompose-and-start; strategy/roadmap & triage; coverage/perf/refactor/docs levers | `product-engineering` skill |
| Self-improvement procedure: evidence-from-own-runs, guard-railed, one-concern-per-PR, **never weaken a guardrail** | `self-improvement` skill |
| Draft-PR checkpoint model (autonomy up to promotion; promotion = the human gate); "stop starting, start finishing" WIP discipline | across the skills (shared engineering contract) |
| PR hygiene triad: CI **+** review threads **and review-body findings** **+** conflicts; bot-reviewer engagement | `portfolio-maintenance` skill |
| Untrusted-input discipline & the trust-gate **pattern** (exact-login matching; external-PR static-review-only; never run untrusted branch code); the *pattern*, not the concrete logins | shared contract |
| Per-run worktree execution model + git safety | shared contract |
| The actor and the read-only surveyor | `daily-maintainer`-equivalent + `portfolio-surveyor` **agent** definitions |

**Deployment-specific, the configuration (→ stays in the consuming repo's `AGENTS.md`):**

| Configuration element | Why it stays local |
|---|---|
| The **portfolio map** + per-product cards + each submodule's `AGENTS.md ## Maintenance` (validate commands, protected files, labels) | Repo-specific by definition; the fastest-churning surface |
| Concrete **trust-gate logins**, merge-queue / per-repo merge mechanics, prod-access facts | Facts about *this* org's accounts and infra |
| **Memory** location & schema; **cadence** numbers; **maintainer channels** (the ask tool, draft-PR steering) | Deployment wiring, not role logic |

The rule for a disputed section: **a section is configuration (stays local) only if it is a
*deployment-owned fact*, one that identifies *this* org's repos, accounts, infrastructure, or wiring and
so must change when the role is installed on another portfolio. Everything that describes *how to decide or
act* is role and moves to the plugin, including decision thresholds (the `ship ≥1 artifact` floor, the
`≤2 consecutive easy-tick` gate), which are portable rule logic even though they embed a number. A bare
number is not the discriminator; *deployment-owned volatility* is: cadence frequencies and per-repo merge
mechanics stay local because they parameterize this deployment, not because they contain digits.**
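
The disputed-section rule can be read as a one-input decision. The sketch below is illustrative only: the `Section` type, its field names, and the `home_of` function are hypothetical and not part of the constitution or the plugin. It assumes a human reviewer has already judged whether a section is a deployment-owned fact, and it records `contains_number` only to show that the rule never consults it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    """A constitution section awaiting a home (hypothetical type)."""

    name: str
    # True only if the text identifies this org's repos, accounts,
    # infrastructure, or wiring. Assumed to be judged by a reviewer.
    deployment_owned_fact: bool
    # Recorded only to demonstrate that the rule ignores it.
    contains_number: bool = False


def home_of(section: Section) -> str:
    """Apply the D1 disputed-section rule: only deployment-owned facts stay local."""
    # The presence of a number is deliberately not part of the decision.
    if section.deployment_owned_fact:
        return "consumer AGENTS.md"
    return "plugin"


examples = [
    Section("ship >=1 artifact floor", deployment_owned_fact=False, contains_number=True),
    Section("cadence frequencies", deployment_owned_fact=True, contains_number=True),
    Section("trust-gate pattern", deployment_owned_fact=False),
    Section("concrete trust-gate logins", deployment_owned_fact=True),
]

for s in examples:
    print(f"{s.name}: {home_of(s)}")
```

Both thresholds and cadence numbers carry digits, yet they land in different homes, which is the point of the rule.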

### D2: Parameterization contract (skills cite named, consumer-supplied sections)

The plugin's skills are authored against a small, fixed set of **named contract sections** that the
consuming repo's `AGENTS.md` **must** define. A skill says *"consult the **Portfolio map**"*, *"apply the
**Trust gate**"*, *"at the **Cadence**"*; these are abstract references the consumer fills in. The required
contract sections a consumer must supply:

- **Portfolio map** (+ per-product `## Maintenance` cards)
- **Trust gate**: concrete trusted logins **and** per-repo merge mechanics
- **Cadence**: run frequency + per-product review/docs rotation numbers
- **Memory**: the durable-store location & schema
- **Maintainer channels**: how to reach a human decision (e.g. an ask-tool prompt, draft-PR steering)
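
A consumer could check that its `AGENTS.md` supplies every required contract section. The following is a minimal sketch, not an existing tool: it assumes each contract section appears as a Markdown heading whose title matches the section name (case-insensitively), which this ADR does not prescribe. The function name `missing_contract_sections` is hypothetical.

```python
import re

# The five contract sections named in D2.
REQUIRED_SECTIONS = (
    "Portfolio map",
    "Trust gate",
    "Cadence",
    "Memory",
    "Maintainer channels",
)


def missing_contract_sections(agents_md: str) -> list[str]:
    """Return the required sections that have no matching heading.

    Assumption: a section counts as defined when an ATX heading
    (one to six '#' characters) carries exactly its name.
    """
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(
            r"^#{1,6}[ \t]+(.+?)[ \t]*$", agents_md, flags=re.MULTILINE
        )
    }
    return [name for name in REQUIRED_SECTIONS if name.lower() not in headings]


sample = """# AGENTS.md
## Portfolio map
(repos and per-product cards)
## Trust gate
(trusted logins and merge mechanics)
## Cadence
(run frequency)
"""

print(missing_contract_sections(sample))  # ['Memory', 'Maintainer channels']
```

An empty result would mean the consumer satisfies the naming half of the contract; whether each section's content is adequate remains a review question.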

Two consequences for child 2:

1. **Skills' canonical home is [agent-skills](https://github.com/devantler-tech/agent-skills)**: this
marketplace's own rule is that agent-skills is the *single source of skills* and plugins bundle them via
`gh skill install`, **never hand-copied**. So child 2 first **generalizes** `portfolio-maintenance` /
`product-engineering` / `self-improvement` into agent-skills (rewritten to the contract-section
references above, with every devantler-tech specific removed), *then* the plugin bundles them.
2. **The agents bundle in the plugin's `agents/` directory**, realizing the *not-skills-only* capability
ADR-0001 is proving out, now as a genuine **agents+skills** unit (#38 T1 at real scale).

### D3: Self-improvement-loop implication (the explicit, accepted trade)

Today the agent improves its **whole** constitution through **same-repo draft PRs** in the monorepo: one
repo, one promotion gate, fast. After extraction, an improvement's path depends on its home:

- **Generic-core** change (run loop, hygiene triad, the self-improvement procedure itself) → an
**agent-plugins** PR (and an **agent-skills** PR for a skill body) **plus** a **consumer version-bump**
PR. More indirection (release + bump) than a single monorepo PR.
- **Deployment-config** change (portfolio map, a cadence number, a product card) → an unchanged
**same-repo** monorepo PR, which stays fast.
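
The two paths can be summarized as a small routing function. This is a sketch under stated assumptions: the function `prs_for_change` and the string labels are hypothetical, and the returned list names which PRs a change needs without implying the order in which they are opened.

```python
def prs_for_change(home: str, touches_skill_body: bool = False) -> list[str]:
    """List the PRs an improvement needs after extraction, per D3.

    `home` is either "generic-core" or "deployment-config"
    (hypothetical labels for the two homes described in D1).
    """
    if home == "deployment-config":
        # The hot path: one same-repo PR in the monorepo, as today.
        return ["monorepo"]
    if home == "generic-core":
        prs = ["agent-plugins"]
        if touches_skill_body:
            # Skill bodies live upstream in agent-skills.
            prs.append("agent-skills")
        # The consumer picks up the release through a version bump.
        prs.append("consumer version bump")
        return prs
    raise ValueError(f"unknown home: {home!r}")


print(prs_for_change("deployment-config"))
print(prs_for_change("generic-core", touches_skill_body=True))
```

The asymmetry in list length is the accepted friction: one PR for configuration, two or three for the role.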

The friction lands **only** on the slow-moving role; the constitution's hot path (the ~weekly portfolio
config) stays local and fast. That is the whole point of splitting on volatility (D1), and it is the
reason the trade is **accepted** rather than a regression; but it is a real trade and is recorded here so
it is a deliberate choice, not a surprise. **Guardrail unchanged:** self-improvement still may never
weaken a safety control, in whichever repo the change lands; and all three repos are inside the trust
gate, so the loop stays fully devantler-tech-autonomous (draft-PR checkpoint, maintainer promotes).

### D4: Sequencing and the agent-skills dependency (scope of the follow-up children)

- **Child 2 (extract):** generalize the three skills into **agent-skills** (contract-section references,
no devantler-tech specifics) and author the engineer + surveyor **agent** definitions; assemble
`plugins/automated-ai-engineer/` with a `plugin.json` carrying only `name` / `description` / `version` /
`author` / `keywords` (component paths **omitted**, auto-discovered, per ADR-0001 D1), with the skills
bundled and the agents under `agents/`. Validate with `scripts/validate-manifests.sh`, and keep both
manifests and the README in parity, as the gate requires.
- **Child 3 (consume):** the monorepo installs the plugin; its `AGENTS.md` shrinks to the **deployment
configuration** + the named contract sections + pointers, and per-product cards stay local.
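
The "only these five keys" constraint on `plugin.json` is easy to check mechanically. The sketch below is not `scripts/validate-manifests.sh` and makes no claim about what that script does. The function `unexpected_manifest_keys` is hypothetical, every manifest value other than the plugin name is a placeholder, and the `agents` key stands in for any component-path key that should have been omitted.

```python
import json

# The five keys D4 allows in the plugin manifest.
ALLOWED_KEYS = {"name", "description", "version", "author", "keywords"}


def unexpected_manifest_keys(manifest_text: str) -> list[str]:
    """Return top-level keys that D4 says the manifest should not carry."""
    manifest = json.loads(manifest_text)
    return sorted(set(manifest) - ALLOWED_KEYS)


# Placeholder values throughout; only the plugin name comes from the ADR.
minimal = json.dumps(
    {
        "name": "automated-ai-engineer",
        "description": "placeholder description",
        "version": "0.0.0",
        "author": "placeholder author",
        "keywords": ["placeholder"],
    }
)

# A manifest that wrongly spells out a component path (hypothetical key).
with_paths = json.dumps({**json.loads(minimal), "agents": "./agents"})

print(unexpected_manifest_keys(minimal))
print(unexpected_manifest_keys(with_paths))
```

Keeping the check as a set difference means any future component-path key is flagged without the checker having to enumerate them.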

**Named dependency:** the generalized engineering skills need a canonical home in agent-skills *before*
the plugin can bundle them; child 2 opens the agent-skills issue for that extraction as its first step.
This ADR itself makes **no** code/manifest change (documentation-only).

## Considered alternatives

- **Status quo: keep the whole constitution in the monorepo.** Rejected: no portability, and #38 T1's
flagship bundle stays unrealized. The maintainer prompt explicitly asks to reconsider this.
- **Parameterize *everything*, including the portfolio configuration, into the plugin.** Rejected: it
would force a plugin release + consumer bump on **every** portfolio change (friction exactly where the
constitution churns hardest), and it inverts the volatility split that makes D3's trade acceptable.
- **Collapse the three skills into one "ai-engineer" mega-skill.** Rejected: `portfolio-maintenance`
(run loop), `product-engineering` (advance), and `self-improvement` are already the natural, separately
reusable units; merging them loses that separation and their independent installability.
- **Put the generalized skills directly in the plugin dir, skipping agent-skills.** Rejected: violates
this marketplace's *single-source-of-skills* rule: every bundled skill must have its canonical upstream
in agent-skills and is installed via `gh skill install`, never hand-copied.
- **Fold the role into the existing `agentic-engineering` (or `engineering-practices`) plugin.** Rejected:
those bundle tools for *building* agentic apps / engineering *practices*; the autonomous-engineer role is
a distinct concern and audience, and co-bundling would blur all three (see *Context*).

## Consequences

**Positive**

- Realizes #38 Theme 1's flagship **agents+skills** bundle at real scale, and the portability principle
(tool-neutral; Copilot CLI reads the same plugin schema, per ADR-0001).
- Reuse beyond this org: any portfolio can install the role and supply its own contract sections.
- Forces the constitution to separate **role** from **configuration** cleanly, a clarity win even for the
monorepo, independent of the extraction landing.

**Negative / risks**

- **Iteration friction on the role** (release + consumer bump): mitigated by the volatility split: only
the slow-moving role pays it; the hot portfolio config stays local (D3).
- **Drift if the split is sloppy**: mitigated by D1's per-section home table and the disputed-section
rule; children 2/3 must keep to it.
- **A dual-/tri-repo self-improvement loop**: more moving parts than one monorepo PR; bounded, still
autonomous, and never permitted to weaken a guardrail.

## Follow-up

- **Child 2 (extract)** and **Child 3 (consume)** of #51, filed when this ADR is accepted.
- The agent-skills home for the three generalized skills: an agent-skills issue opened at the start of
child 2 (the D4 dependency).