Pin an explicit model per agent; stop floating on GH_AW_MODEL_AGENT_COPILOT

## Problem

Five agents run bare `engine: copilot`, so their model floats on `GH_AW_MODEL_AGENT_COPILOT` (default `claude-sonnet-4.6`, consumer-overridable): `sdd-spec`, `sdd-triage`, `sdd-dispatch`, `sdd-validate`, `sdd-review`. [#248](https://github.com/norrietaylor/spectacles/pull/248) showed why floating is a footgun: a consumer's model var silently resolved `distillery-sync` to opus and blew the per-run effective-token cap (429). Any unpinned agent is one consumer var away from a silent re-model — cost and behavior both change without a diff anywhere in this repo.

The model is also a per-agent fit question, not a global default. The pipeline's cost shape is asymmetric: triage runs once per feature and its errors multiply downstream (one bad decomposition = N bad task PRs, [#252](https://github.com/norrietaylor/spectacles/issues/252)); execute/validate/review run per task.

## Proposal

Pin `engine: {id: copilot, model: ...}` per agent (the literal compiles into the lock and wins over consumer vars, per the [#248](https://github.com/norrietaylor/spectacles/pull/248) precedent):

| Agent | Pin | Rationale |
|---|---|---|
| `sdd-triage` | `claude-opus-4.6` | Heaviest reasoner (architecture, task DAG, assumption ledger, sizing). Runs once per feature; errors fan out multiplicatively. Best place in the system to pay for the top tier. |
| `sdd-spec` | `claude-sonnet-4.6` | Judgment-heavy authoring + the agile/full classifier, but bounded by human gates (spec PR review, `/approve`). Pin for classifier consistency across consumers. |
| `sdd-validate` | `claude-sonnet-4.6` | Structured verification against explicit R-IDs/proof artifacts. Guards merge — too important for haiku, too structured for opus. |
| `sdd-review` | `claude-sonnet-4.6` | One of three review nets (CodeRabbit, consumer CI). |
| `sdd-dispatch` | `claude-haiku-4.5` | Judgment moved to composite actions (`sdd-dispatch-compute`, dedupe, promote-ready, cycle-detect); remaining agent work is orchestration. Same logic as the `distillery-sync` pin. |
| `sdd-execute-{haiku,sonnet,opus}` | unchanged | Already tier-bound via `model:*` task label; verify the opus tier tracks the newest opus the Copilot proxy offers. |
| `distillery-sync` | unchanged | Pinned `claude-haiku-4.5` since [#248](https://github.com/norrietaylor/spectacles/pull/248). |

## Open questions

1. Keep a consumer override anywhere? Recommendation: hard-pin the mechanical agents; honor `GH_AW_MODEL_AGENT_COPILOT` only on `sdd-spec`/`sdd-review` (where model taste plausibly varies per consumer), and document that the rest are pinned by design.
2. `SDD_AUTO_MERGE=1` repos: `sdd-review` is the last line before unattended merge — worth an opus override knob (e.g. `SDD_REVIEW_MODEL`) or just documentation?
3. Confirm which model IDs the Copilot proxy currently serves before pinning (the opus tier may have a newer option than `claude-opus-4.6`).
4. Cost telemetry: the ADR 0020 OTLP spans carry token usage per agent — worth a baseline read before/after to validate the triage-up/dispatch-down trade nets near zero.

## Acceptance

- Every `engine:`-bearing agent either pins an explicit model or documents why it floats.
- A consumer setting `GH_AW_MODEL_AGENT_COPILOT` cannot re-model a hard-pinned agent (lock emits the literal).
- `docs/sdd/install.md` documents the per-agent models and any remaining override surface.

<sub>Filed from a model-fit review of all agents; informed by the consumer pilot run.</sub>

Agent	Pin	Rationale
`sdd-triage`	`claude-opus-4.6`	Heaviest reasoner (architecture, task DAG, assumption ledger, sizing). Runs once per feature; errors fan out multiplicatively. Best place in the system to pay for the top tier.
`sdd-spec`	`claude-sonnet-4.6`	Judgment-heavy authoring + the agile/full classifier, but bounded by human gates (spec PR review, `/approve`). Pin for classifier consistency across consumers.
`sdd-validate`	`claude-sonnet-4.6`	Structured verification against explicit R-IDs/proof artifacts. Guards merge — too important for haiku, too structured for opus.
`sdd-review`	`claude-sonnet-4.6`	One of three review nets (CodeRabbit, consumer CI).
`sdd-dispatch`	`claude-haiku-4.5`	Judgment moved to composite actions (`sdd-dispatch-compute`, dedupe, promote-ready, cycle-detect); remaining agent work is orchestration. Same logic as the `distillery-sync` pin.
`sdd-execute-{haiku,sonnet,opus}`	unchanged	Already tier-bound via `model:*` task label; verify the opus tier tracks the newest opus the Copilot proxy offers.
`distillery-sync`	unchanged	Pinned `claude-haiku-4.5` since #248.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pin an explicit model per agent; stop floating on GH_AW_MODEL_AGENT_COPILOT #269

Problem

Proposal

Open questions

Acceptance

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Pin an explicit model per agent; stop floating on GH_AW_MODEL_AGENT_COPILOT #269

Description

Problem

Proposal

Open questions

Acceptance

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions