[analyze 2/3] persona-discoverer: built-in persona that clusters work into starter personas

Part of the `agentworkforce analyze` feature. **Issue 2 of 3.** Consumes the JSON from #75 and produces the proposals that #77 will walk.

**Depends on #71.** This issue assumes the persona-kit migration (#64–#71) has shipped. The file targets below reflect the post-migration layout.

## Goal

Add a new internal built-in persona `persona-discoverer` that reads the gathered signal JSON (from #75) and emits a JSON proposals file describing 3–7 distinct starter personas, each grounded in a real cluster of work the repo has been doing.

The analyzer is a persona — not bespoke code — because clustering "ways of working" is a judgment task where heuristics produce shallow buckets, and because keeping the clustering logic in a persona JSON lets users iterate on it the same way they would `persona-improver`.

## Files to touch

**New:**

- `personas/persona-discoverer.json` — built-in persona spec.

**Modify:**

- **`@agentworkforce/persona-kit`** — add `'persona-discovery'` to the `PERSONA_INTENTS` constant (or whatever the union of allowed intents is called in persona-kit post-#68). Persona types and the intent enum live here after the migration, not in workload-router.
- **`packages/workload-router/routing-profiles/default.json`** — add a routing rule for `persona-discovery` matching the `persona-improvement` rule's shape. Per #68's framing ("drop persona *types*, keep routing"), routing profiles stay in workload-router. Confirm before editing — if #68 moved routing profiles too, follow them.
- **Built-in persona catalog** — the script that turns `/personas/*.json` into a generated TS catalog (currently `packages/workload-router/scripts/generate-personas.mjs` producing `packages/workload-router/src/generated/personas.ts`) very likely moved into persona-kit during #65/#68. Find whichever script owns catalog generation post-migration, run it, and verify it picks `persona-discoverer.json` up automatically; update its intent filter if it has one. Do **not** hand-edit the generated file.
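
For the intent change, a minimal sketch, assuming persona-kit exposes the allowed intents as a `const` tuple from which the union type is derived. The neighbouring entry and the `isPersonaIntent` guard are illustrative only; follow whatever form persona-kit actually uses after #68.

```typescript
// Hypothetical shape of the intents constant; the real list lives in persona-kit.
const PERSONA_INTENTS = [
  'persona-improvement', // existing intent whose routing rule this issue mirrors
  'persona-discovery', // new intent added by this issue
] as const;

// Union type derived from the constant, so adding an entry widens the type too.
type PersonaIntent = (typeof PERSONA_INTENTS)[number];

// Illustrative guard, not a confirmed persona-kit export.
function isPersonaIntent(value: string): value is PersonaIntent {
  return (PERSONA_INTENTS as readonly string[]).indexOf(value) !== -1;
}
```

If the real constant is an enum or a plain union type instead, the change is still a single added member.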

## Persona shape

Pattern: copy the **input/output contract** style from `personas/persona-improver.json`, and the **sparse `systemPrompt` + rich `agentsMdContent`** style from `personas/persona-maker.json`.

- `id`: `persona-discoverer`
- `intent`: `persona-discovery`
- `tags`: `["discovery", "planning"]`
- `description`: one or two plain sentences — what it does, when to use it.
- `tiers`:
  - `best`: codex / `gpt-5.3-codex` / reasoning `high` / `timeoutSeconds: 1200` / `sandboxMode: workspace-write` / `workspaceWriteNetworkAccess: true`
  - `best-value`: opencode / `gpt-5-nano` / reasoning `medium` / `timeoutSeconds: 900`
  - `minimum`: opencode / `minimax-m2.5-free` / reasoning `low` / `timeoutSeconds: 600`
  - Each tier's `systemPrompt`: `"$TASK_DESCRIPTION"` (matches persona-maker — the heavy spec lives in `agentsMdContent`).
- `inputs`:
  - `ANALYSIS_INPUT_PATH` — abs path to gather JSON (from #75).
  - `PROPOSALS_OUTPUT_PATH` — abs path the persona must write its proposals JSON to.
  - `TARGET_DIR` — abs dir where accepted personas will land (informs id-collision avoidance).
  - `TASK_DESCRIPTION` — optional, only set when invoked from `agentworkforce pick`.
- `agentsMdContent`: the operating spec (see below).
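
As a rough draft of how the fields above fit together, here is the spec written as a TypeScript object literal. The key names inside each tier (`harness`, `model`, `reasoning`), the shape of each `inputs` entry, and the `description` wording are assumptions made for illustration; copy the real key names from `personas/persona-maker.json` and `personas/persona-improver.json` when authoring the JSON.

```typescript
// Illustrative draft only: tier key names and the `inputs` entry shape are
// assumptions, not the confirmed PersonaSpec schema.
const personaDiscoverer = {
  id: 'persona-discoverer',
  intent: 'persona-discovery',
  tags: ['discovery', 'planning'],
  // Placeholder wording; the issue only asks for one or two plain sentences.
  description:
    'Clusters gathered repo signal into starter persona proposals. Use after the analyze gather step.',
  tiers: {
    best: {
      harness: 'codex',
      model: 'gpt-5.3-codex',
      reasoning: 'high',
      timeoutSeconds: 1200,
      sandboxMode: 'workspace-write',
      workspaceWriteNetworkAccess: true,
      systemPrompt: '$TASK_DESCRIPTION',
    },
    'best-value': {
      harness: 'opencode',
      model: 'gpt-5-nano',
      reasoning: 'medium',
      timeoutSeconds: 900,
      systemPrompt: '$TASK_DESCRIPTION',
    },
    minimum: {
      harness: 'opencode',
      model: 'minimax-m2.5-free',
      reasoning: 'low',
      timeoutSeconds: 600,
      systemPrompt: '$TASK_DESCRIPTION',
    },
  },
  inputs: {
    ANALYSIS_INPUT_PATH: { description: 'Absolute path to the gather JSON (from #75).' },
    PROPOSALS_OUTPUT_PATH: { description: 'Absolute path the proposals JSON is written to.' },
    TARGET_DIR: { description: 'Absolute dir where accepted personas will land.' },
    TASK_DESCRIPTION: { description: 'Set only when invoked from agentworkforce pick.', optional: true },
  },
  agentsMdContent: '<operating spec, see outline>',
};
```

The object serializes directly with `JSON.stringify(personaDiscoverer, null, 2)` once the key names have been checked against the existing built-ins.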

## Output contract (what the persona writes to `PROPOSALS_OUTPUT_PATH`)

```json
{
  "analysisInputPath": "<abs>",
  "proposals": [
    {
      "id": "kebab-case-id",
      "summary": "<= 80 chars, one line",
      "rationale": "Paragraph citing concrete signal from the analysis input: specific commits, files, PRs, sessions. No marketing.",
      "persona": { /* full PersonaSpec; the type lives in persona-kit post-migration */ }
    }
  ]
}
```

Each `persona` must validate against the existing `PersonaSpec` interface — `id`, `intent`, `tags`, `description`, `skills`, `tiers.{best,best-value,minimum}`, optional `mount` / `permissions` / `inputs`. If the persona declares skills, run `npx skills find <kw>` first (per the persona-maker spec) and only include skills that actually exist.
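
The envelope around each `persona` can be checked mechanically in the smoke test. The sketch below covers only the constraints stated in the contract (kebab-case `id`, one-line `summary` of at most 80 characters, non-empty `rationale`) plus a duplicate-id check; `checkProposals` and the structural types are hypothetical helpers, and full `PersonaSpec` validation is deliberately left out.

```typescript
// Minimal structural types for the proposals file. `persona` is left mostly
// opaque; validating it against PersonaSpec is a separate step.
interface Proposal {
  id: string;
  summary: string;
  rationale: string;
  persona: { id: string };
}

interface ProposalsFile {
  analysisInputPath: string;
  proposals: Proposal[];
}

const KEBAB_CASE = /^[a-z0-9]+(-[a-z0-9]+)*$/;

// Returns human-readable problems; an empty list means the envelope passes
// the checks sketched here.
function checkProposals(file: ProposalsFile): string[] {
  const problems: string[] = [];
  const seen: string[] = [];
  for (const p of file.proposals) {
    if (!KEBAB_CASE.test(p.id)) problems.push(`${p.id}: id is not kebab-case`);
    if (seen.indexOf(p.id) !== -1) problems.push(`${p.id}: duplicate id`);
    seen.push(p.id);
    if (p.summary.length > 80 || p.summary.indexOf('\n') !== -1) {
      problems.push(`${p.id}: summary must be one line of at most 80 chars`);
    }
    if (p.rationale.trim().length === 0) problems.push(`${p.id}: rationale is empty`);
  }
  return problems;
}
```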

## `agentsMdContent` outline

The spec should cover:

- **Read the analysis input.** Specifically: walk `commits`, `hotFiles`, `prs`, `codebase`, `sessions`. Quote concrete signal (sha prefixes, file paths, PR numbers) in `rationale`.
- **Cluster shape.** Name clusters by *the work*, not the code area. Good: "Database migration writer". Bad: "Person who touches `src/db/`".
- **Cluster count.** 3–7. Fewer if the repo is small; more if there's clear separation. Don't pad to hit a number.
- **Conflict avoidance.** Read `TARGET_DIR` for existing persona files; don't reuse those ids.
- **Skill curation.** Run `npx skills find <kw>` for any skill before declaring it. Drop trivial single-flag CLIs.
- **Tier defaults.** Codex@best, opencode@best-value, opencode@minimum — same shape as persona-maker. Override per-persona only when the work genuinely benefits.
- **Mount discipline.** Use `mount.readonlyPatterns` to scope each persona to the directory cluster the signal pointed to — don't grant universal read/write.
- **Output discipline.** Write the proposals JSON to `PROPOSALS_OUTPUT_PATH` and exit. Do not print proposals to stdout. Do not write any other files.
- **Anti-goals.** Don't propose duplicates of `persona-maker`, `persona-improver`, `persona-discoverer`. Don't propose meta/management personas. Don't draft personas the signal doesn't actually support.
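
The conflict-avoidance and anti-goal rules are instructions for the persona, not code it ships, but the id rule is simple enough to pin down for a test harness. A minimal sketch, assuming existing personas in `TARGET_DIR` are stored as `<id>.json` files; both helper names are hypothetical.

```typescript
// Ids the outline says must never be re-proposed.
const RESERVED_IDS = ['persona-maker', 'persona-improver', 'persona-discoverer'];

// Assumption: each existing persona in TARGET_DIR is a file named `<id>.json`.
function idsFromFilenames(filenames: string[]): string[] {
  return filenames
    .filter((name) => /\.json$/.test(name))
    .map((name) => name.replace(/\.json$/, ''));
}

// A proposed id is usable only if it is neither reserved nor already taken.
function isIdAvailable(candidate: string, existingIds: string[]): boolean {
  return RESERVED_IDS.indexOf(candidate) === -1 && existingIds.indexOf(candidate) === -1;
}
```

A harness would pass the `TARGET_DIR` directory listing to `idsFromFilenames` and assert `isIdAvailable` for every proposal id in the output.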

## Tasks

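- [ ] Add `'persona-discovery'` to `PERSONA_INTENTS` in `@agentworkforce/persona-kit` (post-migration the intent enum no longer lives in `workload-router/src/index.ts`; see "Files to touch").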
- [ ] Author `personas/persona-discoverer.json` with the shape above. Cross-check against `personas/persona-maker.json` and `personas/persona-improver.json` for consistency.
- [ ] Add a routing rule for `persona-discovery` in `packages/workload-router/routing-profiles/default.json` (mirror `persona-improvement`).
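- [ ] Run the catalog generation script once (pre-migration this was `corepack pnpm --filter @agentworkforce/workload-router run dev`; use the persona-kit equivalent if the script moved) and confirm the generated catalog (pre-migration: `packages/workload-router/src/generated/personas.ts`) picks up the new built-in.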
- [ ] Validate the persona: `npm run dev:cli -- agent persona-discoverer@best --dry-run` — must pass.
- [ ] Smoke test: write a minimal canned analysis JSON to `/tmp/canned.json` (2–3 fake commits, 1 PR, 1 package), invoke the persona headless with `ANALYSIS_INPUT_PATH=/tmp/canned.json PROPOSALS_OUTPUT_PATH=/tmp/proposals.json TARGET_DIR=/tmp/personas`, and verify the output file is valid JSON matching the contract above and that each proposed persona's JSON passes `agentworkforce agent <written-persona>@best-value --dry-run`.
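
A possible starting point for the canned input in the smoke test. The top-level keys are the ones the `agentsMdContent` outline tells the persona to walk; every field inside them is a placeholder, since the real schema is whatever #75 emits, and all values are fake.

```typescript
// Fake fixture: top-level keys follow the outline, inner fields are guesses
// to be replaced with the schema produced by #75.
const cannedAnalysis = {
  commits: [
    { sha: 'a1b2c3d', message: 'add users table migration', files: ['src/db/migrations/001_users.sql'] },
    { sha: 'e4f5a6b', message: 'add orders table migration', files: ['src/db/migrations/002_orders.sql'] },
  ],
  hotFiles: ['src/db/migrations/001_users.sql'],
  prs: [{ number: 1, title: 'Add users and orders tables' }],
  codebase: { packages: [{ name: 'example-package' }] },
  sessions: [],
};

// Serialized form to write to /tmp/canned.json before invoking the persona.
const cannedJson = JSON.stringify(cannedAnalysis, null, 2);
```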

## Verification

- [ ] `corepack pnpm -r build` clean.
- [ ] `corepack pnpm run check` clean.
- [ ] `agentworkforce list` shows `persona-discoverer` after the regenerated catalog ships.
- [ ] `agentworkforce show persona-discoverer@best` prints the persona without warnings.
- [ ] Each persona in the smoke-test proposals JSON, once extracted and written under `TARGET_DIR`, passes `--dry-run` cleanly.
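
For the extraction step in the last check, a small sketch of how a test harness might turn the proposals into files. It assumes one file per persona named `<persona id>.json` and POSIX-style paths; `planPersonaFiles` is a hypothetical helper, and it only computes paths and contents so the actual writes stay with the harness.

```typescript
interface ProposalEntry {
  id: string;
  persona: { id: string };
}

interface PlannedFile {
  path: string;
  contents: string;
}

// Assumption: each persona is written as `<TARGET_DIR>/<persona id>.json`.
function planPersonaFiles(targetDir: string, proposals: ProposalEntry[]): PlannedFile[] {
  const dir = targetDir.replace(/\/+$/, ''); // tolerate a trailing slash
  return proposals.map((p) => ({
    path: `${dir}/${p.persona.id}.json`,
    contents: JSON.stringify(p.persona, null, 2) + '\n',
  }));
}
```

Each planned file is written to disk by the harness and then checked with `agentworkforce agent <written-persona>@best-value --dry-run`, as described in the smoke-test task.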

## Constraints

- **Built-in persona, not a persona pack.** It lives under `/personas/` like `persona-maker` and `persona-improver` and ships with the CLI; not under `packages/personas-core/`.
- **Model-agnostic prompt.** `agentsMdContent` must not name specific models or hardcode tier identities — the same prompt drives all three tiers.
- **No new dependencies.**
- **Don't write outside `PROPOSALS_OUTPUT_PATH`.** The persona may read `TARGET_DIR` to avoid id collisions, but it must not write to it; that's #77's job.


