[analyze 2/3] persona-discoverer: built-in persona that clusters work into starter personas #76

@willwashburn

Part of the agentworkforce analyze feature. Issue 2 of 3. Consumes the JSON from #75 and produces the proposals that #77 will walk.

Depends on #71. This issue assumes the persona-kit migration (#64–#71) has shipped. The file targets below reflect the post-migration layout.

Goal

Add a new internal built-in persona persona-discoverer that reads the gathered signal JSON (from #75) and emits a JSON proposals file describing 3–7 distinct starter personas, each grounded in a real cluster of work the repo has been doing.

The analyzer is a persona — not bespoke code — because clustering "ways of working" is a judgment task where heuristics produce shallow buckets, and because keeping the clustering logic in a persona JSON lets users iterate on it the same way they would persona-improver.

Files to touch

New:

  • personas/persona-discoverer.json — built-in persona spec.

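Modify:

  • workload-router/src/index.ts — add 'persona-discovery' to PERSONA_INTENTS.
  • packages/workload-router/routing-profiles/default.json — add a routing rule for persona-discovery.
  • packages/workload-router/src/generated/personas.ts — regenerated by the dev run listed under Tasks.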

Persona shape

Pattern: copy the input/output contract style from personas/persona-improver.json, and the sparse systemPrompt + rich agentsMdContent style from personas/persona-maker.json.

  • id: persona-discoverer
  • intent: persona-discovery
  • tags: ["discovery", "planning"]
  • description: one or two plain sentences — what it does, when to use it.
  • tiers:
    • best: codex / gpt-5.3-codex / reasoning high / timeoutSeconds: 1200 / sandboxMode: workspace-write / workspaceWriteNetworkAccess: true
    • best-value: opencode / gpt-5-nano / reasoning medium / timeoutSeconds: 900
    • minimum: opencode / minimax-m2.5-free / reasoning low / timeoutSeconds: 600
    • Each tier's systemPrompt: "$TASK_DESCRIPTION" (matches persona-maker — the heavy spec lives in agentsMdContent).
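  • inputs: ANALYSIS_INPUT_PATH, PROPOSALS_OUTPUT_PATH, TARGET_DIR (the same three variables the smoke test under Tasks sets).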
  • agentsMdContent: the operating spec (see below).
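
As a rough illustration, the shape above could be written out as a TypeScript object. This is a sketch only: the field names harness, model, reasoning, and the array form of inputs are assumptions made for illustration, the description wording is a placeholder, and fields the list above does not mention (such as skills) are left out. The real PersonaSpec interface in workload-router/src/index.ts is authoritative.

```typescript
// Sketch of the persona-discoverer spec. Field names marked "assumed" are
// hypothetical; check them against the real PersonaSpec before copying.
type TierName = "best" | "best-value" | "minimum";

interface TierSketch {
  harness: string; // assumed name for "codex" / "opencode"
  model: string; // assumed name for the model id
  reasoning: "low" | "medium" | "high"; // assumed name
  timeoutSeconds: number;
  systemPrompt: string;
  sandboxMode?: string;
  workspaceWriteNetworkAccess?: boolean;
}

interface PersonaSketch {
  id: string;
  intent: string;
  tags: string[];
  description: string;
  tiers: Record<TierName, TierSketch>;
  inputs: string[]; // assumed representation of the declared inputs
  agentsMdContent: string;
}

// Every tier shares the same sparse prompt; the heavy spec lives in agentsMdContent.
const SYSTEM_PROMPT = "$TASK_DESCRIPTION";

const personaDiscoverer: PersonaSketch = {
  id: "persona-discoverer",
  intent: "persona-discovery",
  tags: ["discovery", "planning"],
  // Placeholder wording, not the final description.
  description:
    "Clusters a repo's recent work into starter persona proposals. " +
    "Use it when bootstrapping personas for an existing repo.",
  tiers: {
    best: {
      harness: "codex",
      model: "gpt-5.3-codex",
      reasoning: "high",
      timeoutSeconds: 1200,
      systemPrompt: SYSTEM_PROMPT,
      sandboxMode: "workspace-write",
      workspaceWriteNetworkAccess: true,
    },
    "best-value": {
      harness: "opencode",
      model: "gpt-5-nano",
      reasoning: "medium",
      timeoutSeconds: 900,
      systemPrompt: SYSTEM_PROMPT,
    },
    minimum: {
      harness: "opencode",
      model: "minimax-m2.5-free",
      reasoning: "low",
      timeoutSeconds: 600,
      systemPrompt: SYSTEM_PROMPT,
    },
  },
  inputs: ["ANALYSIS_INPUT_PATH", "PROPOSALS_OUTPUT_PATH", "TARGET_DIR"],
  agentsMdContent: "<operating spec goes here>",
};
```

Keeping the prompt in one constant makes it harder for the three tiers to drift apart, which fits the model-agnostic constraint further down.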

Output contract (what the persona writes to PROPOSALS_OUTPUT_PATH)

{
  "analysisInputPath": "<abs>",
  "proposals": [
    {
      "id": "kebab-case-id",
      "summary": "<= 80 chars, one line",
      "rationale": "Paragraph citing concrete signal from the analysis input: specific commits, files, PRs, sessions. No marketing.",
      "persona": { /* full PersonaSpec matching workload-router/src/index.ts */ }
    }
  ]
}

Each persona must validate against the existing PersonaSpec interface — id, intent, tags, description, skills, tiers.{best,best-value,minimum}, optional mount / permissions / inputs. If the persona declares skills, run npx skills find <kw> first (per the persona-maker spec) and only include skills that actually exist.
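
For the smoke test it may help to have a small structural check of the contract before running --dry-run on each persona. The sketch below is not the real validator: validateProposalsFile and its constants are hypothetical names, the persona is treated as an opaque record rather than the real PersonaSpec, and "<abs>" is assumed to mean a POSIX absolute path.

```typescript
// Hypothetical structural check for the proposals file. It covers only what
// the contract above states; PersonaSpec validation proper stays with the CLI.
interface Proposal {
  id: string;
  summary: string;
  rationale: string;
  persona: Record<string, unknown>;
}

interface ProposalsFile {
  analysisInputPath: string;
  proposals: Proposal[];
}

const KEBAB_CASE = /^[a-z0-9]+(-[a-z0-9]+)*$/;
const REQUIRED_PERSONA_KEYS = ["id", "intent", "tags", "description", "skills", "tiers"];
const REQUIRED_TIERS = ["best", "best-value", "minimum"];

function validateProposalsFile(file: ProposalsFile): string[] {
  const errors: string[] = [];

  // Assumption: "<abs>" means a POSIX-style absolute path.
  if (file.analysisInputPath.charAt(0) !== "/") {
    errors.push("analysisInputPath must be absolute");
  }

  file.proposals.forEach((p, i) => {
    const where = `proposals[${i}]`;
    if (!KEBAB_CASE.test(p.id)) {
      errors.push(`${where}.id is not kebab-case`);
    }
    if (p.summary.length > 80 || p.summary.indexOf("\n") !== -1) {
      errors.push(`${where}.summary must be one line of at most 80 chars`);
    }
    if (p.rationale.trim() === "") {
      errors.push(`${where}.rationale is empty`);
    }
    for (const key of REQUIRED_PERSONA_KEYS) {
      if (!(key in p.persona)) {
        errors.push(`${where}.persona is missing "${key}"`);
      }
    }
    // Only inspect tiers when it is present and object-shaped.
    const tiers = p.persona["tiers"];
    if (typeof tiers === "object" && tiers !== null) {
      for (const tier of REQUIRED_TIERS) {
        if (!(tier in tiers)) {
          errors.push(`${where}.persona.tiers is missing "${tier}"`);
        }
      }
    }
  });

  return errors;
}
```

An empty array means the file is structurally plausible. It does not replace the --dry-run check, which exercises the real interface.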

agentsMdContent outline

The spec should cover:

  • Read the analysis input. Specifically: walk commits, hotFiles, prs, codebase, sessions. Quote concrete signal (sha prefixes, file paths, PR numbers) in rationale.
  • Cluster shape. Name clusters by the work, not the code area. Good: "Database migration writer". Bad: "Person who touches src/db/".
  • Cluster count. 3–7. Fewer if the repo is small; more if there's clear separation. Don't pad to hit a number.
  • Conflict avoidance. Read TARGET_DIR for existing persona files; don't reuse those ids.
  • Skill curation. Run npx skills find <kw> for any skill before declaring it. Drop trivial single-flag CLIs.
  • Tier defaults. Codex@best, opencode@best-value, opencode@minimum — same shape as persona-maker. Override per-persona only when the work genuinely benefits.
  • Mount discipline. Use mount.readonlyPatterns to scope each persona to the directory cluster the signal pointed to — don't grant universal read/write.
  • Output discipline. Write the proposals JSON to PROPOSALS_OUTPUT_PATH and exit. Do not print proposals to stdout. Do not write any other files.
  • Anti-goals. Don't propose duplicates of persona-maker, persona-improver, or persona-discoverer. Don't propose meta/management personas. Don't draft personas the signal doesn't actually support.
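
The conflict-avoidance and anti-goal rules above are instructions to the persona, not code. The same checks are easy to run mechanically in a test harness, though. The sketch below is one possible way to do that: checkProposalIds and RESERVED_IDS are hypothetical names, and existingIds is assumed to have been collected from the persona files already under TARGET_DIR.

```typescript
// Hypothetical post-hoc check on proposed ids. The persona itself is expected
// to follow these rules; this only verifies the outcome.
const RESERVED_IDS = new Set<string>([
  "persona-maker",
  "persona-improver",
  "persona-discoverer",
]);

interface IdCheckResult {
  accepted: string[];
  rejected: { id: string; reason: string }[];
}

function checkProposalIds(proposedIds: string[], existingIds: string[]): IdCheckResult {
  const taken = new Set<string>(existingIds);
  const seen = new Set<string>();
  const result: IdCheckResult = { accepted: [], rejected: [] };

  for (const id of proposedIds) {
    if (RESERVED_IDS.has(id)) {
      result.rejected.push({ id, reason: "duplicates a built-in persona" });
    } else if (taken.has(id)) {
      result.rejected.push({ id, reason: "id already exists in TARGET_DIR" });
    } else if (seen.has(id)) {
      result.rejected.push({ id, reason: "repeated within the proposals file" });
    } else {
      seen.add(id);
      result.accepted.push(id);
    }
  }
  return result;
}
```

For example, checkProposalIds(["db-migration-writer", "persona-maker"], []) accepts the first id and rejects the second as a built-in duplicate.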

Tasks

  • Add 'persona-discovery' to PERSONA_INTENTS in workload-router/src/index.ts.
  • Author personas/persona-discoverer.json with the shape above. Cross-check against personas/persona-maker.json and personas/persona-improver.json for consistency.
  • Add a routing rule for persona-discovery in packages/workload-router/routing-profiles/default.json (mirror persona-improvement).
  • Run corepack pnpm --filter @agentworkforce/workload-router run dev once and confirm packages/workload-router/src/generated/personas.ts picks up the new built-in.
  • Validate the persona: npm run dev:cli -- agent persona-discoverer@best --dry-run — must pass.
  • Smoke test: write a minimal canned analysis JSON to /tmp/canned.json (2–3 fake commits, 1 PR, 1 package), invoke the persona headless with ANALYSIS_INPUT_PATH=/tmp/canned.json PROPOSALS_OUTPUT_PATH=/tmp/proposals.json TARGET_DIR=/tmp/personas, and verify the output file is valid JSON matching the contract above and that each proposed persona's JSON passes agentworkforce agent <written-persona>@best-value --dry-run.
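
A canned input for the smoke test might look like the sketch below. The top-level keys (commits, hotFiles, prs, codebase, sessions) come from the agentsMdContent outline; everything beneath them is an assumption, since the real schema is defined by #75, and all values are deliberately fake.

```typescript
// Hypothetical canned analysis input: 2 fake commits, 1 PR, 1 package.
// Nested field names are illustrative only; #75 defines the real schema.
const cannedAnalysis = {
  commits: [
    {
      sha: "a1b2c3d",
      message: "Add users table migration",
      files: ["src/db/migrations/001_users.sql"],
    },
    {
      sha: "d4e5f6a",
      message: "Add index on users.email",
      files: ["src/db/migrations/002_users_email_idx.sql"],
    },
  ],
  hotFiles: [{ path: "src/db/migrations", changes: 2 }],
  prs: [{ number: 1, title: "Users schema" }],
  codebase: { packages: [{ name: "example-app", path: "." }] },
  sessions: [] as { id: string }[],
};

// Serialized form, ready to be written to /tmp/canned.json.
const cannedJson = JSON.stringify(cannedAnalysis, null, 2);
```

Keeping the fixture this small makes it easy to confirm that each rationale in the output quotes signal that is actually present in the input.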

Verification

  • corepack pnpm -r build clean.
  • corepack pnpm run check clean.
  • agentworkforce list shows persona-discoverer after the regenerated catalog ships.
  • agentworkforce show persona-discoverer@best prints the persona without warnings.
  • Every persona in the smoke-test proposals JSON, once extracted and written under TARGET_DIR, passes --dry-run cleanly.

Constraints

  • Built-in persona, not a persona pack. It lives under /personas/ like persona-maker and persona-improver and ships with the CLI; not under packages/personas-core/.
  • Model-agnostic prompt. agentsMdContent must not name specific models or hardcode tier identities — the same prompt drives all three tiers.
  • No new dependencies.
  • Don't write outside PROPOSALS_OUTPUT_PATH. The persona may read TARGET_DIR to avoid id conflicts but must not write to it; writing personas to disk is #77's job.
