Skip to content

deepakkamboj/pwagent

Repository files navigation

  ██████╗  ██╗    ██╗  █████╗   ██████╗ ███████╗███╗   ██╗████████╗
  ██╔══██╗ ██║    ██║ ██╔══██╗ ██╔════╝ ██╔════╝████╗  ██║╚══██╔══╝
  ██████╔╝ ██║ █╗ ██║ ███████║ ██║  ███╗█████╗  ██╔██╗ ██║   ██║
  ██╔═══╝  ██║███╗██║ ██╔══██║ ██║   ██║██╔══╝  ██║╚██╗██║   ██║
  ██║      ╚███╔███╔╝ ██║  ██║ ╚██████╔╝███████╗██║ ╚████║   ██║
  ╚═╝       ╚══╝╚══╝  ╚═╝  ╚═╝  ╚═════╝ ╚══════╝╚═╝  ╚═══╝   ╚═╝

  Multi-agent Playwright testing — Squad design, GitHub Copilot SDK runtime.
  cli (engine) · portal (dashboard) · scheduler (@bradygaster/squad-scheduler)

─────────────────────────────────────────────────────────────────

pwagent

pwagent = pw + agent — short for "playwright agent", mirroring the pw two-letter prefix used by Playwright's own packages.

Standalone CLI for multi-agent Playwright testing. Squad design, self-contained runtime, GitHub Copilot SDK.

Node TypeScript GitHub Copilot SDK Next.js Tailwind shadcn/ui Playwright Vitest Squad Status npm workspaces License

One install. One mental model. One config. Built like playwright and gh copilot — a single binary that carries its own agent runtime, scheduler, and provider client.


Table of contents


Setup

One-line install (Windows PowerShell)

iex "& { $(irm https://raw.githubusercontent.com/deepakkamboj/pwagent/main/install.ps1) }"

This clones the repo to ~/.pwagent/src, builds, and links pwagent globally. Requires Node.js 22+ and git — install them first if missing:

winget install OpenJS.NodeJS.LTS Git.Git

Manual install (dev / portal)

cd D:\gith\pwagent
npm install            # installs both cli/ and portal/ deps in one shot
npm run build          # builds both packages
npm link --workspace cli   # makes `pwagent` globally available from this checkout

The root package.json declares workspaces: ["cli", "portal"], so a single npm install at the root hoists shared deps and links the two packages. Per-package commands:

npm run build:cli      # cli only
npm run build:portal   # portal only
npm test               # cli vitest suite
npm run dev:portal     # portal in dev mode at http://127.0.0.1:7337

2. Copy the sample config

copy pwagent.config.example.json "$env:USERPROFILE\.pwagent\config.json"
notepad "$env:USERPROFILE\.pwagent\config.json"   # edit your ADO org + project

User state lives at ~/.pwagent/ (Windows: C:\Users\<you>\.pwagent\).

3. Verify prerequisites and install missing ones

pwagent prereqs                     # report only
pwagent prereqs --install           # install missing recommended (interactive)
pwagent prereqs --install --yes     # non-interactive — accept all

pwagent is a coordinator — most heavy lifting happens through other CLIs we shell out to. The full prereq matrix:

Tier Prereq Why pwagent needs it Auto-install
required node (≥22) Runtime for the binary itself manual (suggestion: nvm install 22 / winget install OpenJS.NodeJS.LTS)
required git Repo ops, patches, branching, PR prep winget / brew / apt / dnf / pacman
required gh Copilot SDK auth. Also: PR creation, Issues, repo discovery winget / brew / apt
required gh auth (logged in) Copilot subscription must be active runs gh auth login --web
required az ADO triage + PR creation; Kusto auth winget / brew / apt
required az pipelines extension Pipeline run details for triage az extension add --name azure-devops
required @axe-core/cli Accessibility scans (a11y verifier) npm i -g @axe-core/cli
required kusto CLI kusto skill, flake history manual (aka.ms/kustofree)
recommended @playwright/test validator / fixer / author npm i -g @playwright/test
recommended Playwright browsers headless Chromium / Firefox / WebKit npx playwright install
optional VS Code Only needed for the @pwagent chat wrapper winget / brew

Package-manager detection:

Platform Detected by Falls back to
Windows winget --versionwinget install manual link
macOS brew --versionbrew install manual link
Debian/Ubuntu apt -v + sudo available → sudo apt install manual link
Fedora/RHEL dnf --version + sudo available → sudo dnf install manual link
Arch pacman --version + sudo available → sudo pacman -S manual link
Node-globals always npm install -g n/a
gh extensions always gh extension install n/a

Safety rules:

  • No silent installs. Default is pwagent prereqs (report only). --install requires the flag.
  • No sudo without explicit consent. Linux installs show the exact sudo line and wait for confirmation.
  • Network-only when needed. Report mode makes no network calls — only which / --version probes.

4. Authenticate with GitHub

pwagent login           # wraps `gh auth login --web`
pwagent whoami          # verify Copilot is reachable

5. Initialise config

pwagent init            # interactive: model, ADO org/project, default repo
pwagent doctor          # composed prereq + config + provider + features view

Expected pwagent doctor output:

binary version    0.1.0
charters          10 (embedded)
skills            64 (embedded)
config            C:\Users\<you>\.pwagent\config.json    OK
provider          github-copilot-sdk (claude-sonnet-4.5)
copilot probe     [✓] Copilot SDK reachable (812ms)
prerequisites
  required        node (≥22) ✓ · git ✓ · gh ✓ · gh auth (logged in) ✓ · az ✓ · az pipelines ext ✓ · @axe-core/cli ✓ · kusto CLI ✓
  recommended     playwright ✓ · playwright browsers ✓
  optional        VS Code ✓
features
  test execution  available
  ADO triage      available
  ADO PRs         available
  GitHub PRs      available
  GitHub Issues   available
  a11y verify     available
  flake finder    available
  chat wrapper    available
scheduler         not running    (run: pwagent scheduler start)
Ready.

pwagent doctor --fix is an alias for pwagent prereqs --install --yes followed by re-verification.

6. (Optional) Run the portal

cd portal
npm install
npm run dev             # http://127.0.0.1:7337

What it is

pwagent is a multi-agent system for Playwright testing: triage failures, patch tests or product code, validate the fix, open the PR — all driven by Markdown charters that describe each specialist agent. It runs entirely on GitHub Copilot via @github/copilot-sdk — Microsoft-internal Copilot license powers the model calls; no external API keys.

It adopts every design pattern from Brady Gaster's Squad (charters, routing, reviewer gates, ceremonies, parallel-by-default, Scribe + Ralph) but ships its own runtime — no Copilot CLI host process, no squad.agent.md token tax per session. The 74 KB upstream coordinator manifest is compiled into the binary. Workspace overrides read from .pwagent/ first (our convention), falling back to .squad/ (Squad-scaffolded) for full upstream interop.

The current roster has 13 specialist agents:

Agent What it does
supervisor Top-level router — consults routing.md to pick the right specialist
discover Find failing tests from local runs, ADO, GitHub Actions, or Kusto; optional CI daemon (--watch)
triage Classify a failure: ProductBug / TestCodeBug / Environment / Inconclusive
analyze Read-only analyzer — coverage gaps (--scenarios), flake ranking (--flakes), test quality (--test-quality)
review HITL gate — operator stamps [p] / [t] / [s] / [o]
plan Build an ordered fix plan from failures.json or a scenario-gap report
fix Patcher (atomic --scope test|product) + full orchestrator (--orchestrate)
validate Run a test twice via npx playwright test; or axe-core before/after delta (--a11y)
publish Open PRs via ADO REST or gh pr create — never auto-merges
author New-test writer with 7-day probation window
auth Auth-flow specialist (storage state, multi-role, login retries)
record Canonical state writer — traceability matrix (--kind matrix) and fix patterns (--kind patterns)
report Weekly + ad-hoc reports (Markdown + HTML)

The monorepo (root package.json) ships three independent packages under npm workspaces, plus a docs site:

Package Path What it is
@pwagent/cli cli/ The standalone CLI binary and agent runtime
@pwagent/portal portal/ Local Next.js dashboard (port 7337) — Tailwind + shadcn/ui
@pwagent/docs docs/ End-to-end professional documentation (port 7338) — Nextra + Next.js

Removing any layer leaves the others working — three independent layers plus a static docs site.


Where pwagent runs (IDE / CLI compatibility)

pwagent is a plain Node CLI. Anything that can run a binary can run pwagent — there is no IDE plugin, no VS Code extension, no Copilot extension to install. We keep it that way to avoid lock-in.

Surface Status How it works
Standalone terminal (PowerShell, bash, zsh, Windows Terminal) ✓ first-class pwagent <args> from any shell
Claude Code CLI ✓ shell-out Run pwagent run triage --run-id X from inside a Claude Code session — Claude's bash tool invokes it like any other CLI. pwagent does not depend on Claude Code
Claude Code VS Code extension ✓ shell-out Extension's integrated terminal runs pwagent directly. No special integration needed
GitHub Copilot CLI ✓ shell-out Copilot CLI's shell can invoke pwagent. Independent: pwagent uses @github/copilot-sdk (the SDK, not the CLI) for its own model calls — the two coexist without conflict
GitHub Copilot VS Code extension ✓ shell-out The extension's integrated terminal runs pwagent like any other binary. The extension itself doesn't drive pwagent — they're peers, not nested
VS Code (any) ✓ tasks + terminal Add pwagent run commands to .vscode/tasks.json, or invoke from the integrated terminal
Cursor, Windsurf, JetBrains, other forks ✓ shell-out Same as VS Code — pwagent is a regular CLI
CI runners (GitHub Actions, ADO Pipelines) ✓ first-class pwagent runs on any CI runner with Node 22 + gh auth configured. Use GH_TOKEN env var for non-interactive auth

A thin VS Code chat wrapper is on the roadmap (a ~30-line shim that surfaces pwagent inside the chat panel) but is not part of v0.4. The CLI works fully without it.


Documentation site

@pwagent/docs is a Nextra-powered static documentation site that lives in the docs/ workspace and runs on port 7338 (the portal is 7337, so docs sits right next to it).

npm run dev:docs           # http://127.0.0.1:7338  (live reload)
npm run build:docs         # static build
npm run start:docs         # production server on 7338

The site covers 54 routes end-to-end:

  • Home (/), Getting Started (5 pages), Architecture (7 pages with Mermaid sequence diagrams)
  • Agents (one page per specialist — supervisor, triage, heal, generate, plan, scenario, report, validate, auth, review)
  • Skills, CLI Reference, Portal (6 pages — routes, Server Actions, SSE, auth, read-only mode, help link)
  • Scheduler, Configuration, Squad Design
  • Operations (5 pages — audit log, HITL review, ralph, service installer, troubleshooting)
  • Contributing, FAQ

The portal links to the docs from three places:

  1. Sidebar — a Help entry at the bottom (above the Collapse button) with an external-link indicator
  2. Header — a ? icon next to the bell/search buttons
  3. Footer — the docs link points at 127.0.0.1:7338 (can be overridden via config.portal.helpUrl for production deployments)

All three resolve the docs URL dynamically from window.location.hostname:7338, so the link works whether you access the portal at localhost, 127.0.0.1, or a network IP via --bind-all.


Design rationale

"Make pwagent work like playwright." Install it once with npm i -g @pwagent/cli. It carries its own agent runtime, its own scheduler, its own model client. It does not depend on gh copilot, does not require VS Code, does not install a Copilot plugin. If a workspace has a .pwagent/ (or .squad/) directory, pwagent picks up the charters and skills there as overrides — otherwise it uses the ones baked into the binary. The same binary runs interactively, runs unattended via the scheduler, and (optionally) backs a 30-line VS Code chat wrapper.

The team flagged complexity twice during design. This shape cuts to:

  • 1 binary the user installs.
  • 1 state directory (~/.pwagent/).
  • 1 config file.
  • 1 update path (npm update -g @pwagent/cli).

Everything else — chat surface, Copilot plugin shim, GitHub Actions integration — becomes a thin wrapper over the binary, written only if a user actually asks for it.

Reference tools the design copies

Reference tool What we copy
playwright One binary; sub-commands (test, codegen, show-trace); built-in runner; works anywhere Node runs; per-project config picked up if present
gh copilot One binary; talks to a model gateway; single gh auth step
gh itself Extensions via gh extension install; consistent UX across sub-commands
npx <pkg> Zero-install on-the-fly invocation as a fallback

What we win

  • All eleven Squad design benefits retained (see Squad principles (adopted)): charter-as-code, routing, reviewer gates, ceremonies, parallel-by-default spawn, response-mode selection, skill-aware spawn, append-only memory, casting, GitHub integration, Scribe + Ralph. Same filesystem layout, same mental model — different runtime.
  • One install, one mental model, one update path.
  • Works on any machine the team uses — corporate laptops, CI runners, air-gapped boxes, locked-down VDIs — provided gh auth is configured.
  • Scheduler is not a separate thing. Same process, same config, same logs.
  • VS Code becomes optional. No extension build required for v1.
  • Per-agent model choice. Pin Opus 4.7 for triage, Haiku 4.5 for the learner — declared in the charter's ## Model block, overridable per-invocation with --model.
  • No 74 KB coordinator manifest tax per session. The coordinator is compiled into the binary, not re-prompted on every turn.
  • Removable. npm uninstall -g @pwagent/cli && rm -rf ~/.pwagent/ — gone. No orphaned services, no extension leftovers, no Copilot plugin still registered.

Architecture

High-level components

flowchart TB
    User([Developer / CI runner])

    subgraph pwagent_CLI ["pwagent — single Node binary"]
        Cmd[Sub-command router<br/>commander.js]
        SquadHost[squad-host.ts<br/>scaffolds .pwagent/ → .squad/]

        subgraph Content ["Embedded content (baked in)"]
            Charters[13 charters<br/>agents/&lt;name&gt;/charter.md]
            Skills[60+ skills<br/>core · ci · pom · playwright-cli · kusto · ado · a11y]
            Routing[routing.md]
            Ceremonies[ceremonies.md]
            Team[team.md]
            Master[master-prompt.md]
        end

        subgraph CILoop ["pwagent run — CI/unattended path"]
            Coord[Coordinator<br/>routing + gates + ceremonies]
            Runtime[Agent runtime<br/>system prompt + tool loop]
            Tools[Tool sandbox<br/>read · write · edit · bash · grep]
            Provider[Copilot SDK adapter]
        end

        Sched[Scheduler<br/>in-process tick loop]
    end

    subgraph ChatChain ["Chat — pwagent (no args, TTY)"]
        SquadCLI["@bradygaster/squad-cli<br/>Ink TUI"]
        CopilotCLI[Squad shell<br/>banner · @ routing · suggestion box · streaming]
    end

    Workspace[(Workspace root<br/>.pwagent/  ← canonical<br/>.squad/    ← auto mirror)]
    Portal["@pwagent/portal<br/>(Next.js 15, port 7337)"]
    StateDir[("~/.pwagent/<br/>config.json · scheduler/ · logs/ · audit/")]

    User -->|"pwagent (no args, TTY)"| SquadHost
    User -->|"pwagent run <agent>"| Cmd
    User -->|browser| Portal

    SquadHost -->|"scaffold + mirror"| Workspace
    SquadHost -->|"spawn"| SquadCLI
    SquadCLI -->|"reads .squad/"| Workspace
    SquadCLI -->|"Ink TUI"| CopilotCLI

    Cmd --> Coord
    Cmd --> Sched
    Coord --> Runtime
    Coord --> Charters
    Runtime --> Skills
    Runtime --> Routing
    Runtime --> Ceremonies
    Runtime --> Master
    Runtime --> Tools
    Runtime --> Provider
    Sched --> Coord

    Provider --> CopilotSDK[("@github/copilot-sdk")]
    CopilotSDK -->|gh auth| GitHub

    Cmd <--> StateDir
    Sched <--> StateDir
    Portal --> StateDir
    Portal --> Charters
Loading

Two daily-driver flows, one shared content base:

  • Chat (interactive)pwagent spawns @bradygaster/squad-cli, which renders an Ink TUI (banner, agent roster, @agent routing, suggestion box, slash commands). .pwagent/.squad/ mirror feeds Squad at startup.
  • CI (unattended)pwagent run <agent> uses our own coordinator + @github/copilot-sdk directly; no Squad dependency on CI runners.

Same 13 agents, same skills, same routing — different invocation surfaces.

Hard rules

  • Squad design, our runtime. We adopt the eleven Squad benefits verbatim as filesystem conventions and runtime behaviours. We do not load the upstream Squad coordinator manifest at run time — the coordinator logic is a module inside pwagent.
  • Embedded by default. All 13 charters and 60+ skill guides ship inside the binary. Works zero-config in any directory.
  • Workspace overrides win. If cwd/.pwagent/agents/triage/charter.md exists, it overrides the embedded triage charter for that invocation. Source-controlled customisation without forking the binary.
  • Scheduler is a sub-command, not a separate daemon. Same logs, same config, same process.
  • Three independent layers. CLI, portal, and scheduler each function alone; removing any one leaves the others working.

Charter / skill resolution order

1. embedded   →  <dist>/content/agents/<name>/charter.md          (shipped in the binary)
2. user       →  ~/.pwagent/agents/<name>.md                       (per-machine override)
3. workspace  →  <cwd>/.squad/agents/<name>/charter.md             (Squad-scaffolded — fallback)
4. workspace  →  <cwd>/.pwagent/agents/<name>/charter.md           (preferred convention — wins)

Same chain for skills under */skills/. Loaders are in cli/src/charters/loader.ts and cli/src/skills/loader.ts.


What Squad is (brief primer)

Skip this section if you already know Squad. Otherwise: Squad is a multi-agent orchestration framework by Brady Gaster (github.com/bradygaster/squad) that runs inside GitHub Copilot CLI. It turns a single Copilot session into a team of specialised AI agents that live as Markdown files in your repository.

In one sentence:

Squad is a coordinator prompt + a filesystem convention (.squad/) + a CLI (@bradygaster/squad-cli) that lets you describe an AI team in version-controlled Markdown and then route work to its members through a Copilot CLI session.

What Squad is, mechanically

Layer What it is Where it lives
Coordinator prompt One large agent manifest (squad.agent.md) that Copilot CLI loads as the "Squad" agent .github/agents/squad.agent.md
Filesystem state A .squad/ directory of Markdown / JSON files describing the team, routing, decisions, ceremonies, casting, skills .squad/
Per-agent charters One Markdown file per agent describing identity, responsibilities, boundaries .squad/agents/<name>/charter.md
CLI @bradygaster/squad-cli for init, watch, and migration commands npm global install
GitHub workflows Optional Actions that mirror .squad/team.md into labels and auto-triage issues .github/workflows/squad-*.yml

What Squad is NOT

  • NOT a scheduler. Squad has no cron, no daemon, no in-process timer.
  • NOT a code framework. There is no SDK, no API, no library to import. The "code" is Markdown.
  • NOT a hosted service. Everything runs locally inside your Copilot CLI session.
  • NOT a runtime by itself. Without Copilot CLI it's inert documentation.

pwagent keeps Squad's design and Markdown file formats but replaces Copilot CLI with its own embedded runtime. That is what makes it independently installable and CI/air-gap-capable on day one.


Squad principles (adopted)

pwagent is Squad's design with our runtime. Each of the eleven Squad benefits ports into the binary as a first-class module or filesystem convention.

# Squad benefit How pwagent provides it
1 Charter-as-code — one Markdown file per agent with stable Identity / Responsibilities / Boundaries / Tools / Model sections Same format, same paths. .pwagent/agents/<name>/charter.md (or .squad/...) either embedded or overridden. pwagent agents show <name> renders one.
2 Routing instead of prompt-stuffing ~/.pwagent/routing.md (or workspace) — coordinator consults the table on every user utterance.
3 Reviewer gates — declarative QA A gates: table in routing.md says which artefacts require which reviewer. Runtime refuses to spawn heal without a triage stamp.
4 Persistent identity via casting ~/.pwagent/casting/registry.json — opt-in. Off by default (keeps admin-tool searchability).
5 Append-only memory with merge=union Same .gitattributes snippet. decisions.md, agents/*/history.md, log/, orchestration-log/ all concat on merge.
6 Ceremonies — auto-run agendas ceremonies.md declares them; the coordinator runs the agenda when the condition matches.
7 Parallel-by-default execution Coordinator spawner takes independent sub-tasks and dispatches them concurrently via Promise.all against the Copilot SDK.
8 Response Mode Selection — Direct / Lightweight / Standard / Full pwagent run --mode=light (or coordinator-chosen). Direct skips spawn; Full runs ceremonies + Scribe.
9 Skills with confidence lifecycle Skill-aware spawn injects read .pwagent/skills/<x>.md before starting into the spawn prompt. Confidence in ~/.pwagent/skills/<name>/.confidence.
10 GitHub integration The four .github/workflows/squad-*.yml workflows from upstream Squad still parse ## Members literally from team.md.
11 Free Scribe + Ralph Built into the binary. Scribe writes to .pwagent/log/. Ralph is pwagent ralph go / status / stop.

Response Mode Selection (cost / latency tuning)

For every user turn the coordinator picks a mode:

Mode When Target latency Spawn
Direct Status checks, factual answers from context ~2-3s None
Lightweight Single-file edits, small fixes ~8-12s 1 agent, minimal prompt
Standard Normal tasks, full ceremony ~25-35s 1 agent, full context
Full Multi-agent, "Team" requests ~40-60s Parallel fan-out + Scribe

Three layers of "keep working"

Squad describes three ways an agent team keeps moving without a user typing. pwagent ships equivalents:

Layer Squad source pwagent equivalent
L1 — In-session loop Ralph built into the coordinator manifest pwagent ralph go / status / stop
L2 — Local watchdog npx github:bradygaster/squad watch pwagent scheduler start (see Scheduler)
L3 — Cloud heartbeat .github/workflows/squad-heartbeat.yml Identical workflow, calls pwagent instead of opening a Copilot session

What we deliberately don't adopt

Squad piece Why we drop it
The 74 KB upstream squad.agent.md coordinator manifest Embedded as compiled logic inside the binary. No per-turn token cost. Tracked as a reference only — never run as a prompt.
@bradygaster/squad-cli as a runtime dependency pwagent init replaces its init/watch/migration commands. We still recognise the .squad/ filesystem format Squad produces so workspaces scaffolded by npx @bradygaster/squad-cli init work without modification.
Squad's @copilot coding-agent roster member Out of scope for v1. If needed later, a charter like any other.

Technology stack

Layer Choice Why
Runtime Node 22+ Required by @github/copilot-sdk (uses built-in node:sqlite)
Language TypeScript 5.7, strict Catches charter/config drift at build time
CLI framework commander Stable, low-deps, handles sub-commands well
Provider SDK @github/copilot-sdk v0.3 Microsoft-internal Copilot license; no API keys
Validation zod Config schema + safe set paths
Frontmatter gray-matter Reads charter name / description headers
Process spawn execa Used for prereq detection + install flow
Prompts prompts pwagent init interactive flow
Colours picocolors Tiny, no deps
Tests Vitest + isolated tmp PWAGENT_HOME Fast forks
Portal framework Next.js 15 App Router + React 19 SSR + RSC for streaming
Portal UI shadcn/ui on Tailwind 3.4 Hand-written components in portal/components/ui/
Icons lucide-react Consistent with shadcn ecosystem

Usage

Daily driver — pwagent opens the Squad chat shell

pwagent           # spawns @bradygaster/squad-cli (Ink TUI)
                  # with pwagent's 13 agents auto-loaded from .squad/

That's it. The Squad TUI — PWAGENT banner, agent roster (categorised, vertical), @agent routing, / slash-command menu with suggestion box, streaming responses.

Under the hood, pwagent (no args, TTY) does two things before handing off:

  1. Lazy-scaffold .pwagent/ from embedded charters (only if missing; never overwrites your customisations), then mirror to .squad/.
  2. Spawn @bradygaster/squad-cli with SQUAD_BRAND_* env vars so the TUI shows the PWAGENT identity.

Bootstrap (one time)

pwagent init [--yes]
pwagent login                       # gh auth login --web (Copilot SDK)
pwagent doctor                      # verify prereqs + auth + SDK reachability
pwagent prereqs --install --yes     # install missing prereqs (gh, az, axe, kusto)

These all also work as slash commands inside chat: /init, /login, /doctor, /doctor --fix.

Inspection

pwagent agents list
pwagent agents show <name>           # e.g. triage, fix, validate, supervisor
pwagent agents add <path>
pwagent skills list [--pack core|ci|pom|playwright-cli|kusto|ado|a11y]
pwagent skills show <pack>/<name>    # e.g. core/locators, ado

Inside chat: /agents, /skills, /help <agent>.

Config + model

pwagent config view | get <path> | set <path> <val> | path
pwagent model list | show | set <id> [--agent <agent>] | reset

Inside chat: /model <id>.

CI / unattended — pwagent run (not the daily-driver path)

pwagent run <agent> is kept for CI runners and scheduled jobs — same coordinator + SDK as chat, but headless. Use it from GitHub Actions, ADO Pipelines, the scheduler in ~/.pwagent/scheduler/, or any non-interactive context.

# CI mode — fix everything red without a human in the loop
pwagent run fix --orchestrate --ado-pipeline 23878 --auto-stamp --json

# Scheduler job spec
{
  "command": "pwagent run fix --orchestrate --ado-pipeline 23878 --max-failures 5 --auto-stamp",
  "schedule": { "type": "cron", "cron": "*/15 9-17 * * 1-5" }
}

Full flags:

pwagent run <agent> [prompt...] [--model <id>] [--mode direct|light|standard|full] \
                                 [--cwd <path>] [--dry-run] [--json] [--debug] \
                                 [--connect-timeout-s <n>] [--idle-timeout-s <n>]

For day-to-day human use, just type pwagent and use slash commands.

Other entry points

pwagent review                       # interactive HITL stamp loop
pwagent ralph go | status | stop     # in-session driver (Squad-style)

# scheduler — powered by @bradygaster/squad-scheduler; config in squad.schedule.json
pwagent scheduler start              # start the scheduler (reads squad.schedule.json)
pwagent scheduler stop               # signal a running scheduler to stop
pwagent scheduler list               # list jobs + next fire time
pwagent scheduler status [<id>]      # overall status or detail for one job
pwagent scheduler logs <id>          # tail JSONL event log for a job

# portal — local Next.js dashboard
pwagent portal start [--dev] [--port <n>] [--read-only] [--bind-all]
pwagent portal status [--port <n>]

# audit
pwagent audit tail [-n <limit>]
pwagent audit export [--since 7d] [--type <t>] [--agent <a>] [--format jsonl|json|table] [-o <file>]

# service install (platform-native unattended scheduler)
pwagent service install | uninstall | status

Agents and their arguments

pwagent ships 13 specialist agents. Multi-purpose agents specialize via flags (fix --scope test|product, validate --test|--a11y, discover --watch, etc.) — fewer charters, sharper composition.

Agent Purpose Key arguments
supervisor Top-level router (default when no agent named) (none — invoked by pwagent run "<prompt>" without an agent)
discover Find failing tests; optional CI daemon --source local|ado|github|kusto · --pipeline <id> · --build <id> · --run-id <id> · --window 7d · --watch · --poll-seconds 300 · --max-dispatch 10 · --status · --stop
triage Classify failures (ProductBug / TestCodeBug / Environment / Inconclusive) --run-id <id> · --artifact <path> · --example (canned fixture)
analyze Read-only analyzer; three orthogonal modes --scenarios [--path <dir>] [--min-coverage N] [--fail-on-critical] · --flakes --pipeline <id> [--top N] [--window <dur>] [--format json|csv] · --test-quality --files <glob> [--severity-min Low|Medium|High|Critical] [--file-bug] [--pr-comment <pr-id>]
review HITL stamp gate; pause until human approves (interactive) · --list · --batch < stamps.txt
plan Build an ordered fix plan --failures <path> · --from-scenario · --from-triage <id>
fix Patcher (atomic) + orchestrator Scope (atomic): --scope test|product|auto · --plan <path> --test <name> · --from-triage <id> · --bug AB#<id> · --diff-only · --skip-gate (audited). Orchestrate (full chain): --orchestrate --ado-pipeline <id> · --orchestrate --ado-build <id> · --orchestrate --bug AB#<id> · --orchestrate --bugs --top N --area <path> · --max-failures 25 · --auto-stamp (audited) · --bundle-pr
validate Run something twice; report delta --test <file> [--repeat N] [--grep <pat>] [--project <name>] · --a11y --bug AB#<id> [--url <url>]
publish Open PR via REST (ADO) or gh (GitHub) --branch <name> · --target <branch> · --bug AB#<id> · --results <path> · --draft · --reviewer @user · --allow-large-pr
author New-test writer with 7-day probation --scenario "<text>" · --from-gap ScenarioGap-<id> · --coverage-gap <path> · --cwd <path>
auth Auth-flow specialist --add-role <name> · --refresh-state <role> · --diagnose --trace <path> · free-text
record Canonical-state writer (two kinds) Matrix: --kind matrix --op import|sync|link|query|decide|stamp|gap plus per-op flags (--bug-ids, --tests <glob>, --bug + --test, --verdict, --confidence, --rationale, --stamp p|t|s|o, --operator, --gap, --severity). Patterns: --kind patterns --from <fix-results.json>
report Weekly + ad-hoc reports --window 7d|30d · --since <date> --until <date> · --kind weekly|flake-rank|triage|hitl-audit|scenario-coverage|test-health|self-health · --commit (commits to repo)

Global flags that work on every agent invocation:

Flag Meaning
--model <id> Override the charter's preferred model for this call
--mode direct|light|standard|full Force a response mode (skip the coordinator's inference)
--cwd <path> Resolve workspace overrides from this directory
--dry-run Resolve charter + skills + tools, print system message, do not call the SDK
--json Stream JSON events to stdout instead of markdown
--skills <a,b,c> Replace skill-aware inference with this explicit set
--tool-timeout-s <n> Per-tool timeout (default 120)
--idle-timeout-s <n> SDK session idle timeout (default per mode)
--skip-gate Bypass reviewer gates (recorded in audit as gateSkipped: true)

The canonical chain — fix --orchestrate

discover (--source ado|github|local|kusto)
  → triage (parallel fan-out, one per failure)
    → review (HITL serial gate; skip with --auto-stamp)
      → plan
        → fix --scope <test|product>  (parallel fan-out, one per plan entry)
          → validate --test            (twice — gate two greens)
            → publish                  (one PR per group)
              → record --kind matrix   (link bug ↔ test ↔ verdict ↔ stamp)

validate --a11y runs alongside validate --test when accessibility is in scope. record --kind patterns runs after the PR merges. report runs on a schedule and reads the matrix + audit log.

Quick examples per agent

# Full chain — fix everything red in an ADO pipeline
pwagent run fix --orchestrate --ado-pipeline 23878 --auto-stamp

# Daemon — poll ADO + GitHub Actions, dispatch triage on new failures
pwagent run discover --watch --poll-seconds 300

# One-shot discover from Kusto
pwagent run discover --source kusto --pipeline 23878 --window 7d

# Atomic test-side fix from a stamped plan
pwagent run fix --scope test --plan ./fix-plan.json --test "login should redirect"

# Run a test twice
pwagent run validate --test tests/login.spec.ts --repeat 2

# axe-core before/after delta on an ADO bug
pwagent run validate --a11y --bug AB#54321

# Grade test code quality
pwagent run analyze --test-quality --files "tests/**/*.spec.ts" --severity-min High

# Top-10 flakes via Kusto
pwagent run analyze --flakes --pipeline 23878 --top 10 --window 30d

# Author a new test from a free-text scenario
pwagent run author --scenario "logged-in user applies a coupon and removes it"

# Import bugs into the traceability matrix
pwagent run record --kind matrix --op import --source ado --bug-ids 12345,12346

# Extract reusable patterns from a verified fix
pwagent run record --kind patterns --from ./fix-results.json

Full examples and end-to-end workflows live in USAGE.md.

Interactive chat — pwagent opens the Squad shell

pwagent (no args, TTY) spawns @bradygaster/squad-cli, which renders an Ink TUI — PWAGENT banner, categorised agent roster, @agent routing with suggestion box, / slash-command autocomplete, and streaming responses. We don't build a custom REPL: squad-cli is already one, and Squad wires our 13 agents into it.

pwagent                # opens Squad TUI

On first run in a workspace: pwagent lazy-scaffolds .pwagent/ from its embedded charters, then mirrors it to .squad/ (Squad reads that path):

.pwagent/                        ← canonical — edit here
├── agents/                      ← 13 charters (analyze, auth, author, discover, fix, ...)
├── skills/                      ← 60+ skill guides
├── routing.md
├── team.md
├── ceremonies.md
└── master-prompt.md

.squad/                          ← auto-generated mirror — add to .gitignore

Subsequent runs rebuild .squad/ fresh from .pwagent/ — your changes in .pwagent/ always propagate.

Inside the chat shell, type free text (the supervisor routes it) or address a specialist directly:

› fix everything red in pipeline 23878
› @pwagent-fix --orchestrate --ado-pipeline 23878
› @pwagent-triage --run-id 89211

Slash commands (/status, /agents, /history, /clear, /help, /quit) are built-in. Type / or @ to open the suggestion box.

For CI / scripted invocations, the pwagent run command stays — same coordinator, same SDK, headless:

pwagent run fix --orchestrate --ado-pipeline 23878 --auto-stamp --json

Coordinator runtime — what pwagent run actually does

  • Resolves charter (workspace > user > embedded).
  • Picks the model (charter ## Model > config.perAgent > default).
  • Filters tools from the charter's ## Tools block.
  • Injects skill references via keyword scoring (skill-aware spawn).
  • Prepends the master prompt (cli/src/content/master-prompt.md) which encodes the cross-cutting Squad rules (reviewer gates, response modes, parallel-by-default, formatting, audit). Equivalent to upstream Squad's squad.agent.md but compiled into the binary.
  • Calls @github/copilot-sdk (streaming, tool loop, idle timeout) and emits the result.

pwagent run without naming an agent (or via pwagent ralph go) spawns the supervisor, which consults routing.md to pick the right specialist. The routing.md parser lives at cli/src/runtime/routingTable.ts.

Tool sandbox

Tool Notes
read Read text files
write Write text files (creates dirs)
edit Exact-match string replacement
bash Binary allowlist: git, gh, az, npx, npm, node, pwsh, powershell, bash, sh, kusto.cli, axe. Output capped at 200 KB; default timeout 120s.
grep rg-aware (prefers ripgrep; falls back to grep)

Audit

JSONL append-only stream at ~/.pwagent/audit/events.jsonl. Lifecycle vocab: run.start, run.complete, run.error, tool.invoke, tool.error, review.stamp, scheduler.start, scheduler.stop.

Provider note — what we use, what we don't

pwagent runs only on @github/copilot-sdk (auth via gh). Charters declare model preferences (e.g. claude-sonnet-4.5, claude-haiku-4-5); per-invocation override is --model. No direct Anthropic / Bedrock / OpenAI SDK calls. No external API keys to manage.


Portal

The portal is a separate Next.js process. It is independent of the CLI — kill the portal and the CLI + scheduler keep working; kill the scheduler and the portal still shows historical state.

Why a separate portal

The CLI gives you imperative control. It is great for doing one thing. It is bad for seeing:

  • Tailing five jobs at once
  • Skimming yesterday's weekly report
  • Editing schedules without hand-rolling JSON
  • Cross-referencing a failed run to its triage verdict and the eventual PR

Goals and non-goals

Goal Why
Single URL to see everything pwagent is doing Operators want one place to look
Live log tail (per job, per agent) Debugging unattended runs
Edit scheduler configs through a form (not JSON) Lower barrier for non-engineers
Trigger one-off runs from a button Faster than shelling out
Render reports (Markdown + HTML) inline The weekly report is the audience-facing artifact
Audit-log viewer with filters Compliance, "what ran last Tuesday"
Non-goal Why not
Hosted multi-tenant SaaS Local-only by design
Replace the CLI for power users The CLI is faster for scripted work
Multi-user accounts / RBAC One user per machine; auth only as a loopback guard
Live edit charters Charters are source-controlled; UI is read-only for those

Architecture

flowchart TB
    Browser([Browser at http://127.0.0.1:7337])
    Browser --> Next[Next.js 15 App Router<br/>process: pwagent-portal]

    subgraph Next [Next.js portal &mdash; one process]
      Pages[Server components<br/>SSR pages]
      API[Route handlers<br/>/api/*]
      Watch[File watcher<br/>chokidar]
      SSE[Server-Sent Events bridge]
    end

    Pages --> FS1[(~/.pwagent/scheduler/*.json<br/>job specs)]
    Pages --> FS2[(~/.pwagent/logs/*.jsonl<br/>lifecycle events)]
    Pages --> FS3[(dist/content/agents/*/charter.md<br/>~/.pwagent/audit/*.jsonl)]
    Pages --> FS4[(reports/*.md, *.html<br/>weekly digests)]

    API -->|write| FS1
    API -->|spawn| CLI[pwagent run / pwagent scheduler ...]

    Watch --> FS1
    Watch --> FS2
    Watch --> SSE
    SSE --> Browser
Loading

Stack:

  • Next.js 15 App Router — file-based routing, server components, streaming.
  • Tailwind CSS + shadcn/ui for fast UI without designing from scratch.
  • Server Actions for form writes (enable/disable jobs, edit schedules).
  • Server-Sent Events (not WebSocket — simpler, fine for one-way log streaming).
  • No database. All state is files on disk. SSR reads them on each request; SSE pushes deltas.

Port allocation

Fixed default: http://127.0.0.1:7337. Memorable (7337 ≈ "leet"), outside common dev ports (3000/3001/8080), bound to 127.0.0.1 only.

pwagent portal start                          # http://127.0.0.1:7337
pwagent portal start --port 3737              # alternative
PWAGENT_PORTAL_PORT=9999 pwagent portal start

Routes

/                          Dashboard overview — active jobs, last 24h, HITL queue, latest report
/jobs                      Scheduler jobs table with enable/disable + new-job form
/jobs/[id]                 Job detail — spec viewer + live SSE event tail
/audit                     Audit log viewer with type / agent / time filters + export
/config                    Form editor for ~/.pwagent/config.json with diff preview + atomic save
/agents                    Charter browser (10 cards)
/skills                    Skill browser (60+ guides)
/playwright-cli            Playwright CLI command reference

Auth + safety (v0.4)

  • Bearer auth — per-install secret at ~/.pwagent/portal/secret (chmod 600). HttpOnly cookie pwagent_session; every Server Action calls requireWriteAuth().
  • Loopback enforcementportal/middleware.ts rejects non-loopback hosts + X-Forwarded-* headers. --bind-all flag opts out behind a trusted proxy.
  • --read-onlypwagent portal start --read-only sets PWAGENT_PORTAL_READ_ONLY=1; Server Actions short-circuit; banner shown across pages.

Scheduler

Powered by the standalone @bradygaster/squad-scheduler package. Jobs are declared in squad.schedule.json at the project root (not in ~/.pwagent/).

Job spec format

// squad.schedule.json
{
  "jobs": [
    {
      "id": "daily-triage",
      "name": "Daily triage",
      "description": "Run the triage agent on weekday mornings",
      "cron": "0 9 * * 1-5",
      "agent": "triage",
      "args": "--pipeline 23878",
      "enabled": true,
      "maxRunSeconds": 300,
      "retryOnFailure": true,
      "maxRetries": 2,
      "retryBackoffSeconds": 30,
      "disableAfterFailures": 5,
      "runOnStartup": false
    },
    {
      "id": "weekly-report",
      "name": "Weekly report",
      "description": "Generate weekly Markdown + HTML report",
      "cron": "0 17 * * 5",
      "command": "node scripts/report.mjs",
      "enabled": true,
      "maxRunSeconds": 600
    }
  ]
}
  • agent jobs run squad @<agent> <args> (e.g. agent: "triage"squad @triage --pipeline 23878).
  • command jobs run an arbitrary shell command.
  • cron uses standard 5-field syntax: minute hour dom month dow.

Sample jobs

Job id Cron Purpose
daily-triage 0 9 * * 1-5 Run triage agent on weekday mornings
hourly-flake-check 0 * * * * Scan for newly flaky tests every hour
weekly-report 0 17 * * 5 Generate weekly Markdown + HTML report

Tick loop guarantees

  • 5-second tick interval.
  • Per-job locks prevent overlapping runs.
  • Atomic ~/.pwagent/scheduler/state.json writes.
  • JSONL event stream at ~/.pwagent/scheduler/events/<id>.jsonl — event types: job_start, job_end, job_error, job_timeout, job_retry, job_auto_disabled, scheduler_start, scheduler_stop.
  • Hot-reloads squad.schedule.json on file change.
  • Retry with configurable backoff; auto-disable after disableAfterFailures consecutive failures.

Commands

pwagent scheduler start          # start the scheduler daemon
pwagent scheduler stop           # stop the scheduler daemon
pwagent scheduler list           # list all jobs and their next-run times
pwagent scheduler status [id]    # show overall status or details for one job
pwagent scheduler logs <id>      # tail the JSONL event log for a job

Platform service installer

pwagent service install registers the scheduler with the OS service manager:

Platform Mechanism
Windows Task Scheduler XML (schtasks /Create)
macOS launchd plist (launchctl load)
Linux systemd --user unit (systemctl --user enable --now)

pwagent service uninstall removes it. pwagent service status reports state.


Repository layout

pwagent/                          ← root: npm workspaces wrapper, private
├── package.json                  ← workspaces: ["cli", "portal"], top-level scripts
├── pwagent.config.example.json   ← sample config (copy to ~/.pwagent/config.json)
├── README.md
├── cli/                          ← @pwagent/cli (publishable)
│   ├── package.json              ← bin: pwagent
│   ├── tsconfig.json
│   ├── vitest.config.ts
│   ├── eslint.config.js
│   ├── scripts/copy-content.mjs  ← copies src/content → dist/content after tsc
│   ├── src/
│   │   ├── index.ts              ← CLI entrypoint (commander)
│   │   ├── cli/                  ← init, auth, doctor, prereqs, agents, skills, model, config, run, scheduler, portal, ralph, review, audit, service
│   │   ├── config/               ← zod schema + loader (atomic writes)
│   │   ├── charters/loader.ts    ← frontmatter parsing, override chain
│   │   ├── skills/loader.ts      ← pack/name + Squad <name>/SKILL.md shape
│   │   ├── prereqs/              ← matrix, detection, install flow
│   │   ├── runtime/              ← provider (Copilot SDK), coordinator, tools, routingTable
│   │   ├── audit/                ← JSONL writer + reader
│   │   ├── content/              ← embedded into the binary
│   │   │   ├── agents/           ← 13 charters: supervisor, discover, triage, analyze, review, plan, fix, validate, publish, author, auth, record, report
│   │   │   ├── skills/
│   │   │   │   ├── core/         ← Playwright core guides (locators, assertions, fixtures, …)
│   │   │   │   ├── ci/           ← CI/CD guides
│   │   │   │   ├── pom/          ← Page Object Model
│   │   │   │   ├── playwright-cli/  ← codegen, traces, devices
│   │   │   │   ├── kusto/SKILL.md   ← Kusto queries (Squad-shape)
│   │   │   │   ├── ado/SKILL.md     ← ADO work items + PRs (Squad-shape)
│   │   │   │   └── a11y/SKILL.md    ← Accessibility (Squad-shape)
│   │   │   ├── master-prompt.md
│   │   │   ├── routing.md
│   │   │   ├── ceremonies.md
│   │   │   ├── team.md
│   │   │   └── config.example.json
│   │   └── utils/                ← paths, colours, files, prompts, banner
│   └── tests/                    ← vitest suite
└── portal/                       ← @pwagent/portal (private)
    ├── package.json
    ├── middleware.ts             ← loopback enforcement + X-Forwarded-* rejection
    ├── app/                      ← Next.js 15 App Router
    │   ├── api/                  ← /api/jobs, /api/jobs/[id], /api/events/jobs/[id] (SSE), /api/audit/export
    │   ├── jobs/                 ← jobs page + jobs/[id]/page.tsx (with live SSE tail)
    │   ├── audit/                ← audit log viewer
    │   ├── config/               ← form editor with diff preview
    │   └── …
    ├── components/
    │   ├── sidebar.tsx           ← SidebarProvider + Sidebar + SidebarInset
    │   ├── header.tsx            ← breadcrumb + actions
    │   ├── footer.tsx            ← version + scheduler status
    │   └── ui/                   ← shadcn/ui — button, card, separator, tooltip
    └── lib/                      ← jobs, paths, charter helpers, auth

Workspace override directories

When a user runs pwagent inside a repo, charter / skill / routing overrides resolve in this order (later wins):

1. <dist>/content/                 (embedded — baked into the binary)
2. ~/.pwagent/                     (per-machine user override)
3. <cwd>/.squad/                   (Squad-scaffolded — for `npx squad init` workspaces)
4. <cwd>/.pwagent/                 (preferred workspace convention — wins)

This makes pwagent fully compatible with workspaces scaffolded by npx @bradygaster/squad-cli init while preferring our .pwagent/ directory when users build their own.


Sequence diagrams

Chat launch — pwagent opens the Squad shell

sequenceDiagram
    actor User
    participant Term as Terminal
    participant Bin as pwagent binary
    participant FS as filesystem
    participant Squad as @bradygaster/squad-cli (Ink TUI)

    User->>Term: pwagent
    Term->>Bin: argv=["pwagent"]; stdin.isTTY=true
    Bin->>Bin: route to squad-host.ts

    Bin->>FS: exists(.pwagent/)?
    alt missing — first run
        Bin->>FS: cp embedded/{agents,skills,routing.md,...} → .pwagent/
        Bin-->>Term: "scaffolded .pwagent/"
    end

    Bin->>FS: rm -rf .squad/
    Bin->>FS: cp .pwagent/ → .squad/  (mirror)
    Bin-->>Term: "launching Squad shell…"

    Bin->>Squad: spawn node dist/cli-entry.js (SQUAD_BRAND_* env, stdio inherit)
    Squad->>FS: read .squad/agents/, skills/, team.md
    Squad-->>User: PWAGENT banner · agent roster · ◆ pwagent> prompt

    Note over User,Squad: User types free text or @agent · / for slash commands.

    User->>Squad: /quit
    Squad-->>Bin: exit 0
    Bin->>Term: process.exit(0)
Loading

Bootstrap — first-time setup on a new machine

sequenceDiagram
    actor User
    participant Term as Terminal
    participant Bin as pwagent binary
    participant Gh as gh CLI
    participant SDK as @github/copilot-sdk

    User->>Term: pwagent prereqs --install
    Term->>Bin: detect each prereq
    Bin-->>Term: report (node ✓, git ✓, gh ✗, gh-auth ✗, ...)
    Bin->>Term: prompt: install gh via winget? [Y/n]
    User->>Term: y
    Bin->>Term: winget install GitHub.cli
    Term-->>Bin: ok

    User->>Term: pwagent login
    Bin->>Gh: gh auth login --web --scopes read:user
    Gh-->>User: open browser
    User->>Gh: complete OAuth
    Gh-->>Bin: token stored in gh keychain

    User->>Term: pwagent init
    Bin->>User: prompt model + ADO org/project
    User-->>Bin: claude-sonnet-4.5, contoso, Engineering
    Bin->>Bin: write ~/.pwagent/config.json (atomic)

    User->>Term: pwagent doctor
    Bin->>SDK: import; resolve auth via gh
    SDK-->>Bin: reachable
    Bin-->>User: Ready. (features matrix)
Loading

Agent invocation

sequenceDiagram
    actor User
    participant CLI as pwagent run
    participant Coord as Coordinator
    participant Charter as charter.md (triage)
    participant Skills as skills/*.md
    participant SDK as Copilot SDK
    participant Tools as Tool runner
    participant Scribe

    User->>CLI: pwagent run triage --run-id 12345
    CLI->>Coord: dispatch via routing.md
    Coord->>Coord: pick response mode (Standard)
    Coord->>Charter: load triage charter
    Coord->>Skills: skill-aware match → core/flaky-tests.md
    Coord->>SDK: createSession({ systemMessage: charter + skill, tools, streaming: true })

    SDK-->>Coord: session ready
    Coord->>SDK: send({ prompt: "classify run 12345" })

    loop tool loop
        SDK-->>Tools: tool.execution_start(read_file)
        Tools-->>SDK: result
        SDK-->>Coord: assistant.message_delta (stream)
        Coord-->>Scribe: append to log
    end

    SDK-->>Coord: session.idle
    Coord-->>User: verdict: ProductBug (confidence 0.82) — needs HITL stamp
    Coord->>Scribe: save artifact + transcript
Loading

Scheduler tick

sequenceDiagram
    participant Loop as pwagent scheduler<br/>(@bradygaster/squad-scheduler)
    participant Store as state.json
    participant Lock as <id>.lock
    participant Disp as squad-scheduler runner
    participant CLI as squad / shell command
    participant Events as <id>.jsonl

    loop every 5s
        Loop->>Store: which jobs are due?
        alt one or more due
            Loop->>Lock: acquire <job-id>.lock
            Loop->>Events: job_start
            Loop->>Disp: spawn squad @agent / command
            Disp->>CLI: child_process.spawn
            CLI-->>Disp: exit code + stdout
            alt success
                Disp->>Events: job_end (ok)
                Loop->>Store: update lastRunAt + nextDueAt
            else failure
                Disp->>Events: job_error
                Loop->>Store: ++consecutiveFailures
                alt failures >= disableAfterFailures
                    Loop->>Store: set enabled=false
                    Loop->>Events: job_auto_disabled
                end
            end
            Loop->>Lock: release
        end
    end
Loading

Portal browsing agents

sequenceDiagram
    actor User
    participant Browser
    participant Next as Next.js SSR
    participant FS as filesystem
    participant Dist as ../dist/content/agents/

    User->>Browser: GET http://127.0.0.1:7337/agents
    Browser->>Next: request /agents
    Next->>FS: stat candidate roots
    FS-->>Next: <repo>/dist/content/agents exists
    Next->>FS: readdir + read each charter.md
    FS-->>Next: 10 charters with frontmatter
    Next->>Next: parse with gray-matter, sort
    Next-->>Browser: SSR HTML (card per charter)
    Browser-->>User: render with sidebar + header + footer
Loading

Status & roadmap

v0.3 — current

Capability Status
npm workspaces monorepo — single npm install covers cli/ + portal/
Simplified roster — 10 specialist agents (generate, heal, plan, scenario, report, validate, auth, triage, review, supervisor)
Skill packs reorganizedkusto/SKILL.md, ado/SKILL.md, a11y/SKILL.md (Squad-shape) split from external/ and core/
Workspace override priority.pwagent/ preferred, .squad/ fallback for Squad-scaffolded repos
Squad SKILL.md shape recognizednpx squad init skills load verbatim
Sample configpwagent.config.example.json at repo root
CLI entrypoint, sub-commands
Embedded 10 charters + 60+ skill guides across 7 packs
Master prompt + routing.md / ceremonies.md / team.md embedded
Charter / skill loaders with workspace > user > embedded override chain
Prereqs matrix, detection, interactive install (winget/brew/apt/dnf/pacman)
gh login / whoami / logout + gh-login installer kind
Config CRUD with zod validation + atomic writes
Coordinator runtime: charter loading, model selection, tool allowlist, skill-aware injection
Tool sandbox: read, write, edit, bash (allowlisted binaries), grep (rg-aware)
@github/copilot-sdk provider adapter (streaming, tool loop, idle timeout)
Scheduler: tick loop, locks, hot-reload, retry, auto-disable, JSONL events, 4 seed jobs
Audit: JSONL append, audit tail, audit export with filters
pwagent review HITL stamp loop with persistent queue
pwagent ralph in-session driver
pwagent portal start launcher for Next.js dashboard
Portal: shell (header, footer, collapsible sidebar with shadcn/ui)
Portal: live /api/jobs, /api/jobs/[id], /api/events/jobs/[id] (SSE)
Portal: Server Actions — enable/disable, fire-now
Portal: live jobs list + job-detail page with real-time event tail
Portal: Playwright CLI page
64+ vitest tests across config / charters / skills / prereqs / scheduler / CLI smoke

v0.4 — what just landed

Capability Status
R4pwagent doctor live Copilot reachability probe (imports + connects the SDK, categorises failure: sdk-missing / auth-missing / unknown / ok). --no-probe skips it; --probe-timeout <ms> tunes it
Doctor clean-exit — probe cleanup + process.exit so the shell prompt returns immediately on completion
Detector resilience — trust version regex over non-zero exit (fixes false-negative for npx playwright --version deprecation shim)
PT4 audit/audit reads ~/.pwagent/audit/events.jsonl, filters by time / type / agent / free-text, downloads as .jsonl or .json via /api/audit/export
PT4 config/config form editor with diff preview before save, per-field zod-style validation, atomic write to ~/.pwagent/config.json via Server Actions
PT5 bearer auth — per-install secret at ~/.pwagent/portal/secret (chmod 600). HttpOnly cookie pwagent_session; every Server Action calls requireWriteAuth()
PT5 loopbackportal/middleware.ts rejects non-loopback hosts + X-Forwarded-* headers. --bind-all flag opts out behind a trusted proxy
PT5 --read-onlypwagent portal start --read-only sets PWAGENT_PORTAL_READ_ONLY=1; Server Actions short-circuit; banner shown across pages
PT5 service installerspwagent service install / uninstall / status writes Task Scheduler XML (Windows), launchd plist (macOS), or systemd --user unit (Linux). Calls pwagent scheduler start from the platform service.

Contributing

Dev workflow

# CLI
cd D:\gith\pwagent\cli
npm run dev          # tsx src/index.ts -- <args>
npm run typecheck
npm run lint
npm test
npm test -- --watch

# Portal
cd D:\gith\pwagent\portal
npm run dev          # http://127.0.0.1:7337
npm run build        # production build
npm run typecheck

Editing charters

Charters live in cli/src/content/agents/. Each is a charter.md style file with name: and description: frontmatter. After editing, run npm run build to copy them into dist/content/.

Adding a skill

  1. Drop a *.md file into the appropriate pack under cli/src/content/skills/.
  2. Include frontmatter:
    ---
    name: my-skill
    description: One-line summary; the coordinator uses this for skill-aware spawn.
    ---
  3. npm run build. pwagent skills list should show it.

Adding a prereq

Edit cli/src/prereqs/matrix.ts — append a Prereq object with id, label, reason, tier, detect, installers, unlocks. Tests in cli/tests/prereqs.test.ts will catch incomplete shapes.

Charter authoring contract

A charter is a single Markdown file with stable section headers. Example skeleton:

---
name: triage
description: Classify a Playwright failure into ProductBug / TestCodeBug / Environment / Inconclusive.
---

# Triage

## Identity
- **Role:** Failure triage
- **Project:** pwagent

## Responsibilities
- Read the failure artifact (trace, screenshot, console log)
- Emit a verdict + confidence + brief rationale
- ...

## Boundaries
- You do NOT patch code. That's the heal agent.

## Tools
- read, grep, bash (npx playwright show-trace)

## Model
- Preferred: claude-sonnet-4.5

Common pitfalls when authoring:

Pitfall Why it bites Fix
Renaming the ## Members header in team.md GitHub workflows hard-parse this header Keep it exactly ## Members
Writing behaviour into the master prompt instead of the charter Charters are reloaded per spawn; master-prompt changes affect the coordinator only Put agent rules in charter.md
Forgetting ## Boundaries Agents will happily do each other's jobs Add explicit "You do NOT do X"
Skipping the routing.md entry Coordinator falls back to "best guess" routing Add a routing row for every new agent
Overriding Scribe or Ralph They're framework-owned Don't write charters for them beyond what ships
Long sequential spawns Wastes wall-clock time Spawn independent agents in ONE turn

License

MIT. See LICENSE.


External references

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors