skills: agent skills home (devops-bench-review + run-parallel-evals + run-eval) #132

Merged

pradeepvrd merged 1 commit into feat/bastion-matrix from skills/agent-skills

Jun 27, 2026

pradeepvrd commented Jun 25, 2026 (edited)

Collaborator

Summary

A dedicated home for agent skills / guidelines so they can evolve independently of feature PRs.
Stacked on #131 (feat/bastion-matrix).

Three skills:

- devops-bench-review (new): a review-only pass over a PR or the current workspace through
  four lenses: correctness; parallel-safety across the eval matrix axes (Task × Model ×
  AgentConfig), which is the emphasis and comes with a shared-state checklist and per-axis
  reasoning; task & stack conventions; and docs conventions. It analyzes statically and may run
  unit tests and ruff lint + format checks, but never runs benchmark evals or provisions infra.
- run-parallel-evals: drives the full parallel matrix; harness-agnostic, with an Antigravity
  portability map and local/remote execution modes. (Relocated here so all skills sit together.)
- run-eval (new): drives a single Task × Model × AgentConfig run end to end (a 1×1×1
  matrix); reuses run-parallel-evals' wrappers and recovery/reference files.
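
To make the relationship between the two run skills concrete, here is a minimal sketch of expanding the Task × Model × AgentConfig axes into individual combos. The `Combo` class, the `expand_matrix` helper, and the model and agent-config names are hypothetical and are not taken from the repository; only the axis names and the two task names come from this PR.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Combo:
    """One cell of the eval matrix (hypothetical type, for illustration)."""
    task: str
    model: str
    agent_config: str


def expand_matrix(tasks, models, agent_configs):
    """Return one Combo per cell of the Task x Model x AgentConfig matrix."""
    return [Combo(t, m, a) for t, m, a in product(tasks, models, agent_configs)]


# Axis values below are placeholders, not the project's real configuration.
full = expand_matrix(
    ["deploy-hello-app", "optimize-scale"],
    ["model-a", "model-b"],
    ["baseline", "with-tools"],
)
single = expand_matrix(["deploy-hello-app"], ["model-a"], ["baseline"])

print(len(full))    # 2 * 2 * 2 = 8 combos
print(len(single))  # a 1x1x1 matrix is exactly one run
```

Under this reading, a single run is just the degenerate matrix, which would explain why run-eval can reuse the run-parallel-evals wrappers.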

Each skill is a source dir under .agents/skills/<name>/ plus a .claude/skills/ discovery symlink
(both force-added, since .agents/ and .claude/ are listed in .git/info/exclude).
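
The layout described above can be sketched as follows. This is an illustration under assumptions: the `link_skill` helper is hypothetical, and the use of a relative symlink target is a guess rather than something this PR states. It runs against a temporary directory, not a real checkout, and assumes a POSIX filesystem.

```python
import os
import tempfile
from pathlib import Path


def link_skill(repo_root: Path, name: str) -> Path:
    """Create .claude/skills/<name> as a symlink to .agents/skills/<name>."""
    source = repo_root / ".agents" / "skills" / name
    source.mkdir(parents=True, exist_ok=True)
    link_dir = repo_root / ".claude" / "skills"
    link_dir.mkdir(parents=True, exist_ok=True)
    link = link_dir / name
    # Assumption: a relative target, so the link survives cloning elsewhere.
    target = os.path.relpath(source, start=link_dir)
    if not link.is_symlink():
        link.symlink_to(target, target_is_directory=True)
    return link


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    link = link_skill(root, "run-eval")
    expected = (root / ".agents" / "skills" / "run-eval").resolve()
    resolved_ok = link.resolve() == expected
    link_target = os.readlink(link)

print(link_target)  # ../../.agents/skills/run-eval (on POSIX)
```

Because both directories are excluded locally, staging the source dir and the symlink in a real checkout would need `git add -f` on those paths, which matches the "force-added" note.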

Stacking / dependencies

- Base: #131 (feat/bastion-matrix, "feat(bastion): parallel Task×Model×AgentConfig eval matrix + Vertex auth"),
  which provides the docs/bastion.md and scripts/bastion/* that run-parallel-evals and run-eval reference.
- run-parallel-evals also references docs/parallel-evals.md, which lands in #126
  (feat/parallel-eval-runs, "docs(parallel-evals): parallel evaluation runbook + known issues");
  the skill is fully wired once #126 is in the merge path.

pradeepvrd added a commit that referenced this pull request


          skills: relocate run-parallel-evals to the dedicated skills PR

dea9096

The run-parallel-evals skill now lives in the standalone agent-skills PR (#132)
so skills evolve independently of this feature branch. The docs it references
(docs/parallel-evals.md, docs/bastion.md) and scripts/bastion/* remain here.

pradeepvrd added a commit that referenced this pull request


          skills: drop dangling .claude/skills symlink for relocated skill

f453b24

The run-parallel-evals skill (and its .claude/skills discovery symlink) now live
in the standalone agent-skills PR (#132).

pradeepvrd force-pushed the skills/agent-skills branch from e3dd11b to da32df0 Compare

June 25, 2026 18:10

pradeepvrd changed the base branch from main to feat/bastion-matrix

June 25, 2026 18:10

jessie1111101 mentioned this pull request

feat(tasks): make all benchmark tasks parallel-run safe #133

Closed

pradeepvrd force-pushed the skills/agent-skills branch from da32df0 to f1c4194 Compare

June 25, 2026 20:29

jessie1111101 added a commit that referenced this pull request


          feat(tasks): make all benchmark tasks parallel-run safe

7fabfda

Stacked on #132 (skills/agent-skills). Each matrix combo provisions its own
cluster; this makes every task collision-free under concurrent runs:

- 6 manifest-gen tasks -> deployer: noop (no cluster); legacy factory honors noop
- optimize-scale: new prebuilt/optimize-scale GKE stack + pre-seeded workload;
  matrix pins TARGET_DEPLOYMENT_NAME/NAMESPACE so both arms agree
- deploy-hello-app: run-unique Artifact Registry repo name
- per-run tofu stack-dir copy (both arms) removes the shared .terraform.lock race
  (resolves the 'Shared OpenTofu working directory' known-issue)
- import + parallel-fix the merged complex/GKE tasks (#64 migration, #87 opa,
  #93 multi-region, #86 postgres/unhealthy/gitops, #76 debug-crashloop):
  per-run GitOps repo paths, dropped shared-SA container.admin (BYO creds),
  region-prefixed cluster names (avoid node-SA substr collision), unique task_id
- cp-recovery documented as the kind-only exception (docs/bastion.md)
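
The "per-run tofu stack-dir copy" item in the commit message above can be illustrated with a minimal sketch. It assumes the fix amounts to copying the stack directory into a run-private location before OpenTofu runs there; the `per_run_stack_dir` helper, the directory layout, and the run IDs are hypothetical. The lock file is written by hand to stand in for what `tofu init` would create (OpenTofu names it `.terraform.lock.hcl`).

```python
import shutil
import tempfile
from pathlib import Path


def per_run_stack_dir(stack_src: Path, runs_root: Path, run_id: str) -> Path:
    """Copy a stack directory into a working directory private to one run."""
    dest = runs_root / run_id / stack_src.name
    shutil.copytree(stack_src, dest)  # creates missing parent directories
    return dest


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    stack = root / "stacks" / "optimize-scale"
    stack.mkdir(parents=True)
    (stack / "main.tf").write_text("# placeholder stack\n")

    # Two concurrent combos each get their own copy (run IDs are made up).
    run_a = per_run_stack_dir(stack, root / "runs", "run-0001")
    run_b = per_run_stack_dir(stack, root / "runs", "run-0002")

    # Simulate run A writing its lock file; run B and the source are untouched.
    (run_a / ".terraform.lock.hcl").write_text("locked by run A\n")
    b_has_lock = (run_b / ".terraform.lock.hcl").exists()
    src_has_lock = (stack / ".terraform.lock.hcl").exists()
    b_has_stack = (run_b / "main.tf").exists()

print(b_has_lock, src_has_lock)  # False False
```

With no working directory shared between combos, concurrent runs have no common lock file to race on.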

pradeepvrd force-pushed the feat/bastion-matrix branch from e64394a to 7ac941f Compare

June 26, 2026 01:10

pradeepvrd added a commit that referenced this pull request


          skills: relocate run-parallel-evals to the dedicated skills PR

4f7677f


pradeepvrd added a commit that referenced this pull request


          skills: drop dangling .claude/skills symlink for relocated skill

33988b3


pradeepvrd force-pushed the skills/agent-skills branch from 7588a3e to ed54798 Compare

June 26, 2026 01:11

pradeepvrd pushed a commit that referenced this pull request


          feat(tasks): make all benchmark tasks parallel-run safe

e9bb869


pradeepvrd added a commit that referenced this pull request


          skills: relocate run-parallel-evals to the dedicated skills PR

740ef9d


pradeepvrd added a commit that referenced this pull request


          skills: drop dangling .claude/skills symlink for relocated skill


pradeepvrd force-pushed the skills/agent-skills branch from 6b2d4b4 to 6fef3ff Compare

June 26, 2026 04:21

pradeepvrd force-pushed the feat/bastion-matrix branch from a1b6078 to b313cdf Compare

June 26, 2026 21:49

pradeepvrd added a commit that referenced this pull request


          skills: relocate run-parallel-evals to the dedicated skills PR

de43144


pradeepvrd added a commit that referenced this pull request


          skills: drop dangling .claude/skills symlink for relocated skill

e4f1eb5


pradeepvrd force-pushed the skills/agent-skills branch from 6fef3ff to 121e7fb Compare

June 26, 2026 21:49

pradeepvrd pushed a commit that referenced this pull request


          feat(tasks): make all benchmark tasks parallel-run safe

d915c63


pradeepvrd force-pushed the feat/bastion-matrix branch from b313cdf to e9c740d Compare

June 26, 2026 22:22

pradeepvrd added a commit that referenced this pull request


          skills: relocate run-parallel-evals to the dedicated skills PR

42e0918


pradeepvrd added a commit that referenced this pull request


          skills: drop dangling .claude/skills symlink for relocated skill

90f8350


pradeepvrd force-pushed the skills/agent-skills branch from 121e7fb to 377a5aa Compare

June 26, 2026 22:22

pradeepvrd pushed a commit that referenced this pull request


          feat(tasks): make all benchmark tasks parallel-run safe

e5a819c


pradeepvrd force-pushed the feat/bastion-matrix branch from e9c740d to 460cc45 Compare

June 27, 2026 01:54

pradeepvrd added a commit that referenced this pull request


          docs(parallel-evals): parallel evaluation runbook + known issues

8e074c9

Add docs/parallel-evals.md: the end-to-end parallel-evaluation runbook (matrix CUJs,
parallel-safety rules, resume-after-drop, Vertex setup), known issues from review
findings, and the local-default / BENCH_REMOTE execution note. Docs-only; the
run-parallel-evals skill lives in the skills PR (#132).


          skills: agent skills home (devops-bench-review + run-parallel-evals + run-eval)

fdb29d1

A dedicated home for agent skills so they evolve independently of feature PRs:
- devops-bench-review (new): review-only review across correctness, parallel-safety
  across the eval matrix axes (Task × Model × AgentConfig), task/stack conventions,
  and docs conventions; runs unit tests / ruff only — never evals or infra.
- run-parallel-evals: relocated here so all skills sit together; harness-agnostic with
  an Antigravity portability map and local/remote execution modes.
- run-eval (new): drive a single Task × Model × AgentConfig run end to end (a 1×1×1
  matrix); reuses run-parallel-evals' wrappers and recovery/reference files.
Each skill is a source dir under .agents/skills/<name>/ plus a .claude/skills/ discovery
symlink (force-added; .agents/.claude are git-excluded).

pradeepvrd force-pushed the skills/agent-skills branch from 377a5aa to fdb29d1 Compare

June 27, 2026 01:55

pradeepvrd mentioned this pull request

docs(parallel-evals): parallel evaluation runbook + known issues #126

Merged

pradeepvrd changed the title ~~skills: agent skills home (devops-bench-review + run-parallel-evals)~~ skills: agent skills home (devops-bench-review + run-parallel-evals + run-eval)

pradeepvrd merged commit 9e3207f into feat/bastion-matrix

1 check passed

pradeepvrd mentioned this pull request

feat: parallel evaluation harness — isolation, bastion matrix, skills & docs (consolidated) #128

Merged

pradeepvrd added a commit that referenced this pull request


          docs(parallel-evals): parallel evaluation runbook + known issues

eb8b318


pradeepvrd added a commit that referenced this pull request


          docs(parallel-evals): parallel evaluation runbook + known issues

a0ba40e

