skills: agent skills home (devops-bench-review + run-parallel-evals + run-eval)#132

Merged: pradeepvrd merged 1 commit into feat/bastion-matrix from skills/agent-skills on Jun 27, 2026

@pradeepvrd pradeepvrd commented Jun 25, 2026

Collaborator

Summary

A dedicated home for agent skills / guidelines so they can evolve independently of feature PRs.
Stacked on #131 (feat/bastion-matrix).

Three skills:

  • devops-bench-review (new): a review-only pass over a PR or the current workspace across
    four lenses: correctness; parallel-safety across the eval matrix axes (Task × Model ×
    AgentConfig), which is the emphasis and comes with a shared-state checklist and per-axis
    reasoning; task & stack conventions; and docs conventions. It analyzes statically and may
    run unit tests / ruff lint+format checks, but never runs benchmark evals or provisions infra.
  • run-parallel-evals: drives the full parallel matrix; harness-agnostic with an Antigravity
    portability map and local/remote execution modes. (Relocated here so all skills sit together.)
  • run-eval (new): drives a single Task × Model × AgentConfig run end to end (a 1×1×1
    matrix); reuses run-parallel-evals' wrappers and recovery/reference files.
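
To make the matrix wording concrete, here is a minimal sketch of how the three axes expand into individual runs, and why a single run is the 1×1×1 case. The model and agent-config values and the `expand_matrix` helper are hypothetical placeholders; the skills' real wrappers may enumerate combinations differently.

```python
from itertools import product

# Axis values for illustration only. The task names appear in this PR thread;
# the model and agent-config names are hypothetical.
tasks = ["deploy-hello-app", "optimize-scale"]
models = ["model-a", "model-b"]
agent_configs = ["baseline"]


def expand_matrix(tasks, models, agent_configs):
    """Return every (task, model, agent_config) combination as a list of tuples."""
    return list(product(tasks, models, agent_configs))


# Full parallel matrix: 2 tasks x 2 models x 1 agent config.
full = expand_matrix(tasks, models, agent_configs)

# Single run: the same expansion with one value per axis (a 1x1x1 matrix).
single = expand_matrix(["deploy-hello-app"], ["model-a"], ["baseline"])

print(len(full))
print(single)
```

Treating a single run as a degenerate matrix is what would let run-eval reuse the same wrappers as run-parallel-evals rather than keep a separate code path.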

Each skill is a source dir under .agents/skills/<name>/ plus a .claude/skills/ discovery symlink
(both force-added, since .agents/.claude are in .git/info/exclude).
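
The layout above can be sketched as follows. This is an illustration under assumptions, not the PR's actual tooling: `link_skill` is a hypothetical helper, and using a relative symlink target is a guess about how the discovery links are built.

```python
import os
import tempfile
from pathlib import Path


def link_skill(repo_root: Path, name: str) -> Path:
    """Create .agents/skills/<name>/ and a .claude/skills/<name> symlink to it."""
    source = repo_root / ".agents" / "skills" / name
    source.mkdir(parents=True, exist_ok=True)

    discovery_dir = repo_root / ".claude" / "skills"
    discovery_dir.mkdir(parents=True, exist_ok=True)

    link = discovery_dir / name
    if not link.is_symlink():
        # Assumption: a relative target, so the link still resolves when the
        # repository is cloned to a different absolute path.
        target = os.path.relpath(source, start=discovery_dir)
        os.symlink(target, link, target_is_directory=True)
    return link


# Demo in a throwaway directory standing in for the repository root.
repo = Path(tempfile.mkdtemp())
link = link_skill(repo, "run-eval")
print(os.readlink(link))
```

Because `.agents` and `.claude` are listed in `.git/info/exclude`, both the source directory and the symlink would then need `git add -f` to be tracked, which matches the force-add noted in the description.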

Stacking / dependencies

pradeepvrd added a commit that referenced this pull request Jun 25, 2026
The run-parallel-evals skill now lives in the standalone agent-skills PR (#132)
so skills evolve independently of this feature branch. The docs it references
(docs/parallel-evals.md, docs/bastion.md) and scripts/bastion/* remain here.
pradeepvrd added a commit that referenced this pull request Jun 25, 2026
The run-parallel-evals skill (and its .claude/skills discovery symlink) now live
in the standalone agent-skills PR (#132).
@pradeepvrd pradeepvrd force-pushed the skills/agent-skills branch from e3dd11b to da32df0 Compare June 25, 2026 18:10
@pradeepvrd pradeepvrd changed the base branch from main to feat/bastion-matrix June 25, 2026 18:10
@pradeepvrd pradeepvrd force-pushed the skills/agent-skills branch from da32df0 to f1c4194 Compare June 25, 2026 20:29
jessie1111101 added a commit that referenced this pull request Jun 25, 2026
Stacked on #132 (skills/agent-skills). Each matrix combo provisions its own
cluster; this makes every task collision-free under concurrent runs:

- 6 manifest-gen tasks -> deployer: noop (no cluster); legacy factory honors noop
- optimize-scale: new prebuilt/optimize-scale GKE stack + pre-seeded workload;
  matrix pins TARGET_DEPLOYMENT_NAME/NAMESPACE so both arms agree
- deploy-hello-app: run-unique Artifact Registry repo name
- per-run tofu stack-dir copy (both arms) removes the shared .terraform.lock race
  (resolves the 'Shared OpenTofu working directory' known-issue)
- import + parallel-fix the merged complex/GKE tasks (#64 migration, #87 opa,
  #93 multi-region, #86 postgres/unhealthy/gitops, #76 debug-crashloop):
  per-run GitOps repo paths, dropped shared-SA container.admin (BYO creds),
  region-prefixed cluster names (avoid node-SA substr collision), unique task_id
- cp-recovery documented as the kind-only exception (docs/bastion.md)
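
Two of the fixes listed above, the per-run stack-dir copy and the run-unique resource name, follow the same pattern: give each matrix combo its own private state. A minimal sketch of that pattern, assuming a hypothetical `prepare_run` helper and a short random run id (the commit's actual naming scheme is not shown in this thread):

```python
import shutil
import tempfile
import uuid
from pathlib import Path


def prepare_run(stack_dir: Path, work_root: Path, base_name: str):
    """Copy the stack into a run-private directory and derive a run-unique name."""
    run_id = uuid.uuid4().hex[:8]  # hypothetical id scheme
    run_dir = work_root / f"{stack_dir.name}-{run_id}"
    # Each run gets its own copy, so concurrent runs never share a working
    # directory or the lock file inside it.
    shutil.copytree(stack_dir, run_dir)
    return run_dir, f"{base_name}-{run_id}"


# Demo with a placeholder stack in a throwaway directory.
root = Path(tempfile.mkdtemp())
stack = root / "stack"
stack.mkdir()
(stack / "main.tf").write_text("# placeholder\n")

run_a = prepare_run(stack, root / "runs", "hello-app")
run_b = prepare_run(stack, root / "runs", "hello-app")
print(run_a[1], run_b[1])
```

Copying costs some disk and startup time per run, but it removes shared mutable state instead of trying to serialize access to it.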
@pradeepvrd pradeepvrd force-pushed the feat/bastion-matrix branch from e64394a to 7ac941f Compare June 26, 2026 01:10
@pradeepvrd pradeepvrd force-pushed the skills/agent-skills branch from 7588a3e to ed54798 Compare June 26, 2026 01:11
@pradeepvrd pradeepvrd force-pushed the skills/agent-skills branch from 6b2d4b4 to 6fef3ff Compare June 26, 2026 04:21
@pradeepvrd pradeepvrd force-pushed the feat/bastion-matrix branch from a1b6078 to b313cdf Compare June 26, 2026 21:49
@pradeepvrd pradeepvrd force-pushed the skills/agent-skills branch from 6fef3ff to 121e7fb Compare June 26, 2026 21:49
@pradeepvrd pradeepvrd force-pushed the feat/bastion-matrix branch from b313cdf to e9c740d Compare June 26, 2026 22:22
@pradeepvrd pradeepvrd force-pushed the skills/agent-skills branch from 121e7fb to 377a5aa Compare June 26, 2026 22:22
@pradeepvrd pradeepvrd force-pushed the feat/bastion-matrix branch from e9c740d to 460cc45 Compare June 27, 2026 01:54
pradeepvrd added a commit that referenced this pull request Jun 27, 2026
Add docs/parallel-evals.md: the end-to-end parallel-evaluation runbook (matrix CUJs,
parallel-safety rules, resume-after-drop, Vertex setup), known issues from review
findings, and the local-default / BENCH_REMOTE execution note. Docs-only; the
run-parallel-evals skill lives in the skills PR (#132).
… run-eval)

A dedicated home for agent skills so they evolve independently of feature PRs:
- devops-bench-review (new): review-only review across correctness, parallel-safety
  across the eval matrix axes (Task × Model × AgentConfig), task/stack conventions,
  and docs conventions; runs unit tests / ruff only — never evals or infra.
- run-parallel-evals: relocated here so all skills sit together; harness-agnostic with
  an Antigravity portability map and local/remote execution modes.
- run-eval (new): drive a single Task × Model × AgentConfig run end to end (a 1×1×1
  matrix); reuses run-parallel-evals' wrappers and recovery/reference files.
Each skill is a source dir under .agents/skills/<name>/ plus a .claude/skills/ discovery
symlink (force-added; .agents/.claude are git-excluded).
@pradeepvrd pradeepvrd changed the title skills: agent skills home (devops-bench-review + run-parallel-evals) skills: agent skills home (devops-bench-review + run-parallel-evals + run-eval) Jun 27, 2026
@pradeepvrd pradeepvrd merged commit 9e3207f into feat/bastion-matrix Jun 27, 2026
1 check passed