Add task-generation framework + first task (debug-crashloop)#76

Open
adrianchung wants to merge 1 commit into
gke-labs:mainfrom
adrianchung:add-task-generation-framework

@adrianchung adrianchung commented Jun 15, 2026

Collaborator

Summary

A repeatable way to generate DevOps Bench tasks from an expert catalog and run them against the framework, plus the first validated task.

Scope

  • docs/task-generation/ — methodology (schema, ID allocation, expected_output rules, task classes, cluster access via the GKE MCP server), the expert task catalog (source of truth) + generation tracker, and a run + leaderboard guide
  • AGENTS.md — vendor-neutral agent guidance pointing at the methodology
  • tasks/generic/debug-crashloop/ — first generated task + a CrashLoopBackOff fixture (root cause: missing DATABASE_URL)
  • pkg/agents/runner/api/mcp_client.py — forward the environment to the MCP server subprocess so it inherits KUBECONFIG/cloud creds (otherwise the MCP server can't resolve the target cluster's kubeconfig context)
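The failure mode behind the debug-crashloop fixture can be illustrated with a minimal sketch. It assumes a hypothetical container entrypoint (`main`) that refuses to start when `DATABASE_URL` is missing; the actual fixture in tasks/generic/debug-crashloop/ may be built differently. A process that exits non-zero on every start is what Kubernetes reports as CrashLoopBackOff once it keeps restarting the container.

```python
import os
import sys


def load_database_url(env):
    """Return DATABASE_URL from the given mapping, or raise if it is unset or empty."""
    url = env.get("DATABASE_URL")
    if not url:
        raise RuntimeError("DATABASE_URL is not set")
    return url


def main(env=None):
    """Hypothetical entrypoint: return a non-zero exit code when config is missing."""
    env = os.environ if env is None else env
    try:
        load_database_url(env)
    except RuntimeError as exc:
        # The message lands in the container logs, which is where an agent
        # debugging the pod would be expected to find the root cause.
        print(f"fatal: {exc}", file=sys.stderr)
        return 1
    return 0
```

Calling `sys.exit(main())` from the container command would make the pod fail at startup until the variable is added to the workload's environment.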

A repeatable way to generate DevOps Bench tasks from an expert catalog and run
them, plus the first validated task:

- docs/task-generation/: methodology (task.yaml schema, ID allocation,
  expected_output rules, task classes, cluster access via the GKE MCP server),
  the expert task catalog as source of truth with a generation tracker, and a
  run + leaderboard guide.
- AGENTS.md: vendor-neutral agent guidance pointing at the methodology.
- tasks/generic/debug-crashloop/: first generated task with a CrashLoopBackOff
  fixture whose root cause is a missing DATABASE_URL env var.
- pkg/agents/runner/api/mcp_client.py: forward the environment to the MCP server
  subprocess so it inherits KUBECONFIG and cloud credentials; without this the
  MCP server cannot resolve the target cluster's kubeconfig context.

# otherwise launches the server with a stripped default environment, which
# leaves it unable to resolve the target cluster's kubeconfig context.
server_params = StdioServerParameters(
    command=self.server_path, env=os.environ.copy()
)

Collaborator


This can be dangerous, especially since the agent will have access to the full environment context. I recommend being selective about what is passed to the server.
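One way to act on this suggestion is an explicit allowlist, sketched below. The helper `build_server_env` and the tuple `FORWARDED_ENV_VARS` are hypothetical names, and the particular variables listed are an assumption about what the MCP server needs, not something this PR specifies.

```python
import os

# Hypothetical allowlist: which variables the MCP server really needs is an
# assumption and should be confirmed against the target environment.
FORWARDED_ENV_VARS = (
    "PATH",
    "HOME",
    "KUBECONFIG",
    "GOOGLE_APPLICATION_CREDENTIALS",
    "CLOUDSDK_CONFIG",
)


def build_server_env(source=None, allowed=FORWARDED_ENV_VARS):
    """Return only the allowlisted variables that are present in `source`."""
    source = os.environ if source is None else source
    return {name: source[name] for name in allowed if name in source}
```

The result could then replace the full copy, for example `env=build_server_env()` in place of `env=os.environ.copy()`, so that unrelated secrets in the runner's environment never reach the server subprocess.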

jessie1111101 added a commit that referenced this pull request Jun 25, 2026
Stacked on #132 (skills/agent-skills). Each matrix combo provisions its own
cluster; this makes every task collision-free under concurrent runs:

- 6 manifest-gen tasks -> deployer: noop (no cluster); legacy factory honors noop
- optimize-scale: new prebuilt/optimize-scale GKE stack + pre-seeded workload;
  matrix pins TARGET_DEPLOYMENT_NAME/NAMESPACE so both arms agree
- deploy-hello-app: run-unique Artifact Registry repo name
- per-run tofu stack-dir copy (both arms) removes the shared .terraform.lock race
  (resolves the 'Shared OpenTofu working directory' known-issue)
- import + parallel-fix the merged complex/GKE tasks (#64 migration, #87 opa,
  #93 multi-region, #86 postgres/unhealthy/gitops, #76 debug-crashloop):
  per-run GitOps repo paths, dropped shared-SA container.admin (BYO creds),
  region-prefixed cluster names (avoid node-SA substr collision), unique task_id
- cp-recovery documented as the kind-only exception (docs/bastion.md)
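The per-run stack-dir copy and run-unique naming described in this commit message can be sketched roughly as follows. The function names (`copy_stack_dir`, `unique_name`, `make_run_id`) and the directory layout are hypothetical; the referenced commit may implement the same idea differently.

```python
import shutil
import tempfile
import uuid
from pathlib import Path


def make_run_id():
    """Return a short random identifier for one benchmark run."""
    return uuid.uuid4().hex[:8]


def copy_stack_dir(stack_dir, run_id, work_root=None):
    """Copy a stack directory to a per-run location and return the new path.

    Each run then works in its own directory, so concurrent runs do not
    contend for files written into a shared working directory.
    """
    work_root = Path(work_root) if work_root else Path(tempfile.gettempdir())
    dest = work_root / f"{Path(stack_dir).name}-{run_id}"
    shutil.copytree(stack_dir, dest)
    return dest


def unique_name(base, run_id):
    """Derive a run-unique resource name, e.g. for a repository or cluster."""
    return f"{base}-{run_id}"
```

The same run identifier can feed both helpers, so the working directory and the cloud resource names created by one matrix combination stay distinct from those of any other.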
pradeepvrd pushed a commit that referenced this pull request Jun 26, 2026
Stacked on #132 (skills/agent-skills). Each matrix combo provisions its own
cluster; this makes every task collision-free under concurrent runs:

- 6 manifest-gen tasks -> deployer: noop (no cluster); legacy factory honors noop
- optimize-scale: new prebuilt/optimize-scale GKE stack + pre-seeded workload;
  matrix pins TARGET_DEPLOYMENT_NAME/NAMESPACE so both arms agree
- deploy-hello-app: run-unique Artifact Registry repo name
- per-run tofu stack-dir copy (both arms) removes the shared .terraform.lock race
  (resolves the 'Shared OpenTofu working directory' known-issue)
- import + parallel-fix the merged complex/GKE tasks (#64 migration, #87 opa,
  #93 multi-region, #86 postgres/unhealthy/gitops, #76 debug-crashloop):
  per-run GitOps repo paths, dropped shared-SA container.admin (BYO creds),
  region-prefixed cluster names (avoid node-SA substr collision), unique task_id
- cp-recovery documented as the kind-only exception (docs/bastion.md)