
Plan: Kubernetes-native subprocess + pod-exec worker hosts (#291) #316

Merged: ealt merged 1 commit into main from plan/issue-291-k8s-worker-modes on Jun 12, 2026

ealt (Owner) commented Jun 12, 2026

What this is

Plan for #291 (plan stage only — no implementation). Adds docs/plans/eden-phase-13f-k8s-worker-modes.md, the chunk plan for bringing user-supplied *_command worker modes to the Helm/Kubernetes substrate: the merged 13a chart runs scripted-mode reference workers only, so today "infra validates on EKS" but real experiments can't run there.

What the plan resolves

Review

Plan-stage codex-review (plan profile), 4 rounds to convergence; record committed under docs/plans/review/eden-phase-13f-k8s-worker-modes/20260611T150217/. Rounds 0–3 caught: Forgejo-credential exposure in the 13b wrapper-push design (fixed via publisher container), missing push read-back ladder, missing claim expiry, hostile-clone trust root (fixed via quarantine fetch), task-pod SA tokens, evaluator sentinel robustness (fixed via reporter container), and the absolute deadline.

What this does NOT cover

  • Any implementation — that's the 13f impl chunk, executed against this plan (waves 1–5 in §9).
  • The physical move of the superseded 13b plan to docs/archive/ (rides 13f's final docs wave; banner added now).
  • Closing Phase 13b — Executor as a k8s Job (GPU node selection) #172 (closes at pod-exec-wave merge per the plan; a pointer comment lands on it at plan-merge).
  • The §11 deferral candidates (per-experiment GPU overrides, podFailurePolicy, static wrapper binary, sentinel nonce + per-task credential Secret, shallow clone, NetworkPolicy, hardened isolation, branch protection) — each named in the plan with issue-filing intent at the wave that makes it real, per the deferral-tracking rule.

Validation

markdownlint (pinned CI version, full sweep) ✓ · scripts/check-rename-discipline.py ✓ · scripts/spec-xref-check.py ✓. Docs-only change; no code paths touched.

Plan for #291 — does NOT close it.

🤖 Generated with Claude Code

Plan for #291. Adds docs/plans/eden-phase-13f-k8s-worker-modes.md:
per-role workers.<role>.mode chart values (subprocess in-pod for all
three roles; pod-exec per-task k8s Jobs for executor + evaluator),
experiment-image story, shared-artifacts-PVC interim (#285/#290
reconciliation), bundled env-isolation + claim-expiry host changes,
publisher-container credential confinement with quarantine-fetch +
push read-back ladder, opt-in namespace RBAC, and two kind smokes.
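
A rough illustration of the per-role mode constraint described above (subprocess for every role, pod-exec only for executor and evaluator). This is a sketch, not the chart's actual validation: the function name, the `scripted` default, and the exact mode strings are assumptions made for the example.

```python
# Hypothetical mode names; "scripted" stands in for the existing reference workers.
KNOWN_MODES = {"scripted", "subprocess", "pod-exec"}
# Per the plan summary, pod-exec is offered for executor and evaluator only.
POD_EXEC_ROLES = {"executor", "evaluator"}


def validate_worker_modes(workers):
    """Return a list of error strings for a workers.<role>.mode mapping."""
    errors = []
    for role, settings in sorted(workers.items()):
        # Assumption: a role with no explicit mode falls back to "scripted".
        mode = settings.get("mode", "scripted")
        if mode not in KNOWN_MODES:
            errors.append(f"workers.{role}.mode: unknown mode {mode!r}")
        elif mode == "pod-exec" and role not in POD_EXEC_ROLES:
            errors.append(f"workers.{role}.mode: pod-exec is not offered for this role")
    return errors
```

Calling `validate_worker_modes({"executor": {"mode": "pod-exec"}})` returns an empty list, while requesting pod-exec for any other role yields one error string.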

13f subsumes 13b (#172): the 13b plan gets a superseded banner (the
docs/archive move rides 13f's final wave), the roadmap 13b line is
re-pointed, and a 13f roadmap line is added. #172 closes when the
pod-exec wave ships.

Plan-stage codex-review record (4 rounds to convergence) committed
under docs/plans/review/eden-phase-13f-k8s-worker-modes/.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

ealt force-pushed the plan/issue-291-k8s-worker-modes branch from 7e270b4 to 8acb404 on June 12, 2026 20:54
ealt merged commit 2294cf1 into main on Jun 12, 2026
25 checks passed