Add setup-aws.sh: idempotent AWS provisioning for the EKS MVP (#309) #317

Merged: ealt merged 5 commits into main from impl/issue-309-setup-aws on Jun 16, 2026

Conversation

@ealt ealt commented Jun 16, 2026

Owner

Summary

  • Adds reference/scripts/setup-aws/setup-aws.sh: idempotent, create-if-absent provisioning of everything the EDEN Helm chart needs upstream of setup-experiment-helm.sh on AWS (issue #309, "Idempotent AWS provisioning script for the EKS MVP (setup-aws.sh)"; AWS MVP milestone). Replaces the chart README's manual AWS checklist with a one-pass script; the manual steps stay as the substrate-agnostic fallback.
  • Five steps, each probe-then-create and convergent on re-run (the setup-experiment.sh / repo_init idempotency idiom):
      1. EKS verify-or-create, plus the IAM OIDC provider and the aws-ebs-csi-driver addon that PVC provisioning needs on modern EKS.
      2. ECR repo plus reference-image build/push.
      3. RDS Postgres in the cluster VPC (AWS-managed master password, re-read from Secrets Manager on re-run), or an operator-supplied --postgres-dsn.
      4. S3 bucket plus an IRSA role scoped to exactly the chart's <fullname>-task-store-server ServiceAccount.
      5. A generated 0600 Helm values file plus the exact setup-experiment-helm.sh handoff invocation.
  • No fictional defaults: every resource name + region is operator-supplied; a missing flag fails loud naming the flag. --dry-run prints every mutating command verbatim through a single gate and redacts all secret material from the preview. State reads go through probe_* functions overridable via EDEN_SETUP_AWS_MOCK, which drives the offline test-setup-aws.sh (97 checks, bash-3.2-clean).
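
For readers unfamiliar with the idiom, the probe-then-create pattern with a single dry-run gate and a mockable probe might look roughly like the sketch below. This is not the code from setup-aws.sh: the function names (run_mutating, probe_bucket_exists, ensure_bucket), the DRY_RUN variable, and the idea that EDEN_SETUP_AWS_MOCK names a fixture directory are all assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; every name here is hypothetical, not taken from setup-aws.sh.
set -u

DRY_RUN="${DRY_RUN:-0}"

# Single gate for mutating commands: print the command in dry-run mode, run it otherwise.
run_mutating() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "DRY-RUN: $*"
  else
    "$@"
  fi
}

# State read. ASSUMPTION: when EDEN_SETUP_AWS_MOCK is set it names a fixture
# directory, and a file "bucket-<name>" there means the bucket exists.
probe_bucket_exists() {
  local bucket="$1"
  if [ -n "${EDEN_SETUP_AWS_MOCK:-}" ]; then
    [ -e "$EDEN_SETUP_AWS_MOCK/bucket-$bucket" ]
    return
  fi
  aws s3api head-bucket --bucket "$bucket" >/dev/null 2>&1
}

# Probe first, create only if absent, so a re-run converges instead of failing.
ensure_bucket() {
  local bucket="$1" region="$2"
  if probe_bucket_exists "$bucket"; then
    echo "s3 bucket $bucket: present, skipping"
    return 0
  fi
  echo "s3 bucket $bucket: absent, creating"
  # Outside us-east-1 a real call also needs
  # --create-bucket-configuration LocationConstraint=<region>.
  run_mutating aws s3api create-bucket --bucket "$bucket" --region "$region"
}
```

Because only the probe consults the mock and only run_mutating touches AWS, a test harness can exercise both the create path and the skip path offline.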

Advances the AWS MVP milestone (the chart deploys on EKS as of 13a/13c/13d; this makes the upstream AWS provisioning a script instead of a manual checklist).

What this does NOT cover

Fresh-operator walkthrough

  • A fresh-operator walkthrough was performed against the changed surface (the new CLI).
  • Notes: ran --help (usage lists required vs create-only-required flags); ran with a missing required flag → fails loud naming it (setup-aws.sh: --ecr-repo is required); ran a full fresh-account --dry-run against the mock — emits the 5 steps in dependency order, the values-file preview shows secrets as <generated> and the DSN userinfo as <redacted>, and the handoff prints the exact setup-experiment-helm.sh --values invocation with --namespace/--release matching the IRSA trust subject. Passed cleanly.
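
The two behaviours noted in the walkthrough, failing loud on a missing flag and masking DSN userinfo in the preview, could be sketched as follows. The helper names require_flag and redact_dsn are hypothetical, and the sed expression assumes the credentials contain no unencoded "@" or "/"; the error text and the <redacted> marker follow the walkthrough notes above.

```bash
#!/usr/bin/env bash
# Illustrative sketch; require_flag and redact_dsn are hypothetical helper names.

# Fail loud, naming the missing flag, e.g. "setup-aws.sh: --ecr-repo is required".
require_flag() {
  local flag="$1" value="$2"
  if [ -z "$value" ]; then
    echo "setup-aws.sh: $flag is required" >&2
    return 1
  fi
}

# Replace the userinfo part (user:password) of a URL-style DSN with <redacted>.
# ASSUMPTION: the userinfo contains no literal "@" or "/" characters.
redact_dsn() {
  printf '%s\n' "$1" | sed -E 's#^([a-zA-Z][a-zA-Z0-9+.-]*://)[^/@]*@#\1<redacted>@#'
}
```

A DSN without userinfo passes through unchanged, since the pattern only matches when an "@" appears before the first "/" of the authority.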

Test plan

  • shellcheck clean on both scripts.
  • bash-3.2 compatibility — suite green under macOS /bin/bash 3.2.57 (no mapfile / declare -A).
  • bash reference/scripts/setup-aws/test-setup-aws.sh — all 97 checks pass (all-absent / all-present / partial-state / interrupted-create / foreign-policy / flag-validation fixtures).
  • --dry-run end-to-end against the mock — correct command sequence + skip paths + secret redaction.
  • Pre-push gate (rename-discipline, complexity-gate, ruff, pyright, docs-lint) — clean.
  • npx markdownlint-cli2@0.14.0 (CI-pinned) — 0 errors.
  • Synchronous codex review, 3 rounds (round 0 fix-then-ship: 1×P1 + 6×P2 all addressed; round 1 → semantic policy check; round 2 ship). Record under docs/plans/review/setup-aws/impl/.

Related issues

🤖 Generated with Claude Code

ealt and others added 5 commits June 16, 2026 12:56
Provisions everything upstream of setup-experiment-helm.sh with
create-if-absent semantics (issue #309): EKS verify-or-create (+ OIDC
provider + aws-ebs-csi-driver addon), ECR repo + reference-image
build/push, RDS Postgres in the cluster VPC (AWS-managed master
password, re-read from Secrets Manager on re-run) or an operator
--postgres-dsn, S3 bucket + IRSA role scoped to the chart's
task-store-server ServiceAccount, and a generated Helm values file +
the exact setup-experiment-helm.sh handoff invocation.

--dry-run prints every mutating command verbatim; all state reads go
through mockable probe_* functions (EDEN_SETUP_AWS_MOCK), driven by
the offline test harness test-setup-aws.sh (86 checks across
all-absent / all-present / partial / flag-validation fixtures).
Bash-3.2-clean and shellcheck-clean. Chart README + helm.md
prerequisites now point at the script, keeping the manual checklist
as the substrate-agnostic fallback.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
P1: the EBS CSI addon probe accepted any addon status — a
CREATE_FAILED/DEGRADED addon skip-converged as healthy. Now requires
ACTIVE, waits on CREATING/UPDATING (aws eks wait addon-active), and
fails loud on broken states.
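
A minimal sketch of the status handling described in this fix, assuming a hypothetical helper addon_action that maps a status string to an action. The MISSING sentinel and the function names are illustrative; the aws eks subcommands shown (describe-addon, create-addon, wait addon-active) are real CLI commands, but treating every describe-addon failure as "missing" is a simplification a real script would want to tighten.

```bash
#!/usr/bin/env bash
# Illustrative sketch; addon_action and ensure_ebs_csi_addon are hypothetical names.

# Map an EKS addon status to what the provisioning step should do.
addon_action() {
  case "$1" in
    ACTIVE)            echo skip ;;    # healthy: nothing to do
    CREATING|UPDATING) echo wait ;;    # in flight: wait for it to settle
    MISSING)           echo create ;;  # sentinel for "addon not installed"
    *)                 echo fail ;;    # CREATE_FAILED, DEGRADED, etc.
  esac
}

ensure_ebs_csi_addon() {
  local cluster="$1" status
  # SIMPLIFICATION: any describe-addon error is treated as "not installed".
  status="$(aws eks describe-addon --cluster-name "$cluster" \
    --addon-name aws-ebs-csi-driver --query 'addon.status' --output text 2>/dev/null \
    || echo MISSING)"
  case "$(addon_action "$status")" in
    skip)   echo "ebs csi addon: ACTIVE, skipping" ;;
    wait)   aws eks wait addon-active --cluster-name "$cluster" --addon-name aws-ebs-csi-driver ;;
    create) aws eks create-addon --cluster-name "$cluster" --addon-name aws-ebs-csi-driver ;;
    fail)   echo "ebs csi addon is in state $status; refusing to continue" >&2; return 1 ;;
  esac
}
```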

P2s: converge an interrupted 'eksctl create cluster' (CREATING →
wait cluster-active, not the fatal branch); atomic tmp+rename values
write so a crash can't leave a partial file that silently rotates
secrets on the next run; --dry-run preview now redacts all secret
material (<preserved>/<generated> markers; DSN userinfo masked);
an existing same-named IAM policy is content-validated against the
bucket ARN instead of silently adopted; the IRSA trust check is a
semantic JSON comparison (single statement, federated principal,
:sub AND :aud) instead of substring matching; the DB ingress probe
now requires IsEgress=false + tcp + FromPort=ToPort=5432.
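
The atomic tmp+rename write mentioned above can be sketched like this. The function name write_values_atomic and the sample YAML keys in the usage note are hypothetical; the point is only that the temp file lives in the destination directory, so the final mv is a same-filesystem rename and a crash leaves either the old file or the new one, never a partial write.

```bash
#!/usr/bin/env bash
# Illustrative sketch; write_values_atomic is a hypothetical helper name.

# Write stdin to "$1" atomically with mode 0600.
write_values_atomic() {
  local dest="$1" tmp
  # Create the temp file next to the destination so mv is a rename, not a copy.
  tmp="$(mktemp "${dest}.tmp.XXXXXX")" || return 1
  chmod 600 "$tmp"
  if cat >"$tmp" && mv -f "$tmp" "$dest"; then
    return 0
  fi
  rm -f "$tmp"
  return 1
}
```

Usage, with placeholder content: `printf 'image:\n  tag: demo\n' | write_values_atomic ./values.yaml`.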

Test harness: mocks updated for the strengthened probes; new cases
for cluster-CREATING convergence and foreign-policy fail-loud;
redaction assertions (97 checks, all passing under bash 3.2).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The existing-policy acceptance test now parses the document and
requires Allow statements granting s3:GetObject + s3:PutObject on the
bucket's objects and s3:ListBucket on the bucket (NotAction/NotResource
statements excluded), instead of a substring scan that a Deny merely
mentioning the ARN would have passed.
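
One way such a semantic check could be written is with jq, as in the sketch below. It is an assumption that jq is used at all; the function name policy_grants_bucket_access is hypothetical, and the check is deliberately narrow: it looks for exact action and resource strings, so wildcards such as s3:*, Condition blocks, and overriding Deny statements are not evaluated.

```bash
#!/usr/bin/env bash
# Illustrative sketch; requires jq. policy_grants_bucket_access is a hypothetical name.

# Succeeds only if Allow statements grant s3:GetObject and s3:PutObject on the
# bucket's objects and s3:ListBucket on the bucket itself.
policy_grants_bucket_access() {
  local policy_json="$1" bucket="$2"
  printf '%s' "$policy_json" | jq -e \
    --arg bkt "arn:aws:s3:::$bucket" \
    --arg obj "arn:aws:s3:::$bucket/*" '
    def arr: if type == "array" then . else [.] end;
    # Keep only Allow statements that use Action/Resource; statements written
    # with NotAction/NotResource lack those keys and are dropped here.
    [ .Statement | arr | .[]
      | select(.Effect == "Allow" and has("Action") and has("Resource"))
      | {a: (.Action | arr), r: (.Resource | arr)} ] as $allow
    | def grants($act; $res):
        any($allow[]; any(.a[]; . == $act) and any(.r[]; . == $res));
      grants("s3:GetObject"; $obj)
      and grants("s3:PutObject"; $obj)
      and grants("s3:ListBucket"; $bkt)
  ' >/dev/null
}
```

A Deny statement that merely mentions the bucket ARN never enters the filtered list, so it cannot satisfy the check the way a substring scan would.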

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Planless-chunk completion discipline: CHANGELOG [Unreleased] entry with
deferral issues (#314 teardown, #315 shell CI gate; ingress already
final verdict ship) under docs/plans/review/setup-aws/impl/.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@ealt ealt enabled auto-merge (squash) June 16, 2026 20:00
@ealt ealt merged commit 69c8531 into main Jun 16, 2026
25 checks passed
Development

Successfully merging this pull request may close these issues.

Idempotent AWS provisioning script for the EKS MVP (setup-aws.sh)