From 23ee7ed04cf7d59aeff37e62bdb37c85f27c49d2 Mon Sep 17 00:00:00 2001
From: Scot Wells
Date: Sun, 28 Jun 2026 16:45:31 -0500
Subject: [PATCH] feat(milo-integration): local env exercising the full IPAM<->Milo path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds a reusable local integration environment that runs the IPAM apiserver
wired to a real, in-cluster Milo control plane, so the full IPAM<->Milo path
can be exercised end-to-end: delegated authn/authz, quota enforcement, and
the entitlement -> grant -> claim chain. This is the path the standalone e2e
(--enable-quota=false, no Milo) cannot cover.

Artifacts:
- config/overlays/milo-integration/ — additive IPAM overlay (quota ON, all
  three delegation kubeconfig flags pointed at the in-cluster milo-apiserver
  via a Secret, NetworkPolicy egress to milo-system:6443). Dev/test-infra
  overlays are untouched.
- config/overlays/milo-integration/quota/ — quota.miloapis.com primitives
  applied to Milo (ResourceRegistration + ClaimCreationPolicy +
  GrantCreationPolicy) that register the IPAM quotable type and auto-create
  the per-project grant/bucket/claim. Reproduces what the
  services.miloapis.com catalog API would do — this Milo build ships only
  the raw quota primitives.
- config/overlays/milo-integration/rbac-tenant-user.yaml — Milo RBAC for the
  impersonated tenant user (IPAM delegates authz to Milo).
- Taskfile milo-integration:up (+ deploy-milo / deploy-ipam /
  provision-quota).
- docs/milo-integration.md — how to run it, the full-claim walkthrough, what
  it covers vs the standalone e2e, and known issues.

Verified end-to-end: an impersonated project-scoped IPClaim binds
synchronously (status.allocatedCIDR set) AND quota is enforced — a
ResourceClaim is GRANTED (reason QuotaAvailable) and the project
AllowanceBucket decrements (alloc 1/100).

No Go changes; internal/allocation/ stays zero-dep and the forbidden-import
rules are intact. No CI workflow changes (follow-up).

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 Taskfile.yaml                                 |  66 ++++++
 .../milo-integration/kustomization.yaml       |  65 ++++++
 .../milo-kubeconfig-secret.yaml               |  55 +++++
 .../patches/apiservice-patch.yaml             |   7 +
 .../patches/deployment-patch.yaml             |  48 ++++
 .../milo-egress-networkpolicy-patch.yaml      |  22 ++
 .../patches/volumes-patch.yaml                |  29 +++
 .../quota/claim-creation-policy.yaml          |  44 ++++
 .../quota/grant-creation-policy.yaml          |  54 +++++
 .../milo-integration/quota/kustomization.yaml |  19 ++
 .../quota/resource-registration.yaml          |  35 +++
 .../milo-integration/rbac-tenant-user.yaml    |  46 ++++
 config/overlays/milo-integration/secret.yaml  |  13 ++
 .../milo-integration/tls-certificate.yaml     |  22 ++
 docs/milo-integration.md                      | 208 ++++++++++++++++++
 15 files changed, 733 insertions(+)
 create mode 100644 config/overlays/milo-integration/kustomization.yaml
 create mode 100644 config/overlays/milo-integration/milo-kubeconfig-secret.yaml
 create mode 100644 config/overlays/milo-integration/patches/apiservice-patch.yaml
 create mode 100644 config/overlays/milo-integration/patches/deployment-patch.yaml
 create mode 100644 config/overlays/milo-integration/patches/milo-egress-networkpolicy-patch.yaml
 create mode 100644 config/overlays/milo-integration/patches/volumes-patch.yaml
 create mode 100644 config/overlays/milo-integration/quota/claim-creation-policy.yaml
 create mode 100644 config/overlays/milo-integration/quota/grant-creation-policy.yaml
 create mode 100644 config/overlays/milo-integration/quota/kustomization.yaml
 create mode 100644 config/overlays/milo-integration/quota/resource-registration.yaml
 create mode 100644 config/overlays/milo-integration/rbac-tenant-user.yaml
 create mode 100644 config/overlays/milo-integration/secret.yaml
 create mode 100644 config/overlays/milo-integration/tls-certificate.yaml
 create mode 100644 docs/milo-integration.md

diff --git a/Taskfile.yaml b/Taskfile.yaml
index 2f66313..d812bad 100644
--- a/Taskfile.yaml
+++ b/Taskfile.yaml
@@ -6,6 +6,13 @@ vars:
   IPAM_IMAGE_TAG: "dev"
   TEST_INFRA_CLUSTER_NAME: "test-infra"
   TEST_INFRA_REPO_REF: 'v0.6.0'
+  # Local Milo control-plane checkout, used by the milo-integration:* targets to
+  # build/load/deploy Milo into the SAME test-infra kind cluster. Override with
+  # `MILO_REPO=/path/to/milo task milo-integration:up`.
+  MILO_REPO: '{{.MILO_REPO | default "../../datum-cloud/milo"}}'
+  # Static dev token from Milo's auth-tokens secret (admin / system:masters).
+  # Matches the token baked into config/overlays/milo-integration/milo-kubeconfig-secret.yaml.
+  MILO_TOKEN: 'test-admin-token'
 
 includes:
   test-infra:
@@ -146,6 +153,65 @@ tasks:
     cmds:
       - task: test-infra:install-observability
 
+  # ----- Milo integration environment -----
+  #
+  # Stands up IPAM wired to a REAL in-cluster Milo control plane so the full
+  # IPAM<->Milo path can be exercised locally (delegated authn/authz + quota
+  # enforcement + entitlement->grant->claim) — the path the standalone e2e
+  # (--enable-quota=false, no Milo) cannot cover. See docs/milo-integration.md.
+
+  milo-integration:deploy-milo:
+    desc: Build, load and deploy the Milo control plane into the test-infra cluster
+    silent: true
+    cmds:
+      - |
+        set -e
+        echo ">> Deploying Milo from {{.MILO_REPO}} into the test-infra cluster"
+        # The remote test-infra Taskfile resolves the kubeconfig relative to the
+        # milo repo; mirror ours into it so `task dev:deploy` finds the cluster.
+        mkdir -p "{{.MILO_REPO}}/.test-infra"
+        cp .test-infra/kubeconfig "{{.MILO_REPO}}/.test-infra/kubeconfig"
+        cd "{{.MILO_REPO}}" && TASK_X_REMOTE_TASKFILES=1 task dev:build dev:load dev:deploy
+
+  milo-integration:deploy-ipam:
+    desc: Deploy IPAM via the milo-integration overlay (quota ON, delegating to Milo)
+    silent: true
+    cmds:
+      - |
+        set -e
+        # Fresh-create the deployment: switching FROM another overlay's volume
+        # shape (e.g. test-infra tracing) can't be strategic-merged in place.
+        task test-infra:kubectl -- delete deployment ipam-apiserver -n ipam-system --ignore-not-found
+        task test-infra:kubectl -- apply -k config/overlays/milo-integration
+        task test-infra:kubectl -- wait --for=condition=ready pod -l app=ipam-apiserver -n ipam-system --timeout=240s || echo "apiserver pods not ready yet"
+        task test-infra:kubectl -- wait --for=condition=Available apiservice/v1alpha1.ipam.miloapis.com --timeout=180s || echo "APIService not Available yet"
+
+  milo-integration:provision-quota:
+    desc: Register the IPAM quotable type + claim/grant policies in Milo
+    silent: true
+    cmds:
+      - |
+        set -e
+        echo ">> Provisioning IPAM quota in Milo (.milo/kubeconfig)"
+        KUBECONFIG="{{.MILO_REPO}}/.milo/kubeconfig" kubectl apply -k config/overlays/milo-integration/quota
+        # RBAC in Milo for the impersonated tenant user the e2e/demo driver uses.
+        KUBECONFIG="{{.MILO_REPO}}/.milo/kubeconfig" kubectl apply -f config/overlays/milo-integration/rbac-tenant-user.yaml
+
+  milo-integration:up:
+    desc: Bring up the whole Milo integration env (Milo + IPAM overlay + quota provisioning)
+    silent: true
+    cmds:
+      - task: test-infra:cluster-up
+      - task: milo-integration:deploy-milo
+      - task: dev:build
+      - task: dev:load
+      - task: milo-integration:deploy-ipam
+      - task: milo-integration:provision-quota
+      - |
+        echo ""
+        echo "Milo integration environment is up."
+        echo "Drive a full claim with: docs/milo-integration.md (impersonation kubeconfig)"
+
   # ----- E2E -----
 
   # Generate the impersonation kubeconfig the multi-tenant / tenant-isolation
diff --git a/config/overlays/milo-integration/kustomization.yaml b/config/overlays/milo-integration/kustomization.yaml
new file mode 100644
index 0000000..01c4dbc
--- /dev/null
+++ b/config/overlays/milo-integration/kustomization.yaml
@@ -0,0 +1,65 @@
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+
+# milo-integration overlay.
+#
+# Stands up the IPAM apiserver wired to a real, in-cluster Milo control plane so
+# the FULL IPAM<->milo path can be exercised locally: delegated authn/authz
+# (TokenReview / SubjectAccessReview against milo's IAM) AND quota enforcement
+# (milo's ResourceQuotaEnforcement admission plugin + the quota grant
+# controllers). This is the path the standalone e2e (--enable-quota=false, no
+# milo) cannot cover.
+#
+# Differences from the test-infra overlay:
+#   * --enable-quota=true (test-infra is false)
+#   * all three delegation kubeconfig flags point at the milo-apiserver Service
+#     via a mounted Secret (test-infra uses the in-cluster fallback)
+#   * NetworkPolicy egress opened to milo-system:6443
+#   * NO tracing component (keeps the CPU/feature surface minimal; tracing is a
+#     test-infra concern, orthogonal to the milo path)
+#
+# Reuses test-infra's TLS Certificate (cert-manager, auto-approved) and the
+# postgres component. Prereq: milo must already be deployed in milo-system
+# (see the milo-integration:up Taskfile target / docs/milo-integration.md).
+
+namespace: ipam-system
+
+resources:
+  - ../../base
+  - secret.yaml
+  - tls-certificate.yaml
+  - milo-kubeconfig-secret.yaml
+
+components:
+  - ../../components/namespace
+  - ../../components/api-registration
+  - ../../components/postgres
+
+images:
+  - name: ghcr.io/milo-os/ipam
+    newName: ipam-apiserver
+    newTag: dev
+
+patches:
+  - path: patches/apiservice-patch.yaml
+  - path: patches/deployment-patch.yaml
+    target:
+      kind: Deployment
+      name: ipam-apiserver
+  - path: patches/volumes-patch.yaml
+    target:
+      kind: Deployment
+      name: ipam-apiserver
+  # Open egress from the apiserver to the milo-apiserver (milo-system:6443) so
+  # delegated authn/authz, the quota loopback/per-project clients, and the
+  # quota + APF informers that gate readyz can reach milo. Base is default-deny.
+  - path: patches/milo-egress-networkpolicy-patch.yaml
+    target:
+      kind: NetworkPolicy
+      name: ipam-apiserver
+
+labels:
+  - includeSelectors: false
+    includeTemplates: true
+    pairs:
+      environment: milo-integration
diff --git a/config/overlays/milo-integration/milo-kubeconfig-secret.yaml b/config/overlays/milo-integration/milo-kubeconfig-secret.yaml
new file mode 100644
index 0000000..a98068a
--- /dev/null
+++ b/config/overlays/milo-integration/milo-kubeconfig-secret.yaml
@@ -0,0 +1,55 @@
+---
+# Kubeconfig the IPAM apiserver uses to delegate authn/authz to milo and to
+# reach the milo quota backend. All three IPAM delegation flags
+# (--kubeconfig / --authentication-kubeconfig / --authorization-kubeconfig)
+# point at this single file (see patches/deployment-patch.yaml), so milo's IAM
+# resolves every TokenReview / SubjectAccessReview AND the quota admission
+# plugin's loopback config targets the same milo control plane.
+#
+#   server: the in-cluster milo-apiserver Service (NOT the localhost:30443
+#           gateway, which is only reachable from the host). 6443 is the port
+#           the milo-apiserver Service exposes (see milo config/apiserver).
+#   token:  the static admin token from milo's auth-tokens secret
+#           (test-admin-token -> admin / system:masters). This is the same
+#           credential milo-controller-manager uses to talk to its own apiserver
+#           (milo-controller-manager-kubeconfig secret), so IPAM gets the same
+#           full-power identity — appropriate for a local integration env, NOT
+#           for production (production mints a scoped ServiceAccount token).
+#   insecure-skip-tls-verify: the milo-apiserver serves a cert-manager cert whose
+#           CA IPAM does not bundle here; skipping verification keeps the local
+#           env self-contained. The IPAM<->milo hop stays inside the cluster.
+#
+# This Secret is intentionally checked in with a well-known dev token because
+# the whole overlay is a local integration harness, never deployed anywhere
+# real. Rotate / replace with a minted SA token before reusing against a shared
+# milo.
+apiVersion: v1
+kind: Secret
+metadata:
+  name: milo-kubeconfig
+  namespace: ipam-system
+  labels:
+    app.kubernetes.io/name: ipam
+    app.kubernetes.io/component: milo-integration
+    app.kubernetes.io/part-of: ipam.miloapis.com
+type: Opaque
+stringData:
+  kubeconfig: |
+    apiVersion: v1
+    kind: Config
+    clusters:
+      - name: milo
+        cluster:
+          server: https://milo-apiserver.milo-system.svc.cluster.local:6443
+          insecure-skip-tls-verify: true
+    contexts:
+      - name: milo
+        context:
+          cluster: milo
+          user: ipam
+    current-context: milo
+    preferences: {}
+    users:
+      - name: ipam
+        user:
+          token: test-admin-token
diff --git a/config/overlays/milo-integration/patches/apiservice-patch.yaml b/config/overlays/milo-integration/patches/apiservice-patch.yaml
new file mode 100644
index 0000000..7d4400b
--- /dev/null
+++ b/config/overlays/milo-integration/patches/apiservice-patch.yaml
@@ -0,0 +1,7 @@
+---
+apiVersion: apiregistration.k8s.io/v1
+kind: APIService
+metadata:
+  name: v1alpha1.ipam.miloapis.com
+spec:
+  insecureSkipTLSVerify: true
diff --git a/config/overlays/milo-integration/patches/deployment-patch.yaml b/config/overlays/milo-integration/patches/deployment-patch.yaml
new file mode 100644
index 0000000..b9cc49c
--- /dev/null
+++ b/config/overlays/milo-integration/patches/deployment-patch.yaml
@@ -0,0 +1,48 @@
+---
+# milo-integration deployment patch.
+#
+# Flips the IPAM apiserver from the standalone configuration (quota OFF,
+# in-cluster authn/authz fallback) to the FULL milo-delegated configuration:
+#
+#   * --enable-quota=true            -> milo's ResourceQuotaEnforcement
+#                                       admission plugin runs on IPClaim CREATE
+#   * --kubeconfig / --authentication-kubeconfig / --authorization-kubeconfig
+#                                    -> all point at the mounted milo
+#                                       kubeconfig file, so TokenReview /
+#                                       SubjectAccessReview resolve against
+#                                       milo's IAM and the quota plugin's
+#                                       loopback config targets milo.
+#
+# imagePullPolicy: Never because the image is `kind load`-ed, never pulled
+# (same rationale as the test-infra overlay).
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: ipam-apiserver
+spec:
+  template:
+    spec:
+      initContainers:
+        - name: migrate
+          imagePullPolicy: Never
+      containers:
+        - name: apiserver
+          imagePullPolicy: Never
+          env:
+            - name: ENABLE_QUOTA
+              value: "true"
+            # All three delegation flags read the same mounted kubeconfig file.
+            - name: KUBECONFIG
+              value: "/var/run/ipam-apiserver/milo/kubeconfig"
+            - name: AUTHENTICATION_KUBECONFIG
+              value: "/var/run/ipam-apiserver/milo/kubeconfig"
+            - name: AUTHORIZATION_KUBECONFIG
+              value: "/var/run/ipam-apiserver/milo/kubeconfig"
+            # milo's TokenReview is authoritative; don't require the local host
+            # apiserver to also know the identity.
+            - name: AUTHENTICATION_TOLERATE_LOOKUP_FAILURE
+              value: "true"
+          volumeMounts:
+            - name: milo-kubeconfig
+              mountPath: /var/run/ipam-apiserver/milo
+              readOnly: true
diff --git a/config/overlays/milo-integration/patches/milo-egress-networkpolicy-patch.yaml b/config/overlays/milo-integration/patches/milo-egress-networkpolicy-patch.yaml
new file mode 100644
index 0000000..f9bb1be
--- /dev/null
+++ b/config/overlays/milo-integration/patches/milo-egress-networkpolicy-patch.yaml
@@ -0,0 +1,22 @@
+# Allow the IPAM apiserver to reach the milo-apiserver (milo-system, :6443) for
+# delegated TokenReview / SubjectAccessReview, the quota admission plugin's
+# loopback + per-project control-plane clients, and the quota/APF informers that
+# gate readyz. The base NetworkPolicy is default-deny on egress, so without this
+# rule every milo call (and thus readyz) hangs.
+#
+# namespaceSelector-only (matching kubernetes.io/metadata.name) rather than a
+# podSelector, for the same reason the base policy documents: the kustomization
+# `labels:` transformer rewrites peer matchLabels, which would otherwise force
+# milo's pods to carry IPAM labels and match nothing.
+#
+# JSON6902 append preserves the base egress rules (DNS, postgres, kube-apiserver).
+- op: add
+  path: /spec/egress/-
+  value:
+    to:
+      - namespaceSelector:
+          matchLabels:
+            kubernetes.io/metadata.name: milo-system
+    ports:
+      - port: 6443
+        protocol: TCP
diff --git a/config/overlays/milo-integration/patches/volumes-patch.yaml b/config/overlays/milo-integration/patches/volumes-patch.yaml
new file mode 100644
index 0000000..8c16c93
--- /dev/null
+++ b/config/overlays/milo-integration/patches/volumes-patch.yaml
@@ -0,0 +1,29 @@
+# JSON6902 volume rewrite for the milo-integration overlay.
+#
+# Two changes, applied as explicit array ops so the base default-deny /
+# strategic-merge ordering can't produce a duplicate "tls-certs" volume (which
+# is what happens if a CSI volume and a secret volume of the same name are
+# merged from two different patches):
+#
+#   1. Replace volumes[0] (the base cert-manager CSI TLS volume) with the
+#      cert-manager Certificate Secret (ipam-tls). The built-in cert-manager
+#      approver auto-approves Certificate-issued CertificateRequests; the CSI
+#      driver's requests are NOT approved by it, which hangs the pod in Init.
+#      Same swap the test-infra overlay performs.
+#   2. Append the milo-kubeconfig Secret volume the apiserver mounts for its
+#      delegation kubeconfig.
+#
+# volumes[0] is the TLS volume in config/base/deployment.yaml; keep this index
+# in sync if the base volume order changes.
+- op: replace
+  path: /spec/template/spec/volumes/0
+  value:
+    name: tls-certs
+    secret:
+      secretName: ipam-tls
+- op: add
+  path: /spec/template/spec/volumes/-
+  value:
+    name: milo-kubeconfig
+    secret:
+      secretName: milo-kubeconfig
diff --git a/config/overlays/milo-integration/quota/claim-creation-policy.yaml b/config/overlays/milo-integration/quota/claim-creation-policy.yaml
new file mode 100644
index 0000000..f654ee2
--- /dev/null
+++ b/config/overlays/milo-integration/quota/claim-creation-policy.yaml
@@ -0,0 +1,44 @@
+# Auto-creates a ResourceClaim whenever an IPClaim is created, so the IPAM
+# apiserver's quota admission plugin has a claim to create-and-wait-for-grant.
+#
+# The IPAM quota admission plugin (milo's ResourceQuotaEnforcement, run inside
+# the IPAM apiserver) looks up the ClaimCreationPolicy whose trigger GVK matches
+# the resource being created (IPClaim), renders this template, creates the
+# ResourceClaim in the project control-plane, and blocks the IPClaim CREATE
+# until that ResourceClaim is Granted (or denies on insufficient quota). The
+# consumerRef is auto-filled by the plugin from the project request context.
+apiVersion: quota.miloapis.com/v1alpha1
+kind: ClaimCreationPolicy
+metadata:
+  name: project-ipclaim-claim-policy
+  labels:
+    app.kubernetes.io/name: ipam
+    app.kubernetes.io/component: milo-integration
+spec:
+  disabled: false
+  trigger:
+    resource:
+      apiVersion: ipam.miloapis.com/v1alpha1
+      kind: IPClaim
+    # No constraints - trigger for every IPClaim.
+  target:
+    resourceClaimTemplate:
+      metadata:
+        # Use requestInfo.name (the admission attrs name), NOT
+        # trigger.metadata.name: the IPClaim object the IPAM aggregated apiserver
+        # hands the quota plugin is converted from an internal Go type and does
+        # not expose a usable `metadata` key to the CEL template engine, so
+        # `trigger.metadata.name` fails to render ("no such key: metadata").
+        # requestInfo.name resolves to the same IPClaim name and is always set.
+        name: "ipclaim-{{requestInfo.name}}"
+        namespace: "{{requestInfo.namespace}}"
+        labels:
+          app.kubernetes.io/name: ipam
+          app.kubernetes.io/component: milo-integration
+        annotations:
+          kubernetes.io/description: "Automatic quota claim for IPClaim creation"
+      spec:
+        # consumerRef auto-filled by the admission plugin from project context.
+        requests:
+          - resourceType: ipam.miloapis.com/ipclaims
+            amount: 1
diff --git a/config/overlays/milo-integration/quota/grant-creation-policy.yaml b/config/overlays/milo-integration/quota/grant-creation-policy.yaml
new file mode 100644
index 0000000..a8ec5f2
--- /dev/null
+++ b/config/overlays/milo-integration/quota/grant-creation-policy.yaml
@@ -0,0 +1,54 @@
+# Auto-grants every Project an IPClaim allowance when the Project is created.
+#
+# The GrantCreation controller (in milo-controller-manager) watches Projects;
+# when one is created it renders this template into a ResourceGrant placed in
+# that project's control-plane. The grant becoming Active pre-creates an
+# AllowanceBucket (limit = bucket amount) the IPClaim ResourceClaims draw from.
+#
+# Without this, a project has no allowance for ipam.miloapis.com/ipclaims and
+# every IPClaim CREATE is denied — exactly the staging blocker (ResourceClaims
+# never granted because the project had no allowance/grant for the metric).
+apiVersion: quota.miloapis.com/v1alpha1
+kind: GrantCreationPolicy
+metadata:
+  name: project-ipclaim-grant-policy
+  labels:
+    app.kubernetes.io/name: ipam
+    app.kubernetes.io/component: milo-integration
+spec:
+  disabled: false
+  trigger:
+    resource:
+      apiVersion: resourcemanager.miloapis.com/v1alpha1
+      kind: Project
+    # No constraints - grant to every project.
+  target:
+    # Route the grant (and the AllowanceBucket it pre-creates) INTO the project's
+    # own control-plane, not the core cluster. The IPAM quota admission plugin
+    # looks up the ResourceClaim/AllowanceBucket in the per-project control-plane
+    # when it enforces an IPClaim CREATE, so the grant must live there. Omitting
+    # parentContext writes the grant to the core cluster, where the project's
+    # claims never see it (the staging failure mode).
+    parentContext:
+      apiGroup: resourcemanager.miloapis.com
+      kind: Project
+      nameExpression: "trigger.metadata.name"
+    resourceGrantTemplate:
+      metadata:
+        name: "ipam-grant-{{.trigger.metadata.name}}"
+        namespace: "milo-system"
+        labels:
+          app.kubernetes.io/name: ipam
+          app.kubernetes.io/component: milo-integration
+          quota.miloapis.com/auto-created: "true"
+        annotations:
+          quota.miloapis.com/description: "Auto IPClaim quota for project {{.trigger.metadata.name}}"
+      spec:
+        consumerRef:
+          apiGroup: resourcemanager.miloapis.com
+          kind: Project
+          name: "{{.trigger.metadata.name}}"
+        allowances:
+          - resourceType: ipam.miloapis.com/ipclaims
+            buckets:
+              - amount: 100
diff --git a/config/overlays/milo-integration/quota/kustomization.yaml b/config/overlays/milo-integration/quota/kustomization.yaml
new file mode 100644
index 0000000..b157815
--- /dev/null
+++ b/config/overlays/milo-integration/quota/kustomization.yaml
@@ -0,0 +1,19 @@
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+
+# IPAM quota provisioning, applied to the MILO control plane (NOT the IPAM
+# apiserver cluster). These are quota.miloapis.com primitives that register the
+# IPAM quotable resource type and wire the auto-claim / auto-grant policies so a
+# project gets an IPClaim allowance and IPClaim creates are quota-enforced.
+#
+# Apply with the milo kubeconfig:
+#   KUBECONFIG=/.milo/kubeconfig kubectl apply -k config/overlays/milo-integration/quota
+# (the milo-integration:up Taskfile target does this for you).
+#
+# NOTE: the policies reference the resourceType that ResourceRegistration
+# defines, but strict ordering is not required: kubectl apply -k applies all at
+# once and the milo controllers reconcile to a consistent state either way.
+resources:
+  - resource-registration.yaml
+  - grant-creation-policy.yaml
+  - claim-creation-policy.yaml
diff --git a/config/overlays/milo-integration/quota/resource-registration.yaml b/config/overlays/milo-integration/quota/resource-registration.yaml
new file mode 100644
index 0000000..3c06076
--- /dev/null
+++ b/config/overlays/milo-integration/quota/resource-registration.yaml
@@ -0,0 +1,35 @@
+# Registers the quotable IPAM resource type with milo's quota system.
+#
+# resourceType is the internal metric key the IPClaim ResourceClaims request
+# against (see claim-creation-policy.yaml) and that grants/buckets allocate from
+# (see grant-creation-policy.yaml). It mirrors the metric the service-catalog
+# ServiceConfiguration would emit (ipam.miloapis.com/ipclaims) — but THIS milo
+# build ships the raw quota.miloapis.com primitives, not the higher-level
+# services.miloapis.com catalog API, so we register directly.
+#
+# consumerType Project: quota is held and enforced per resourcemanager Project,
+# matching how the IPAM apiserver scopes IPClaim requests (project context ->
+# per-project control-plane).
+apiVersion: quota.miloapis.com/v1alpha1
+kind: ResourceRegistration
+metadata:
+  name: ipclaims-per-project
+  labels:
+    app.kubernetes.io/name: ipam
+    app.kubernetes.io/component: milo-integration
+  annotations:
+    kubernetes.io/display-name: "IP Claims"
+    kubernetes.io/description: "Maximum number of IPClaim objects per project"
+spec:
+  consumerType:
+    apiGroup: resourcemanager.miloapis.com
+    kind: Project
+  type: Entity
+  resourceType: ipam.miloapis.com/ipclaims
+  description: "Maximum number of IPClaim objects per project"
+  baseUnit: ipclaim
+  displayUnit: ipclaims
+  unitConversionFactor: 1
+  claimingResources:
+    - apiGroup: ipam.miloapis.com
+      kind: IPClaim
diff --git a/config/overlays/milo-integration/rbac-tenant-user.yaml b/config/overlays/milo-integration/rbac-tenant-user.yaml
new file mode 100644
index 0000000..985380f
--- /dev/null
+++ b/config/overlays/milo-integration/rbac-tenant-user.yaml
@@ -0,0 +1,46 @@
+# RBAC applied to the MILO control plane (NOT the IPAM cluster) authorizing the
+# impersonated tenant user that the demo / e2e driver uses to create
+# project-scoped IPPools and IPClaims.
+#
+# Why in Milo: with the milo-integration overlay, IPAM delegates authorization
+# to milo (--authorization-kubeconfig -> milo-apiserver), so every IPAM
+# SubjectAccessReview is answered by milo's RBAC, not the local kube-apiserver.
+# The impersonation itself (--as / --as-user-extra) is authorized by the LOCAL
+# kube-apiserver where the IPAM APIService is aggregated; the resulting user
+# (e2e-tenant-tester) must then be granted IPAM verbs HERE in milo.
+#
+# e2e-tenant-tester matches the impersonation identity in
+# test/e2e/.tenant-impersonation-*.kubeconfig and docs/milo-integration.md.
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: ipam-tenant-user
+  labels:
+    app.kubernetes.io/name: ipam
+    app.kubernetes.io/component: milo-integration
+rules:
+  - apiGroups: ["ipam.miloapis.com"]
+    resources:
+      - ippools
+      - ippools/status
+      - ipclaims
+      - ipclaims/status
+      - ipallocations
+      - ipallocations/status
+    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: ipam-tenant-user-binding
+  labels:
+    app.kubernetes.io/name: ipam
+    app.kubernetes.io/component: milo-integration
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: ipam-tenant-user
+subjects:
+  - kind: User
+    apiGroup: rbac.authorization.k8s.io
+    name: e2e-tenant-tester
diff --git a/config/overlays/milo-integration/secret.yaml b/config/overlays/milo-integration/secret.yaml
new file mode 100644
index 0000000..86eabc2
--- /dev/null
+++ b/config/overlays/milo-integration/secret.yaml
@@ -0,0 +1,13 @@
+apiVersion: v1
+kind: Secret
+metadata:
+  name: postgres-credentials
+  namespace: ipam-system
+  labels:
+    app.kubernetes.io/name: postgres
+    app.kubernetes.io/component: database
+    app.kubernetes.io/part-of: ipam.miloapis.com
+type: Opaque
+stringData:
+  dsn: "postgres://ipam:devpassword@postgres-postgresql.ipam-system.svc.cluster.local:5432/ipam?sslmode=disable"
+  password: "devpassword"
diff --git a/config/overlays/milo-integration/tls-certificate.yaml b/config/overlays/milo-integration/tls-certificate.yaml
new file mode 100644
index 0000000..8ee6234
--- /dev/null
+++ b/config/overlays/milo-integration/tls-certificate.yaml
@@ -0,0 +1,22 @@
+---
+# cert-manager Certificate (not CSI driver) so the built-in cert-manager
+# approver auto-approves the CertificateRequest and writes the TLS secret.
+# The CSI driver's CertificateRequests are not approved by the built-in
+# approver, causing pods to hang in Init:0/1.
+apiVersion: cert-manager.io/v1
+kind: Certificate
+metadata:
+  name: ipam-tls
+  namespace: ipam-system
+spec:
+  secretName: ipam-tls
+  duration: 24h
+  renewBefore: 1h
+  dnsNames:
+    - ipam-apiserver
+    - ipam-apiserver.ipam-system
+    - ipam-apiserver.ipam-system.svc
+    - ipam-apiserver.ipam-system.svc.cluster.local
+  issuerRef:
+    name: selfsigned-cluster-issuer
+    kind: ClusterIssuer
diff --git a/docs/milo-integration.md b/docs/milo-integration.md
new file mode 100644
index 0000000..88bd420
--- /dev/null
+++ b/docs/milo-integration.md
@@ -0,0 +1,208 @@
+# Milo integration environment
+
+A local environment that runs the IPAM apiserver wired to a **real, in-cluster
+Milo control plane**, so the full IPAM↔Milo path can be exercised end-to-end:
+
+- **Delegated authn/authz** — IPAM resolves TokenReview / SubjectAccessReview
+  against Milo's IAM (not the local kube-apiserver).
+- **Quota enforcement** — Milo's `ResourceQuotaEnforcement` admission plugin runs
+  inside the IPAM apiserver on `IPClaim` CREATE.
+- **Entitlement → grant → claim** — a project gets an allowance, an `IPClaim`
+  CREATE auto-creates a `ResourceClaim`, the claim is **granted** against an
+  `AllowanceBucket`, and only then does the IPClaim bind.
+
+This is the path the **standalone e2e cannot cover**: that suite runs IPAM with
+`--enable-quota=false` and no Milo, so it tests allocation math and tenant
+key-prefixing but never the quota admission plugin, the per-project control-plane
+routing, or delegated authz.
+
+## What it covers vs the standalone e2e
+
+| Capability | Standalone e2e (`test/e2e`) | Milo integration env |
+|---|---|---|
+| CIDR/ASN allocation math | ✅ | ✅ |
+| Tenant key-prefix isolation | ✅ (impersonation extras) | ✅ |
+| `--enable-quota` admission plugin | ❌ (off) | ✅ (on) |
+| Delegated authn/authz to Milo IAM | ❌ (local kube fallback) | ✅ |
+| `readyz` with quota + APF informers syncing from Milo | ❌ | ✅ |
+| ResourceClaim auto-creation + grant | ❌ | ✅ |
+| AllowanceBucket decrement on claim | ❌ | ✅ |
+
+## Prerequisites
+
+- A running test-infra kind cluster (`task test-infra:cluster-up`).
+- A local Milo checkout. Default path `../../datum-cloud/milo`; override with
+  `MILO_REPO=/path/to/milo`.
+- `docker`, `kind`, `kubectl`, `kustomize`, `task` (with `TASK_X_REMOTE_TASKFILES=1`).
+
+> **Resource note.** Milo (etcd + apiserver + controller-manager + argo-events)
+> needs ~2 CPU on top of IPAM. On a 4-CPU kind node you may need to scale down
+> the optional `telemetry-system` observability stack first:
+> `kubectl scale --replicas=0 -n telemetry-system deploy,statefulset --all`.
+
+## Run it
+
+```bash
+# One shot: cluster + Milo + IPAM (quota ON) + quota provisioning.
+MILO_REPO=../../datum-cloud/milo task milo-integration:up
+```
+
+Or step by step:
+
+```bash
+task milo-integration:deploy-milo       # build/load/deploy Milo into test-infra
+task dev:build dev:load                 # build/load the IPAM image
+task milo-integration:deploy-ipam       # deploy IPAM via the milo-integration overlay
+task milo-integration:provision-quota   # register the IPAM quotable type + policies in Milo
+```
+
+`milo-integration:deploy-ipam` deletes and recreates the IPAM Deployment because
+switching from another overlay's volume shape (e.g. test-infra's tracing volume)
+cannot be strategic-merged in place.
+
+### Verify IPAM is ready with quota ON
+
+```bash
+kubectl get pods -n ipam-system
+kubectl get apiservice v1alpha1.ipam.miloapis.com \
+  -o jsonpath='{.status.conditions[?(@.type=="Available")].status}'   # True
+```
+
+The IPAM apiserver log should show the quota loopback config injected, the
+ClaimCreationPolicies synced from Milo, and **FlowSchema + PriorityLevelConfiguration
+caches populated** — the APF readyz dependency that historically blocked staging.
+
+## Drive a full claim (end-to-end proof)
+
+Project scope reaches IPAM via the `iam.miloapis.com/parent-*` impersonation
+userextras (the same mechanism Milo's front gate uses). Create an Organization
+and a Project in Milo, then create a project-scoped IPPool + IPClaim through the
+local kube-apiserver (where the IPAM APIService is aggregated).
+
+```bash
+MILO_KUBECONFIG=../../datum-cloud/milo/.milo/kubeconfig   # admin / test-admin-token
+
+# 1. Organization (core control-plane)
+KUBECONFIG=$MILO_KUBECONFIG kubectl apply -f - <<'EOF'
+apiVersion: resourcemanager.miloapis.com/v1alpha1
+kind: Organization
+metadata: { name: ipam-int-org }
+spec: { type: Standard }
+EOF
+
+# 2. Project (org control-plane sub-path)
+cat > /tmp/org.kubeconfig < {"allocatedCIDR":"10.200.0.0/24","boundAllocationRef":{...},"phase":"Bound"}
+
+# A ResourceClaim was GRANTED in the project control-plane (not bypassed)
+cat > /tmp/proj.kubeconfig < limit=100 alloc=1 avail=99
+```
+
+## How it's wired
+
+- **`config/overlays/milo-integration/`** — IPAM overlay. Flips `--enable-quota=true`,
+  points all three delegation kubeconfig flags at the in-cluster `milo-apiserver`
+  Service via the `milo-kubeconfig` Secret, and opens NetworkPolicy egress to
+  `milo-system:6443`. No tracing component (keeps the footprint minimal).
+- **`config/overlays/milo-integration/quota/`** — applied to **Milo**, not the IPAM
+  cluster. Registers the quotable type and the auto-claim / auto-grant policies:
+  - `ResourceRegistration` `ipclaims-per-project` (`resourceType: ipam.miloapis.com/ipclaims`)
+  - `ClaimCreationPolicy` triggered by `IPClaim` → creates a `ResourceClaim`
+  - `GrantCreationPolicy` triggered by `Project` → creates a `ResourceGrant` **in
+    the project control-plane** (via `target.parentContext`), pre-creating an
+    `AllowanceBucket`.
+- **`config/overlays/milo-integration/rbac-tenant-user.yaml`** — applied to **Milo**.
+  Grants the impersonated `e2e-tenant-tester` IPAM verbs (IPAM delegates authz to
+  Milo, so the grant must live in Milo).
+
+## Why this Milo build differs from staging
+
+In this Milo, the quota grant controllers (`resource-registration`,
+`resource-grant`, `resource-claim`, `allowance-bucket`, `claim-creation-policy`,
+`grant-creation-policy`, `grant-creation-controller`, …) run **inside the single
+`milo-controller-manager`** via its multicluster manager — there is **no separate
+`services-controller-manager`** as in staging. `task dev:deploy` therefore brings
+up the entire grant pipeline on its own.
+
+Note also: this Milo build ships only the raw `quota.miloapis.com` primitives,
+**not** the higher-level `services.miloapis.com` catalog API. The pre-existing
+`config/components/service-catalog/` component (Service / ServiceConfiguration)
+cannot be applied here — its function is reproduced directly with the
+`quota.miloapis.com` ResourceRegistration + policies in
+`config/overlays/milo-integration/quota/`.
+
+## Known issues observed in this env
+
+- **`ClaimCreationPolicy` name template must use `requestInfo.name`, not
+  `trigger.metadata.name`.** The IPClaim object the IPAM aggregated apiserver
+  hands Milo's quota plugin is converted from an internal Go type and does not
+  expose a usable `metadata` key to the CEL template engine, so
+  `trigger.metadata.name` fails to render. `requestInfo.name` is the same value
+  and always set. (Related: IPAM logs `[SHOULD NOT HAPPEN] failed to update
+  managedFields ... no type found matching IPClaimSpec`, a structured-merge-diff
+  schema-registration gap on the IPAM side.)
+- **Intermittent `fatal error: concurrent map writes` on the claim path.** The
+  IPAM apiserver can crash once under the quota admission + watch-manager path —
+  the same unsynchronised-map / heap-instability failure mode documented around
+  the `MaxConns=10` cap in `cmd/ipam/serve.go`. The pod restarts and the retried
+  claim succeeds. This is an IPAM runtime bug, not a config issue.
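
Reviewer note, not part of the patch: the last known issue says the pod restarts and the retried claim succeeds. A driver script could absorb that one-off crash with a small retry wrapper. The sketch below is only an illustration under that assumption; the `retry` function and the `ipclaim.yaml` file name in the usage line are hypothetical and do not exist in this change.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this patch): rerun a command until it
# succeeds or the attempt budget is used up.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  # Run the command in the current shell; loop only while it keeps failing.
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "retry: giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "retry: attempt ${attempt} failed, sleeping ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

With the impersonation kubeconfig exported, a call such as `retry 5 3 kubectl apply -f ipclaim.yaml` would try the claim up to five times, three seconds apart. This only papers over the crash for local runs; it does not address the underlying runtime bug.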