From 68265d61322d7d53910323455c01a8309e856335 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Tue, 16 Jun 2026 23:03:37 +0000
Subject: [PATCH 01/12] Initial plan

From 3aad1705f8ba4ce7631097acd9a39910da48b759 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Tue, 16 Jun 2026 23:08:20 +0000
Subject: [PATCH 02/12] Validate NPD in e2e flow

---
 hack/e2e/README.md       |  4 +--
 hack/e2e/lib/validate.sh | 65 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/hack/e2e/README.md b/hack/e2e/README.md
index c328a0cb..bbcfa15c 100644
--- a/hack/e2e/README.md
+++ b/hack/e2e/README.md
@@ -29,7 +29,7 @@ The default `all` command runs:
 1. Build the local `aks-flex-node` binary unless `--binary` or `--skip-build` is used.
 2. Deploy AKS and three VMs with Bicep.
 3. Join all three VMs.
-4. Validate node readiness and run smoke workloads.
+4. Validate node readiness, node-problem-detector status, and run smoke workloads.
 5. Unjoin all Flex Nodes and verify they are absent.
 6. Rejoin all Flex Nodes and validate again.
 7. Run local-machine-driven repave validation.
@@ -51,7 +51,7 @@ The default `all` command runs:
 | `unjoin-msi` | Unjoin only the managed-identity node. |
 | `unjoin-token` | Unjoin only the bootstrap-token node. |
 | `unjoin-kubeadm` | Unjoin only the kubeadm-style node. |
-| `validate` | Verify joined nodes and run smoke tests. |
+| `validate` | Verify joined nodes, node-problem-detector status, and run smoke tests. |
 | `validate-absent` | Verify Flex Node objects are absent after unjoin. |
 | `smoke` | Run smoke workloads only. |
 | `upgrade-drift` | Validate local-machine-driven repave to the alternate nspawn side. |
diff --git a/hack/e2e/lib/validate.sh b/hack/e2e/lib/validate.sh
index 3bf60097..c53ba2b8 100755
--- a/hack/e2e/lib/validate.sh
+++ b/hack/e2e/lib/validate.sh
@@ -5,6 +5,7 @@
 # Functions:
 # validate_node_joined - Wait for a specific node to appear in kubectl
 # validate_all_nodes - Verify MSI, token, and kubeadm nodes joined
+# validate_npd_status - Verify node-problem-detector is active
 # validate_node_absent - Wait for a node to disappear from kubectl
 # validate_all_nodes_absent - Verify all flex nodes are gone after unjoin
 # smoke_test