GCP-841: remove ClusterResourceSet feature gate from CAPG manager args #8795
ClusterResourceSet was promoted to GA in CAPI 1.10 and removed entirely in CAPI 1.12. OCP 4.22+ ships CAPG built against CAPI 1.12.8, causing the capi-provider pod to crash at startup with:

invalid argument "MachinePool=false,ClusterResourceSet=false" for "--feature-gates" flag: unrecognized feature gate: ClusterResourceSet

Fixes: GCP-841
Signed-off-by: Cristiano Veiga <cveiga@redhat.com>
Commit-Message-Assisted-by: Claude (via Claude Code)
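The crash described above follows from how strict feature-gate parsing behaves: a gate name the binary no longer registers is rejected outright, even when it is set to false. The following is a minimal, self-contained sketch of that behavior, not the CAPG or CAPI implementation. The function `parseFeatureGates` and the `known` map are hypothetical names introduced for illustration, and the set of known gates is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseFeatureGates is a hypothetical stand-in for the --feature-gates flag
// parsing a manager binary performs. It rejects any gate name that is not in
// the known set, regardless of the value assigned to it.
func parseFeatureGates(value string, known map[string]bool) (map[string]bool, error) {
	gates := make(map[string]bool)
	for _, pair := range strings.Split(value, ",") {
		name, raw, ok := strings.Cut(pair, "=")
		if !ok {
			return nil, fmt.Errorf("missing bool value for %s", name)
		}
		if !known[name] {
			// A removed gate fails here even if it is being disabled.
			return nil, fmt.Errorf("unrecognized feature gate: %s", name)
		}
		enabled, err := strconv.ParseBool(raw)
		if err != nil {
			return nil, fmt.Errorf("invalid value of %s=%s: %w", name, raw, err)
		}
		gates[name] = enabled
	}
	return gates, nil
}

func main() {
	// Assumption: a binary in which only MachinePool is still registered.
	known := map[string]bool{"MachinePool": true}

	for _, value := range []string{
		"MachinePool=false,ClusterResourceSet=false", // argument before this PR
		"MachinePool=false",                          // argument after this PR
	} {
		_, err := parseFeatureGates(value, known)
		fmt.Printf("%q -> err=%v\n", value, err)
	}
}
```

Under these assumptions, the first argument string fails and the second parses cleanly, which is why dropping the gate from the argument list is enough to avoid the startup error.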
|
Skipping CI for Draft Pull Request. |
|
@cristianoveiga: This pull request references GCP-841, which is a valid Jira issue. Warning: The referenced Jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "5.0.0" version, but no target version was set.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             main    #8795      +/-   ##
==========================================
- Coverage   42.09%   42.09%   -0.01%
==========================================
  Files         766      766
  Lines       95047    95043       -4
==========================================
- Hits        40012    40008       -4
  Misses      52221    52221
  Partials     2814     2814

Flags with carried forward coverage won't be shown.
|
|
/test e2e-v2-gke |
|
@cristianoveiga HyperShift is still on CAPI 1.11. Since you are removing a feature gate that still exists in that version, we need to make sure this is fine. |
Hi @clebs, The deployed CAPG binary comes from the OCP payload image, built separately from HyperShift's own vendor. My understanding is that these versions are not required to match. The OpenShift CAPG fork upgraded to CAPI 1.12.8 in openshift/cluster-api-provider-gcp@e049bbd, and the new payloads (GCP HCP minimum will be 4.23) ship that binary. ClusterResourceSet doesn't exist in any supported CAPG binary, so the fix is safe. |
|
@cristianoveiga I see, if older CAPG versions that are still on CAPI 1.11 do not have that either, it should work fine. /lgtm |
|
AI Test Failure AnalysisJob: Generated by hypershift-analyze-e2e-failure post-step using Claude claude-opus-4-6 |
|
/retest-required |
|
/retest-required |
|
/retest-required |
|
/test e2e-aks |
|
/verified later by @cristianoveiga |
|
@cristianoveiga: Only users can be targets for the DetailsIn response to this:
|
/verified bypass |
|
@cristianoveiga: The DetailsIn response to this:
|
/test e2e-aks |
|
The
Test Failure Analysis Complete
Job Information
Test Failure Analysis
Error
Summary
This failure is a CI infrastructure scheduling issue completely unrelated to the PR's code changes. The job failed before any test code ever executed. The
Root Cause
The root cause is CI build cluster resource exhaustion / scheduling constraints on
Across the 78 scheduling events recorded, the scheduler consistently could not place the pod because:
The multiarch-tuning-operator processed the pod correctly (gated it, detected
This is a transient infrastructure condition. The actual test step
The PR changes (removing the
Recommendations
Evidence
|
|
/test e2e-aks |
|
[APPROVALNOTIFIER] This PR is APPROVED. Approval requirements bypassed by manually added approval. This pull-request has been approved by: cristianoveiga |
|
@cristianoveiga: all tests passed! |
Summary
- Remove ClusterResourceSet=false from the --feature-gates arg passed to the CAPG manager
- ClusterResourceSet was promoted to GA in CAPI 1.10 and removed in CAPI 1.12 (kubernetes-sigs/cluster-api#12950)
- OCP 4.22+ ships CAPG built against CAPI 1.12.8, causing the capi-provider pod to crash at startup with: unrecognized feature gate: ClusterResourceSet
- MachinePool=false is retained; it is still valid in CAPI 1.12 (Beta, default-on)

Fixes: https://redhat.atlassian.net/browse/GCP-841

Test plan
- go test ./hypershift-operator/controllers/hostedcluster/internal/platform/gcp/
- periodic-ci-openshift-hypershift-release-4.23-periodics-e2e-v2-gke no longer fails due to capi-provider crash
- capi-provider pod starts successfully on 4.22.x and 4.23.x without a CAPG image override

🤖 Generated with Claude Code
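A check in the spirit of the test plan could assert that the manager's argument list no longer names the removed gate. The sketch below is illustrative only and does not reproduce the repository's actual test. The helpers `gateNames` and `containsGate` are hypothetical, and the argument list in `main` is an assumed example of what the CAPG manager receives.

```go
package main

import (
	"fmt"
	"strings"
)

// gateNames extracts the gate names from every "--feature-gates=A=x,B=y"
// entry in a container argument list. Hypothetical helper for illustration.
func gateNames(args []string) []string {
	const prefix = "--feature-gates="
	var names []string
	for _, arg := range args {
		if !strings.HasPrefix(arg, prefix) {
			continue
		}
		for _, pair := range strings.Split(strings.TrimPrefix(arg, prefix), ",") {
			name, _, _ := strings.Cut(pair, "=")
			if name != "" {
				names = append(names, name)
			}
		}
	}
	return names
}

// containsGate reports whether the argument list sets the given gate at all,
// whether to true or to false.
func containsGate(args []string, gate string) bool {
	for _, name := range gateNames(args) {
		if name == gate {
			return true
		}
	}
	return false
}

func main() {
	// Assumed argument list after this change; the real list may differ.
	args := []string{"--leader-elect=true", "--feature-gates=MachinePool=false"}

	fmt.Println("ClusterResourceSet present:", containsGate(args, "ClusterResourceSet"))
	fmt.Println("MachinePool present:", containsGate(args, "MachinePool"))
}
```

Checking for the gate name rather than for a specific value matters here, because the binary rejects the removed gate whether it is set to true or false.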