OCPBUGS-92817: delete MAPI MachineSets before CAPI in e2e cleanup #611
pmeida wants to merge 1 commit into
|
@pmeida: This pull request references Jira Issue OCPBUGS-92817, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug.
The bug has been updated to refer to the pull request using the external bug tracker. |
|
This fixes the test's cleanup order to use the supported deletion path (MAPI first). When MAPI is deleted first, The root cause of the deadlock is in |
|
/test e2e-aws-capi-techpreview |
|
/assign @theobarberbany |
|
/verified by e2e-aws-capi-techpreview |
|
@pmeida: This PR has been marked as verified by |
|
/lgtm nice one :) |
|
Scheduling tests matching the |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: theobarberbany |
When a test creates both a CAPI MachineSet and a MAPI MachineSet with the same name and authoritativeAPI: ClusterAPI, the sync controller manages deletion through reconcileCAPItoMAPIMachineSetDeletionNormal.

Deleting CAPI first causes the sync controller to issue deletion to MAPI and then loop, waiting for the CAPI-specific finalizer (cluster.x-k8s.io/machineset) to be removed. The sync controller's constant requeues conflict with the CAPI controller's finalizer removal patch, causing a deadlock.

Deleting MAPI first triggers reconcileCAPItoMAPIMachineSetDeletionCAPINotDeleting, which removes the sync finalizer from CAPI immediately. The CAPI MachineSet can then be deleted cleanly with only its own finalizer to manage.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
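The finalizer flow described in the commit message can be pictured with a small toy model. This is only a sketch under stated assumptions: the `machineSet` type, the `simulateMAPIFirst` function, and the `syncFinalizer` value are hypothetical stand-ins (the commit message does not name the sync finalizer), and the model covers only the MAPI-first order. It does not attempt to reproduce the requeue conflict behind the CAPI-first deadlock.

```go
package main

import "fmt"

const (
	// Finalizer name taken from the commit message.
	capiFinalizer = "cluster.x-k8s.io/machineset"
	// Hypothetical placeholder: the real sync finalizer name is not given here.
	syncFinalizer = "sync.example/placeholder"
)

// machineSet is a toy stand-in for a Kubernetes object that carries finalizers.
type machineSet struct {
	finalizers map[string]bool
	deleting   bool
}

// gone reports whether the object would have been removed:
// deletion was requested and no finalizers remain.
func (m *machineSet) gone() bool {
	return m.deleting && len(m.finalizers) == 0
}

// simulateMAPIFirst replays the MAPI-first order against the CAPI object
// and returns the final object state plus a trace of the steps taken.
func simulateMAPIFirst() (*machineSet, []string) {
	capi := &machineSet{finalizers: map[string]bool{
		capiFinalizer: true,
		syncFinalizer: true,
	}}
	var trace []string

	// Step 1: the MAPI MachineSet is deleted while CAPI is not deleting.
	// Per the commit message, the sync controller then drops its finalizer from CAPI.
	delete(capi.finalizers, syncFinalizer)
	trace = append(trace, "sync finalizer removed from CAPI")

	// Step 2: the test deletes the CAPI MachineSet.
	capi.deleting = true
	trace = append(trace, "CAPI deletion requested")

	// Step 3: only the CAPI controller's own finalizer is left for it to remove.
	delete(capi.finalizers, capiFinalizer)
	trace = append(trace, "CAPI finalizer removed")

	return capi, trace
}

func main() {
	capi, trace := simulateMAPIFirst()
	for _, step := range trace {
		fmt.Println(step)
	}
	fmt.Println("CAPI MachineSet gone:", capi.gone())
}
```

The point of the model is that, in this order, the sync finalizer is already gone by the time CAPI deletion starts, so nothing in the sync controller has to wait on the CAPI finalizer.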
|
New changes are detected. LGTM label has been removed. |
|
@pmeida: The following tests failed, say |
|
/retest-required |
|
/pipeline required |
|
Summary
Fixes a deadlock in cleanupMachineSetTestResources that causes e2e-aws-capi-techpreview to fail with a 15-minute timeout.

When a test creates both a CAPI MachineSet and a MAPI MachineSet with the same name and authoritativeAPI: ClusterAPI, deleting CAPI first causes the sync controller to loop in reconcileCAPItoMAPIMachineSetDeletionNormal: it waits for the CAPI-specific finalizer (cluster.x-k8s.io/machineset) to be removed, but its own constant requeues conflict with the CAPI controller's finalizer removal patch, deadlocking cleanup.

Deleting MAPI first instead triggers reconcileCAPItoMAPIMachineSetDeletionCAPINotDeleting, which removes the sync finalizer from CAPI immediately. The CAPI MachineSet can then be deleted cleanly with no sync interference.

Test plan
e2e-aws-capi-techpreview passes without the "Should have deleted MachineSet openshift-cluster-api/capi-ms-auth-capi-*" timeout

Fixes: https://issues.redhat.com/browse/OCPBUGS-92817