OCPBUGS-84303: fix(api): add IPv6 OVN join subnet config to prevent dual-stack routing collision #8421

Open
orenc1 wants to merge 1 commit into openshift:main from orenc1:fix_dual_stack__ovn_joinsubnet_collision

@orenc1 orenc1 commented May 5, 2026

When a KubeVirt hosted cluster and its management cluster both use OVN-Kubernetes with dual-stack networking, they each default to fd98::/64 for the IPv6 join switch subnet. External IPv6 LoadBalancer traffic targeting VM pods is SNAT'd to the management cluster's join IP (e.g. fd98::2). Inside the VM, the guest cluster's OVN intercepts the response because it also owns fd98::/64, black-holing the packet.
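
The collision described above can be illustrated with a small Go sketch built on the standard net/netip package. The prefixes and the SNAT address are taken from the description; the helper name ownsAddr is hypothetical and only models "this cluster treats the address as local", not OVN's actual lookup.

```go
package main

import (
	"fmt"
	"net/netip"
)

// ownsAddr reports whether addr falls inside the given join subnet.
// Hypothetical helper: it stands in for "the guest OVN claims this address".
func ownsAddr(joinSubnet, addr string) (bool, error) {
	p, err := netip.ParsePrefix(joinSubnet)
	if err != nil {
		return false, err
	}
	a, err := netip.ParseAddr(addr)
	if err != nil {
		return false, err
	}
	return p.Contains(a), nil
}

func main() {
	// Source address after SNAT by the management cluster (example from the description).
	const snatIP = "fd98::2"

	// Compare the colliding guest default with the proposed KubeVirt default.
	for _, guestJoin := range []string{"fd98::/64", "fd99::/64"} {
		owned, err := ownsAddr(guestJoin, snatIP)
		if err != nil {
			panic(err)
		}
		fmt.Printf("guest join subnet %s claims %s: %v\n", guestJoin, snatIP, owned)
	}
}
```

With fd98::/64 on both sides the guest claims the reply's destination; with fd99::/64 it does not, so the reply can leave the VM.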

This commit fixes the issue in two ways:

  1. Automatic KubeVirt default: for KubeVirt hosted clusters with OVNKubernetes, the reconciler now sets IPv6.InternalJoinSubnet to fd99::/64 by default, avoiding the collision with the management cluster's fd98::/64. This mirrors the existing V4InternalSubnet override (100.66.0.0/16) already in place for IPv4.

  2. User-facing API: adds OVNIPv6Config type to OVNKubernetesConfig, allowing explicit configuration of IPv6 internalJoinSubnet and internalTransitSwitchSubnet for any platform. This maps to the upstream operatorv1.IPv6OVNKubernetesConfig and includes IPv6 CIDR format validation via CEL rules.
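
A minimal sketch of how the two behaviours listed above might combine, assuming user-provided values take precedence over the KubeVirt default. The struct, constants, and function name are simplified stand-ins invented for illustration; they are not the HyperShift or operatorv1 types.

```go
package main

import "fmt"

// ovnIPv6Config is a hypothetical, simplified stand-in for the new API type.
type ovnIPv6Config struct {
	InternalJoinSubnet          string
	InternalTransitSwitchSubnet string
}

const (
	platformKubeVirt      = "KubeVirt"
	networkOVNKubernetes  = "OVNKubernetes"
	kubevirtDefaultV6Join = "fd99::/64" // value taken from the PR description
)

// effectiveIPv6 returns the IPv6 OVN settings to apply: user values win, and
// KubeVirt with OVNKubernetes gets the non-colliding join subnet when unset.
func effectiveIPv6(platform, networkType string, user *ovnIPv6Config) *ovnIPv6Config {
	out := &ovnIPv6Config{}
	if user != nil {
		*out = *user
	}
	if platform == platformKubeVirt && networkType == networkOVNKubernetes && out.InternalJoinSubnet == "" {
		out.InternalJoinSubnet = kubevirtDefaultV6Join
	}
	if *out == (ovnIPv6Config{}) {
		return nil // nothing to configure; the platform default stays in place
	}
	return out
}

func main() {
	// No user override on KubeVirt: the default join subnet is applied.
	fmt.Println(effectiveIPv6(platformKubeVirt, networkOVNKubernetes, nil).InternalJoinSubnet)

	// An explicit user value takes precedence over the default.
	user := &ovnIPv6Config{InternalJoinSubnet: "fd97::/64"}
	fmt.Println(effectiveIPv6(platformKubeVirt, networkOVNKubernetes, user).InternalJoinSubnet)
}
```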

Also extends CIDR overlap validation in the HostedCluster webhook to cover IPv6 OVN subnets, and adds envtest CRD validation cases.

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes https://redhat.atlassian.net/browse/OCPBUGS-84303

Special notes for your reviewer:

Checklist:

  • Subject and description added to both the commit and the PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Summary by CodeRabbit

  • New Features
    • OVN-Kubernetes now exposes IPv6 internal transit and join subnet settings and will default a KubeVirt join subnet when the cluster has IPv6 network(s); operator-provided IPv6 values can override the default.
  • Bug Fixes / Validation
    • Validation added to prevent identical IPv6 join/transit subnets, enforce IPv6 CIDR/prefix format and allowed prefix range, and include OVN IPv6 (and KubeVirt defaults) in overlap checks.
  • Tests
    • Expanded coverage for IPv6-only and mixed IPv4+IPv6 scenarios, defaulting, overrides, and overlap failures.
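
The validation items in this summary could be approximated in Go as below. This is only a sketch: the PR states the checks are implemented as CEL rules on the CRD, and the prefix-length bounds are passed in as parameters because the summary does not state the allowed range. All function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
)

// validateIPv6Subnet checks that s is an IPv6 CIDR whose prefix length lies in
// [minBits, maxBits]. The bounds are caller-supplied assumptions, not the PR's values.
func validateIPv6Subnet(s string, minBits, maxBits int) (netip.Prefix, error) {
	p, err := netip.ParsePrefix(s)
	if err != nil {
		return netip.Prefix{}, fmt.Errorf("%q is not a valid CIDR: %w", s, err)
	}
	if !p.Addr().Is6() {
		return netip.Prefix{}, fmt.Errorf("%q is not an IPv6 CIDR", s)
	}
	if p.Bits() < minBits || p.Bits() > maxBits {
		return netip.Prefix{}, fmt.Errorf("%q prefix length must be between /%d and /%d", s, minBits, maxBits)
	}
	return p.Masked(), nil // normalise so equal subnets compare equal
}

// validateIPv6Config applies the per-field checks and rejects identical subnets.
func validateIPv6Config(join, transit string, minBits, maxBits int) error {
	var parsed []netip.Prefix
	for _, s := range []string{join, transit} {
		if s == "" {
			continue // unset fields fall back to defaults
		}
		p, err := validateIPv6Subnet(s, minBits, maxBits)
		if err != nil {
			return err
		}
		parsed = append(parsed, p)
	}
	if len(parsed) == 2 && parsed[0] == parsed[1] {
		return errors.New("internalJoinSubnet and internalTransitSwitchSubnet must differ")
	}
	return nil
}

func main() {
	// The 0..125 bounds below are placeholders for this sketch only.
	fmt.Println(validateIPv6Config("fd99::/64", "fd97::/64", 0, 125))
	fmt.Println(validateIPv6Config("fd99::/64", "fd99::/64", 0, 125))
	fmt.Println(validateIPv6Config("100.66.0.0/16", "", 0, 125))
}
```

Comparing masked prefixes rather than raw strings means two spellings of the same subnet are still treated as identical.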

@openshift-merge-bot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will use /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot added labels on May 5, 2026:
  • jira/severity-important: Referenced Jira bug's severity is important for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • jira/invalid-bug: Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.

@openshift-ci-robot

@orenc1: This pull request references Jira Issue OCPBUGS-84303, which is invalid:

  • expected the bug to target the "5.0.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.


coderabbitai Bot commented May 5, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds IPv6 support for OVN-Kubernetes: a new constant KubevirtDefaultV6InternalJoinSubnet, an exported OVNIPv6Config type with IPv6 CIDR and prefix validations plus an XValidation preventing equal IPv6 subnets, and an ipv6 field on OVNKubernetesConfig. Reconciler APIs gain a hasIPv6Network input; the network reconciler defaults and propagates OVN IPv6 subnets (including KubeVirt defaulting) into the operator network spec. validateSliceNetworkCIDRs now includes OVN IPv6 internal subnets for overlap checks. Tests updated to cover defaulting, propagation, and IPv6 overlap scenarios.
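
The walkthrough names a hasIPv6Network input but does not show how it is computed. One plausible approach, sketched here under the assumption that the detector simply scans the cluster's network CIDRs, is shown below; the signature is hypothetical and takes plain strings rather than the HCP networking spec.

```go
package main

import (
	"fmt"
	"net/netip"
)

// hasIPv6Network reports whether any of the given CIDRs is an IPv6 prefix.
// Hypothetical signature: a flat list of CIDR strings stands in for the
// cluster, service, and machine networks of a hosted cluster.
func hasIPv6Network(cidrs []string) bool {
	for _, c := range cidrs {
		p, err := netip.ParsePrefix(c)
		if err != nil {
			continue // malformed entries are ignored in this sketch
		}
		if p.Addr().Is6() && !p.Addr().Is4In6() {
			return true
		}
	}
	return false
}

func main() {
	singleStack := []string{"10.132.0.0/14", "172.31.0.0/16"}
	dualStack := []string{"10.132.0.0/14", "fd01::/48"}

	fmt.Println("IPv4-only cluster has IPv6:", hasIPv6Network(singleStack))
	fmt.Println("dual-stack cluster has IPv6:", hasIPv6Network(dualStack))
}
```

Gating the KubeVirt default on this result would keep IPv4-only clusters from receiving an IPv6 join subnet they do not need.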

Sequence Diagram(s)

sequenceDiagram
    participant HCP as HostedControlPlane
    participant Detector as hasIPv6Network
    participant Reconciler as ReconcileNetworkOperator
    participant OVNConfig as OVN Config
    participant DefaultNet as DefaultNetwork.OVNKubernetesConfig
    participant Validator as validateSliceNetworkCIDRs

    HCP->>Detector: provide HCP.Spec.Networking
    Detector-->>Reconciler: hasIPv6Network (true/false)
    HCP->>Reconciler: submit HostedCluster + platform + ovnConfig
    Reconciler->>Reconciler: check platform == KubeVirt && network == OVNKubernetes
    alt hasIPv6Network = true and IPv6 not set
        Reconciler->>DefaultNet: set IPv6.InternalJoinSubnet = KubevirtDefaultV6InternalJoinSubnet
    end
    OVNConfig->>Reconciler: provide ovnConfig.IPv6.* (if present)
    Reconciler->>DefaultNet: copy OVN IPv4/IPv6 fields into operator spec
    DefaultNet->>Validator: include OVN IPv6 internal subnets in overlap checks
    Validator-->>DefaultNet: validation result
    DefaultNet-->>Reconciler: apply or reject network configuration
🚥 Pre-merge checks | ✅ 10 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Test Structure And Quality | ⚠️ Warning | Assertions lack meaningful failure messages. reconcile_test.go line 610 uses g.Expect(tc.inputNetwork).To(BeEquivalentTo(tc.expectedNetwork)) without diagnostic message. | Add message parameter to Gomega assertion: g.Expect(tc.inputNetwork).To(BeEquivalentTo(tc.expectedNetwork), "reconcile network operator failed for: %s", tc.name) to help diagnose failures. |

✅ Passed checks (10 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically summarizes the main change: adding IPv6 OVN join subnet configuration to prevent dual-stack routing collisions, which is the primary objective of this changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All test names in the PR are stable and deterministic. 38 total test cases across 2 test files use static, descriptive strings with no dynamic content, variables, or format strings. |
| Microshift Test Compatibility | ✅ Passed | No new Ginkgo e2e tests were added in this PR. The changes consist of API type definitions and standard Go unit tests, which are outside the scope of this MicroShift compatibility check. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No Ginkgo e2e tests were added in this PR. The test changes consist only of standard Go unit tests using testing.T and Gomega assertions. The check is not applicable. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | Changes add IPv6 OVN network configuration and CIDR validation. No scheduling constraints, affinity rules, control-plane nodeSelectors, or topology-dependent scheduling introduced. |
| Ote Binary Stdout Contract | ✅ Passed | PR modifies only types, constants, helper functions, and tests with no stdout writes at process level. All changes maintain OTE Binary Stdout Contract compliance. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | No Ginkgo e2e tests added in this PR. Only standard Go unit tests were added with comprehensive dual-stack IPv4/IPv6 coverage and no external connectivity requirements. |


@openshift-ci (bot) requested review from bryan-cox and devguyio on May 5, 2026 12:21

@openshift-ci (bot) removed the do-not-merge/needs-area label and added these labels on May 5, 2026:
  • area/api: Indicates the PR includes changes for the API
  • area/cli: Indicates the PR includes changes for CLI
  • area/control-plane-operator: Indicates the PR includes changes for the control plane operator - in an OCP release
  • area/documentation: Indicates the PR includes changes for documentation
  • area/hypershift-operator: Indicates the PR includes changes for the hypershift operator and API - outside an OCP release

@openshift-ci-robot added jira/valid-bug (Indicates that a referenced Jira bug is valid for the branch this PR is targeting.) and removed jira/invalid-bug (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) on May 5, 2026

@openshift-ci-robot

@orenc1: This pull request references Jira Issue OCPBUGS-84303, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

@orenc1 force-pushed the fix_dual_stack__ovn_joinsubnet_collision branch from 43beb31 to b261379 on May 5, 2026 12:26
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go`:
- Around line 4338-4357: The code only appends user-provided OVNKubernetes IPv6
CIDRs to cidrEntries; change it to also include the effective KubeVirt default
join subnet (fd99::/64) when the OVNKubernetesConfig IPv6 is nil or when
IPv6.InternalJoinSubnet is empty: inside the block that checks
hc.Spec.Networking.NetworkType == hyperv1.OVNKubernetes (and related
OperatorConfiguration presence), detect when ovnIPv6Config is nil or
ovnIPv6Config.InternalJoinSubnet == "" and create a cidrEntry for fd99::/64
(using the same cidrEntry type and field.NewPath pointing at
"spec","operatorConfiguration","clusterNetworkOperator","ovnKubernetesConfig","ipv6","internalJoinSubnet")
and append it to cidrEntries so validateNetworks sees the effective default;
keep existing user-specified parsing logic for non-empty InternalJoinSubnet and
InternalTransitSwitchSubnet as-is.
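
A rough sketch of what this review comment asks for: feed the effective join subnet (the user's value, or fd99::/64 when it is unset) into the overlap check. The cidrEntry shape, the helper names, and the isKubeVirt flag are assumptions made for illustration; only the default value and the field path come from the comment itself.

```go
package main

import (
	"fmt"
	"net/netip"
)

// cidrEntry pairs a prefix with the field path it came from (hypothetical shape).
type cidrEntry struct {
	path   string
	prefix netip.Prefix
}

const (
	kubevirtDefaultV6Join = "fd99::/64"
	joinSubnetPath        = "spec.operatorConfiguration.clusterNetworkOperator.ovnKubernetesConfig.ipv6.internalJoinSubnet"
)

// mustEntry builds an entry from a literal CIDR and panics on bad input.
func mustEntry(path, cidr string) cidrEntry {
	return cidrEntry{path: path, prefix: netip.MustParsePrefix(cidr)}
}

// joinSubnetEntry returns the entry to validate: the user's value if set,
// otherwise the KubeVirt default when isKubeVirt is true.
func joinSubnetEntry(isKubeVirt bool, userJoin string) (cidrEntry, bool, error) {
	value := userJoin
	if value == "" && isKubeVirt {
		value = kubevirtDefaultV6Join
	}
	if value == "" {
		return cidrEntry{}, false, nil
	}
	p, err := netip.ParsePrefix(value)
	if err != nil {
		return cidrEntry{}, false, err
	}
	return cidrEntry{path: joinSubnetPath, prefix: p}, true, nil
}

// firstOverlap returns the first pair of entries whose prefixes overlap.
func firstOverlap(entries []cidrEntry) (a, b cidrEntry, found bool) {
	for i := range entries {
		for j := i + 1; j < len(entries); j++ {
			if entries[i].prefix.Overlaps(entries[j].prefix) {
				return entries[i], entries[j], true
			}
		}
	}
	return cidrEntry{}, cidrEntry{}, false
}

func main() {
	// A cluster network that happens to cover the implicit KubeVirt default.
	entries := []cidrEntry{mustEntry("spec.networking.clusterNetwork[0].cidr", "fd99::/48")}

	if e, ok, err := joinSubnetEntry(true, ""); err == nil && ok {
		entries = append(entries, e)
	}
	if a, b, found := firstOverlap(entries); found {
		fmt.Printf("overlap: %s (%s) and %s (%s)\n", a.path, a.prefix, b.path, b.prefix)
	}
}
```

Without the default entry, the fd99::/48 cluster network in this example would pass validation even though the reconciler would later apply a join subnet inside it.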

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 5b9e4b77-5cab-4687-becb-c627b18e3dad

📥 Commits

Reviewing files that changed from the base of the PR and between 6b39d47 and 43beb31.

⛔ Files ignored due to path filters (40)
  • api/hypershift/v1beta1/zz_generated.deepcopy.go is excluded by !**/zz_generated*.go, !**/zz_generated*
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/AAA_ungated.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDC.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/GCPPlatform.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HCPEtcdBackup.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ImageStreamImportMode.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/KMSEncryptionProvider.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/OpenStack.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/TLSAdherence.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/AAA_ungated.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDC.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/GCPPlatform.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HCPEtcdBackup.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ImageStreamImportMode.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/KMSEncryptionProvider.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/OpenStack.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/TLSAdherence.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • client/applyconfiguration/hypershift/v1beta1/ovnipv6config.go is excluded by !client/**
  • client/applyconfiguration/hypershift/v1beta1/ovnkubernetesconfig.go is excluded by !client/**
  • client/applyconfiguration/utils.go is excluded by !client/**
  • cmd/install/assets/crds/hypershift-operator/tests/hostedclusters.hypershift.openshift.io/stable.hostedclusters.networking.testsuite.yaml is excluded by !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-Default.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-Default.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • docs/content/reference/api.md is excluded by !docs/content/reference/api.md
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/operator.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/zz_generated.deepcopy.go is excluded by !vendor/**, !**/vendor/**, !**/zz_generated*.go, !**/zz_generated*
📒 Files selected for processing (5)
  • api/hypershift/v1beta1/operator.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/network/reconcile.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/network/reconcile_test.go
  • hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go
  • hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go

Comment thread hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go Outdated
@openshift-ci-robot

@orenc1: This pull request references Jira Issue OCPBUGS-84303, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

@codecov

codecov Bot commented May 5, 2026

Codecov Report

❌ Patch coverage is 76.56250% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 40.12%. Comparing base (5a61e14) to head (5c1ee1f).
⚠️ Report is 95 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...rconfigoperator/controllers/resources/resources.go | 20.00% | 9 Missing and 3 partials ⚠️ |
| ...perator/controllers/resources/network/reconcile.go | 83.33% | 2 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8421      +/-   ##
==========================================
+ Coverage   40.00%   40.12%   +0.12%     
==========================================
  Files         751      753       +2     
  Lines       92838    93050     +212     
==========================================
+ Hits        37137    37338     +201     
+ Misses      53014    53012       -2     
- Partials     2687     2700      +13     

| Files with missing lines | Coverage Δ |
|---|---|
| ...trollers/hostedcluster/hostedcluster_controller.go | 44.16% <100.00%> (+0.51%) ⬆️ |
| ...perator/controllers/resources/network/reconcile.go | 59.32% <83.33%> (+3.87%) ⬆️ |
| ...rconfigoperator/controllers/resources/resources.go | 55.17% <20.00%> (-0.20%) ⬇️ |

... and 6 files with indirect coverage changes

| Flag | Coverage Δ |
|---|---|
| cmd-support | 34.28% <ø> (+0.19%) ⬆️ |
| cpo-hostedcontrolplane | 40.57% <ø> (+0.01%) ⬆️ |
| cpo-other | 40.17% <54.54%> (+0.02%) ⬆️ |
| hypershift-operator | 50.68% <100.00%> (+0.14%) ⬆️ |
| other | 31.54% <ø> (ø) |


@orenc1 force-pushed the fix_dual_stack__ovn_joinsubnet_collision branch from b261379 to 8d5be28 on May 5, 2026 14:14
@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
control-plane-operator/hostedclusterconfigoperator/controllers/resources/network/reconcile.go (1)

93-122: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reconcile deleted IPv6 overrides, not just additions.

This path only copies non-empty source fields. If a user later removes ipv6.internalJoinSubnet, ipv6.internalTransitSwitchSubnet, or the whole ovnKubernetesConfig, the previous values stay on the operatorv1.Network object, so the cluster cannot roll back to the KubeVirt default or to the platform default behavior.
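
The difference between the flagged behaviour and the requested one can be sketched as follows. The types are simplified, hypothetical stand-ins for the HostedCluster source and the operatorv1.Network target, and KubeVirt defaulting is deliberately left out to keep the focus on clearing stale values.

```go
package main

import "fmt"

// ipv6Config and ovnTarget are hypothetical, simplified stand-ins.
type ipv6Config struct {
	InternalJoinSubnet          string
	InternalTransitSwitchSubnet string
}

type ovnTarget struct {
	IPv6 *ipv6Config
}

// copyNonEmpty mirrors the behaviour flagged in the review: it only ever
// adds or overwrites values, so removed overrides linger on the target.
func copyNonEmpty(target *ovnTarget, src *ipv6Config) {
	if src == nil {
		return
	}
	if target.IPv6 == nil {
		target.IPv6 = &ipv6Config{}
	}
	if src.InternalJoinSubnet != "" {
		target.IPv6.InternalJoinSubnet = src.InternalJoinSubnet
	}
	if src.InternalTransitSwitchSubnet != "" {
		target.IPv6.InternalTransitSwitchSubnet = src.InternalTransitSwitchSubnet
	}
}

// reconcileIPv6 overwrites target.IPv6 from src so that deletions propagate.
func reconcileIPv6(target *ovnTarget, src *ipv6Config) {
	if src == nil || *src == (ipv6Config{}) {
		target.IPv6 = nil
		return
	}
	copied := *src
	target.IPv6 = &copied
}

func main() {
	// The target still carries an override the user has since removed.
	target := ovnTarget{IPv6: &ipv6Config{InternalJoinSubnet: "fd97::/64"}}

	copyNonEmpty(&target, nil)
	fmt.Println("copy-only keeps stale value:", target.IPv6 != nil)

	reconcileIPv6(&target, nil)
	fmt.Println("full reconcile clears it:", target.IPv6 == nil)
}
```

Building the desired state from the source on every pass, rather than patching the existing object, is what lets a removed override fall back to the default.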

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/network/reconcile.go`
around lines 93 - 122, When reconciling OVNKubernetes config you must also
remove overrides when the source is deleted: update the block handling
networkType == hyperv1.OVNKubernetes so that if ovnConfig is nil you clear
network.Spec.DefaultNetwork.OVNKubernetesConfig (set to nil); if ovnConfig !=
nil but ovnConfig.IPv6 is nil then set ovnCfg.IPv6 = nil; and if ovnConfig.IPv6
exists then copy IPv6.InternalJoinSubnet and IPv6.InternalTransitSwitchSubnet
when non-empty but explicitly clear those ovnCfg.IPv6 fields (or set ovnCfg.IPv6
= nil if both are empty) when the source strings are empty—this ensures previous
values are removed; use the existing symbols ovnConfig, ovnCfg,
network.Spec.DefaultNetwork.OVNKubernetesConfig, and the IPv6.InternalJoinSubnet
/ IPv6.InternalTransitSwitchSubnet fields to locate and implement the changes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: a71e1352-933e-4833-aab8-03e61a0455f6

📥 Commits

Reviewing files that changed from the base of the PR and between b261379 and 8d5be28.

⛔ Files ignored due to path filters (41)
  • api/hypershift/v1beta1/zz_generated.deepcopy.go is excluded by !**/zz_generated*.go, !**/zz_generated*
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/AAA_ungated.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDC.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/GCPPlatform.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HCPEtcdBackup.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ImageStreamImportMode.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/KMSEncryptionProvider.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/OpenStack.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/TLSAdherence.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/AAA_ungated.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDC.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/GCPPlatform.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HCPEtcdBackup.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ImageStreamImportMode.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/KMSEncryptionProvider.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/OpenStack.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/TLSAdherence.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • client/applyconfiguration/hypershift/v1beta1/ovnipv6config.go is excluded by !client/**
  • client/applyconfiguration/hypershift/v1beta1/ovnkubernetesconfig.go is excluded by !client/**
  • client/applyconfiguration/utils.go is excluded by !client/**
  • cmd/install/assets/crds/hypershift-operator/tests/hostedclusters.hypershift.openshift.io/stable.hostedclusters.networking.testsuite.yaml is excluded by !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-Default.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-Default.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • docs/content/reference/aggregated-docs.md is excluded by !docs/content/reference/aggregated-docs.md
  • docs/content/reference/api.md is excluded by !docs/content/reference/api.md
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/operator.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/zz_generated.deepcopy.go is excluded by !vendor/**, !**/vendor/**, !**/zz_generated*.go, !**/zz_generated*
📒 Files selected for processing (5)
  • api/hypershift/v1beta1/operator.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/network/reconcile.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/network/reconcile_test.go
  • hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go
  • hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go
✅ Files skipped from review due to trivial changes (1)
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/network/reconcile_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go

@orenc1
Contributor Author

orenc1 commented May 5, 2026

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)

🤖 Prompt for all review comments with AI agents

ℹ️ Review info

This is a pre-existing pattern — the IPv4 block uses the exact same additive-only approach (only copies non-empty values, never clears). The MTU field follows the same convention as well. Changing this behavior would alter semantics for all OVN config fields (IPv4, IPv6, MTU), which is out of scope for this bug fix.

Additionally, OVN internal join/transit subnets are effectively immutable after cluster creation — changing them at runtime would break OVN networking. The additive-only reconciliation pattern is appropriate here.
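For readers unfamiliar with the additive-only convention mentioned above, here is a minimal Go sketch. The `ovnIPv6` type and `applyAdditive` function are hypothetical stand-ins for illustration, not the reconciler's actual code; the sketch only shows the idea that non-empty source values are copied and existing target values are never cleared.

```go
package main

import "fmt"

// ovnIPv6 is a hypothetical stand-in for the IPv6 OVN settings on both the
// HostedCluster side (source) and the guest Network config side (target).
type ovnIPv6 struct {
	InternalJoinSubnet          string
	InternalTransitSwitchSubnet string
}

// applyAdditive copies only non-empty source values onto the target and never
// clears a value that is already set on the target.
func applyAdditive(src *ovnIPv6, dst *ovnIPv6) {
	if src == nil {
		return // nothing requested: leave existing target values alone
	}
	if src.InternalJoinSubnet != "" {
		dst.InternalJoinSubnet = src.InternalJoinSubnet
	}
	if src.InternalTransitSwitchSubnet != "" {
		dst.InternalTransitSwitchSubnet = src.InternalTransitSwitchSubnet
	}
}

func main() {
	dst := &ovnIPv6{InternalJoinSubnet: "fd99::/64"}
	// Only the transit subnet is provided, so the join subnet is untouched.
	applyAdditive(&ovnIPv6{InternalTransitSwitchSubnet: "fd97::/64"}, dst)
	// A nil source does not clear anything either.
	applyAdditive(nil, dst)
	fmt.Printf("%+v\n", *dst)
}
```

Under this convention, deleting a value from the source leaves the previously reconciled value in place, which is the behavior the review bot flagged and the reply above defends.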

orenc1 added a commit to orenc1/hypershift that referenced this pull request May 7, 2026
…ual-stack routing collision

Cherry-pick of openshift#8421 to release-4.21.

On KubeVirt dual-stack hosted clusters, the guest OVN-Kubernetes cluster
shares the same default IPv6 join subnet (fd98::/64) as the management
cluster. When external IPv6 LoadBalancer traffic is SNAT'd to a join
switch IP, the guest cluster intercepts the response because both
clusters own the same fd98::/64 range, causing a routing black hole.

This fix:
- Defaults the guest cluster's IPv6 OVN join subnet to fd99::/64 for
  KubeVirt hosted clusters, avoiding the collision automatically
- Adds OVNIPv6Config API type allowing users to explicitly configure
  IPv6 internalJoinSubnet and internalTransitSwitchSubnet
- Extends CIDR overlap validation to cover IPv6 OVN subnets including
  the implicit KubeVirt default (fd99::/64)
- Adds unit tests for all new IPv6 validation and reconciliation logic

Signed-off-by: Oren Cohen <ocohen@redhat.com>
Assisted-by: Claude Opus 4 (via Cursor)
Co-authored-by: Cursor <cursoragent@cursor.com>
@orenc1
Contributor Author

orenc1 commented May 11, 2026

The fix is verified. With a different, non-conflicting IPv6 OVN join subnet for the hosted cluster, the LB service is accessible:

Verification: IPv6 LoadBalancer Fix (OVN Join Subnet Collision)

Environment

  • Management cluster: 3-node bare metal (cnvqe-064/065/066), dual-stack
  • Hosted cluster: dualstack-fix, 3 KubeVirt worker nodes, dual-stack (OCP built from PR payload)
  • OVN IPv6 join subnet (guest): fd99::/64 ✅ (previously collided at fd98::/64)

Setup

  1. Created a hello-openshift deployment in the hosted cluster (lb-test namespace)
  2. Exposed it as a dual-stack LoadBalancer service (RequireDualStack)
  3. KubeVirt cloud provider created a proxy service on the management cluster
  4. MetalLB assigned:
    • IPv4 VIP: 10.46.255.14
    • IPv6 VIP: fc00:f853:ccd:e799::4

Results

| Source | Target | Result |
| --- | --- | --- |
| Management cluster pod → IPv4 LB (10.46.255.14:8080) | Hosted cluster service | PASS: Hello OpenShift! |
| Management cluster pod → IPv6 LB ([fc00:f853:ccd:e799::4]:8080) | Hosted cluster service | PASS: Hello OpenShift! (3/3 runs) |
| Management cluster node → IPv6 LB | Hosted cluster service | PASS: Hello OpenShift! |

Key Confirmation

The guest cluster's OVN network operator config shows the fix is active:

{
  "ipv6": {
    "internalJoinSubnet": "fd99::/64"
  }
}

/verified by @orenc1

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label May 11, 2026
@openshift-ci-robot

@orenc1: This PR has been marked as verified by @orenc1.


@orenc1 orenc1 force-pushed the fix_dual_stack__ovn_joinsubnet_collision branch from b9610c9 to 71baf0a Compare May 13, 2026 09:34
@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8421 May 13, 2026 09:38 Inactive
@orenc1 orenc1 force-pushed the fix_dual_stack__ovn_joinsubnet_collision branch from 71baf0a to 4545f1e Compare May 13, 2026 12:23
@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8421 May 13, 2026 12:26 Inactive
orenc1 added a commit to orenc1/hypershift that referenced this pull request May 13, 2026
…ual-stack routing collision

When a KubeVirt hosted cluster and its management cluster both use
OVN-Kubernetes with dual-stack networking, they each default to
fd98::/64 for the IPv6 join switch subnet. External IPv6 LoadBalancer
traffic targeting VM pods is SNAT'd to the management cluster's join IP
(e.g. fd98::2). Inside the VM, the guest cluster's OVN intercepts the
response because it also owns fd98::/64, black-holing the packet.

This commit fixes the issue in two ways:

1. Automatic KubeVirt default: for KubeVirt hosted clusters with
   OVNKubernetes and IPv6 networks, the reconciler now sets
   IPv6.InternalJoinSubnet to fd99::/64 by default, avoiding the
   collision with the management cluster's fd98::/64. This mirrors the
   existing V4InternalSubnet override (100.66.0.0/16) already in place
   for IPv4.

2. User-facing API: adds OVNIPv6Config type to OVNKubernetesConfig,
   allowing explicit configuration of IPv6 internalJoinSubnet and
   internalTransitSwitchSubnet for any platform. This maps to the
   upstream operatorv1.IPv6OVNKubernetesConfig and includes IPv6 CIDR
   format validation via CEL isCIDR() rules.

Also extends CIDR overlap validation in the HostedCluster webhook to
cover IPv6 OVN subnets and the KubeVirt IPv4 default (100.66.0.0/16).

Backport of openshift#8421

Signed-off-by: Oren Cohen <ocohen@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
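The collision described in the commit message, and the overlap validation it mentions, both reduce to a CIDR intersection test. A minimal sketch using Go's standard `net/netip` package follows; the `overlaps` helper is a hypothetical name for illustration and is not the webhook's actual implementation.

```go
package main

import (
	"fmt"
	"net/netip"
)

// overlaps reports whether two CIDR strings share at least one address.
func overlaps(a, b string) (bool, error) {
	pa, err := netip.ParsePrefix(a)
	if err != nil {
		return false, err
	}
	pb, err := netip.ParsePrefix(b)
	if err != nil {
		return false, err
	}
	return pa.Overlaps(pb), nil
}

func main() {
	mgmtJoin := "fd98::/64" // management cluster's default IPv6 join subnet
	// Compare the old guest default (fd98::/64) and the new KubeVirt default
	// (fd99::/64) against the management cluster's join subnet.
	for _, guestJoin := range []string{"fd98::/64", "fd99::/64"} {
		clash, err := overlaps(mgmtJoin, guestJoin)
		fmt.Println(guestJoin, "overlaps", mgmtJoin, "=", clash, "err =", err)
	}
}
```

With the old default the two join subnets are identical and therefore overlap; with `fd99::/64` they are disjoint.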
Contributor

@everettraven everettraven left a comment


API structure looks good to me overall. Just a handful of comments.

Comment thread api/hypershift/v1beta1/operator.go
Comment on lines +170 to +171
// +kubebuilder:validation:XValidation:rule="isCIDR(self) && cidr(self).ip().family() == 6", message="Subnet must be in valid IPv6 CIDR format (e.g., fd97::/64)"
// +kubebuilder:validation:XValidation:rule="isCIDR(self) && cidr(self).prefixLength() <= 125", message="subnet must be in the range /0 to /125 inclusive"
Contributor

General note for any other reviewers - the CIDR CEL library was introduced in Kubernetes 1.30. From what I recall being told, we are looking for N-4 compatibility from the kube version skew, so with 4.22 being based on 1.35 that means the oldest kube version we would end up running on would be 1.31. This should be OK.
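As a rough illustration of what the two CEL rules under discussion check, here is a Go approximation built on `net/netip`. The function name `validateIPv6Subnet` is hypothetical, and the parsing semantics of `netip.ParsePrefix` are not guaranteed to match the CEL `isCIDR()` and `cidr()` functions in every edge case; treat it as a sketch of the intent only.

```go
package main

import (
	"fmt"
	"net/netip"
)

// validateIPv6Subnet approximates the two CEL rules: the value must parse as
// a CIDR, be in the IPv6 family, and have a prefix length of at most 125.
func validateIPv6Subnet(s string) error {
	p, err := netip.ParsePrefix(s)
	if err != nil {
		return fmt.Errorf("not a valid CIDR: %w", err)
	}
	if !p.Addr().Is6() {
		return fmt.Errorf("%s is not an IPv6 CIDR", s)
	}
	if p.Bits() > 125 {
		return fmt.Errorf("prefix length /%d exceeds /125", p.Bits())
	}
	return nil
}

func main() {
	for _, s := range []string{"fd97::/64", "10.0.0.0/16", "fd97::/126", "not-a-cidr"} {
		fmt.Printf("%-12s -> %v\n", s, validateIPv6Subnet(s))
	}
}
```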

Comment thread api/hypershift/v1beta1/operator.go
// +kubebuilder:validation:XValidation:rule="self.matches('^\\\\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){5}((:[0-9A-Fa-f]{1,4}){1,2}|:))|(([0-9A-Fa-f]{1,4}:){4}((:[0-9A-Fa-f]{1,4}){1,3}|:))|(([0-9A-Fa-f]{1,4}:){3}((:[0-9A-Fa-f]{1,4}){1,4}|:))|(([0-9A-Fa-f]{1,4}:){2}((:[0-9A-Fa-f]{1,4}){1,5}|:))|(([0-9A-Fa-f]{1,4}:){1}((:[0-9A-Fa-f]{1,4}){1,6}|:))|(::((:[0-9A-Fa-f]{1,4}){1,7}|:)))\\\\s*/([0-9]|[1-9][0-9]|1[0-1][0-9]|12[0-8])$')", message="Subnet must be in valid IPv6 CIDR format (e.g., fd98::/64)"
// +kubebuilder:validation:XValidation:rule="self.matches('^.*/[0-9]+$') && int(self.split('/')[1]) <= 125", message="subnet must be in the range /0 to /125 inclusive"
// +optional
InternalJoinSubnet string `json:"internalJoinSubnet,omitempty"`
Contributor

If we know ahead of time that making a change to the networking configuration on a running cluster would be bad, let's go ahead and prevent it where we can now.

Just a note that immutability rules must be done on the closest parent field that is required, otherwise the field isn't actually immutable because it can be removed by removing an optional parent (or the field itself) in the chain and re-applying with a different value.

route: {}
expectedError: "ipv6 internalJoinSubnet and internalTransitSwitchSubnet must not be the same"

- name: When ovnKubernetesConfig ipv6 internalJoinSubnet has invalid CIDR it should fail
Contributor

Replicate this test for internalTransitSwitchSubnet?

Contributor Author

Added an envtest case that expects the "Subnet must be in valid IPv6 CIDR format" error.

route: {}
expectedError: "Subnet must be in valid IPv6 CIDR format"

- name: When ovnKubernetesConfig ipv6 internalJoinSubnet has prefix length greater than 125 it should fail
Contributor

Replicate this test for internalTransitSwitchSubnet?

Contributor Author

Added an envtest case that expects the "subnet must be in the range /0 to /125 inclusive" error.

@orenc1 orenc1 force-pushed the fix_dual_stack__ovn_joinsubnet_collision branch from 4545f1e to befec34 Compare May 14, 2026 08:26
@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8421 May 14, 2026 08:33 Inactive
@orenc1 orenc1 force-pushed the fix_dual_stack__ovn_joinsubnet_collision branch from befec34 to df51feb Compare May 14, 2026 09:33
@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8421 May 14, 2026 09:40 Inactive
Comment on lines +79 to +83
// +kubebuilder:validation:XValidation:rule="!has(self.ipv6) || !has(self.ipv6.internalJoinSubnet) || !has(self.ipv6.internalTransitSwitchSubnet) || self.ipv6.internalJoinSubnet != self.ipv6.internalTransitSwitchSubnet", message="ipv6 internalJoinSubnet and internalTransitSwitchSubnet must not be the same"
// +kubebuilder:validation:XValidation:rule="!has(oldSelf.mtu) || has(self.mtu)",message="mtu is immutable once set and cannot be removed"
// +kubebuilder:validation:XValidation:rule="!has(oldSelf.ipv6) || has(self.ipv6)", message="ipv6 is immutable once set and cannot be removed"
// +kubebuilder:validation:XValidation:rule="!has(oldSelf.ipv6) || !has(oldSelf.ipv6.internalJoinSubnet) || (has(self.ipv6) && has(self.ipv6.internalJoinSubnet))", message="ipv6.internalJoinSubnet cannot be removed once set"
// +kubebuilder:validation:XValidation:rule="!has(oldSelf.ipv6) || !has(oldSelf.ipv6.internalTransitSwitchSubnet) || (has(self.ipv6) && has(self.ipv6.internalTransitSwitchSubnet))", message="ipv6.internalTransitSwitchSubnet cannot be removed once set"
Contributor

For these immutability rules, what happens if the parent ovnKubernetesConfig field is set, removed, and re-applied with new values?

Contributor Author

You're right, good catch. If the entire ovnKubernetesConfig field is removed and then re-applied with different values, the immutability rules can be bypassed.
To mitigate that, I'm adding a "cannot be removed once set" validation rule for ovnKubernetesConfig on ClusterNetworkOperatorSpec.

Contributor

@everettraven everettraven May 18, 2026


It still looks like you could just remove the ClusterNetworkOperatorSpec field based on:

// clusterNetworkOperator specifies the configuration for the Cluster Network Operator in the hosted cluster.
//
// +optional
ClusterNetworkOperator *ClusterNetworkOperatorSpec `json:"clusterNetworkOperator,omitempty"`

If you want true immutability you need to set the rule all the way up to the first required field, which looks like it means on the root HostedCluster type because even spec is optional:

// spec is the desired behavior of the HostedCluster.
// +optional
Spec HostedClusterSpec `json:"spec,omitempty"`

I'd make sure you pay particular attention to how immutability applies as you go up the chain of fields to make sure you don't make something that is OK to mutate immutable.
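To make the bypass concrete, here is a small Go simulation. All types and the `updateAllowed` function are hypothetical simplifications, not HyperShift API types; the sketch assumes, as described in this thread, that a transition rule attached to an optional field is only evaluated when that field is present in both the old and the new object.

```go
package main

import "fmt"

// Hypothetical, simplified shapes: every level is optional (a pointer).
type ipv6Config struct{ InternalJoinSubnet string }
type ovnConfig struct{ IPv6 *ipv6Config }
type spec struct{ OVN *ovnConfig }

// updateAllowed mimics a transition rule attached to the optional OVN field.
// The rule is skipped whenever the field is missing on either side, i.e.
// whenever the field is being added or removed.
func updateAllowed(oldSpec, newSpec spec) bool {
	if oldSpec.OVN == nil || newSpec.OVN == nil {
		return true // rule not evaluated
	}
	oldV6, newV6 := oldSpec.OVN.IPv6, newSpec.OVN.IPv6
	if oldV6 == nil {
		return true // nothing was set before, so nothing to protect
	}
	return newV6 != nil && newV6.InternalJoinSubnet == oldV6.InternalJoinSubnet
}

// withJoin builds a spec with the given IPv6 join subnet set.
func withJoin(cidr string) spec {
	return spec{OVN: &ovnConfig{IPv6: &ipv6Config{InternalJoinSubnet: cidr}}}
}

func main() {
	a, b := withJoin("fd99::/64"), withJoin("fd9a::/64")
	fmt.Println("direct change allowed:", updateAllowed(a, b))
	fmt.Println("remove parent allowed:", updateAllowed(a, spec{}))
	fmt.Println("re-add with new value allowed:", updateAllowed(spec{}, b))
}
```

The direct change is rejected, but removing the optional parent and re-adding it with a different value passes in two steps, which is why the rule would have to sit on the closest required ancestor to be airtight.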

Contributor Author

I see. How do you suggest we proceed? Should I modify the validation rules of the higher fields in the chain, up to the root? I feel that is beyond the scope of this PR.

Contributor

I don't have a strong preference. If this significantly increases the scope of this work to achieve this, I'm OK with best effort immutability for now with a more concrete immutability story for the near-ish future.

@orenc1 orenc1 force-pushed the fix_dual_stack__ovn_joinsubnet_collision branch from df51feb to a824440 Compare May 17, 2026 07:17
@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8421 May 17, 2026 07:23 Inactive
…ual-stack routing collision

When a KubeVirt hosted cluster and its management cluster both use
OVN-Kubernetes with dual-stack networking, they each default to
fd98::/64 for the IPv6 join switch subnet. External IPv6 LoadBalancer
traffic targeting VM pods is SNAT'd to the management cluster's join IP
(e.g. fd98::2). Inside the VM, the guest cluster's OVN intercepts the
response because it also owns fd98::/64, black-holing the packet.

This commit fixes the issue in two ways:

1. Automatic KubeVirt default: for KubeVirt hosted clusters with
   OVNKubernetes, the reconciler now sets IPv6.InternalJoinSubnet to
   fd99::/64 by default, avoiding the collision with the management
   cluster's fd98::/64. This mirrors the existing V4InternalSubnet
   override (100.66.0.0/16) already in place for IPv4.

2. User-facing API: adds OVNIPv6Config type to OVNKubernetesConfig,
   allowing explicit configuration of IPv6 internalJoinSubnet and
   internalTransitSwitchSubnet for any platform. This maps to the
   upstream operatorv1.IPv6OVNKubernetesConfig and includes IPv6 CIDR
   format validation via CEL rules.

Also extends CIDR overlap validation in the HostedCluster webhook to
cover IPv6 OVN subnets, and adds envtest CRD validation cases.

Fixes: https://redhat.atlassian.net/browse/OCPBUGS-84303

Signed-off-by: Oren Cohen <ocohen@redhat.com>
Assisted-by: Claude <noreply@anthropic.com>
@orenc1 orenc1 force-pushed the fix_dual_stack__ovn_joinsubnet_collision branch from a824440 to 5c1ee1f Compare May 17, 2026 07:36
@jparrill
Contributor

/approve

Contributor

@everettraven everettraven left a comment


Aside from the immutability rules being able to be circumvented, this looks good to me.

Not blocking on concrete immutability rules to prevent unnecessary scope increase for this PR. It has best-effort immutability rules in place which is good enough for me for now.

/approve

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label May 21, 2026
@openshift-merge-bot
Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aks-4-22
/test e2e-aws-4-22
/test e2e-aks
/test e2e-aws
/test e2e-aws-upgrade-hypershift-operator
/test e2e-azure-self-managed
/test e2e-kubevirt-aws-ovn-reduced
/test e2e-v2-aws
/test e2e-v2-gke

@openshift-ci
Contributor

openshift-ci Bot commented May 21, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: everettraven, jparrill, orenc1

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 21, 2026
@hypershift-jira-solve-ci

AI Test Failure Analysis

Job: pull-ci-openshift-hypershift-main-e2e-aws | Build: 2057445780108611584 | Cost: $4.10208675 | Failed step: hypershift-aws-run-e2e-nested

View full analysis report


Generated by hypershift-analyze-e2e-failure post-step using Claude claude-opus-4-6

@orenc1
Contributor Author

orenc1 commented May 25, 2026

/retest

@orenc1
Contributor Author

orenc1 commented May 25, 2026

/retest

@hypershift-jira-solve-ci

The background task completed — it was the earlier crane ls command for listing image tags, which is no longer needed since I already completed the analysis. The final report has been delivered above with the full root cause analysis.


@openshift-ci
Contributor

openshift-ci Bot commented May 25, 2026

@orenc1: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/api Indicates the PR includes changes for the API area/cli Indicates the PR includes changes for CLI area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release area/documentation Indicates the PR includes changes for documentation area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.


5 participants