OCPBUGS-85585: tighten registry override matching to strict longest-prefix across release-image consumers #8509

Merged
openshift-merge-bot[bot] merged 6 commits into openshift:main from raelga:raelga/fix-cpo-registry-overrides-init-containers
Jun 22, 2026

@raelga raelga commented May 13, 2026

Contributor

Summary

The RegistryMirrorProviderDecorator used by Lookup() rewrites every
ImageStream tag with a bare strings.Replace(image, source, target, 1). This
matches anywhere in the image reference, so:

  • it can incorrectly rewrite substrings inside hostnames (e.g. quay.io
    inside quay.io.example.com),
  • it does not reliably rewrite all component images, leading to CPO
    sub-resources (e.g. the availability-prober init container injected into
    capi-provider and other deployments) still pointing at the original
    registry.

Downstream, this breaks clusters where a ValidatingAdmissionPolicy in Deny
mode only allows images from approved registries: HCP creation hangs
indefinitely.
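
The substring problem can be reproduced in isolation. The sketch below is not the decorator's code; naiveRewrite is a hypothetical helper that only mirrors the strings.Replace(image, source, target, 1) call quoted above, and the image references are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveRewrite mimics the pre-fix behavior described above: it replaces the
// first occurrence of source anywhere in the image reference.
// Hypothetical helper, written only for this illustration.
func naiveRewrite(image, source, target string) string {
	return strings.Replace(image, source, target, 1)
}

func main() {
	// The registry host is quay.io.example.com, not quay.io, yet the
	// substring matches and the hostname is corrupted.
	fmt.Println(naiveRewrite("quay.io.example.com/ns/image:tag", "quay.io", "mirror.example.com"))

	// The match does not even have to sit at the start of the reference.
	fmt.Println(naiveRewrite("registry.example.com/quay.io/image:tag", "quay.io", "mirror.example.com"))
}
```

Neither reference belongs to the quay.io registry, but both are rewritten.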

Root cause

support/releaseinfo/registry_mirror_provider.go ran an unbounded
strings.Replace per (source, target) pair across the entire image string.
Matches were anchored neither to the start of the reference nor to a path
separator, and iteration order over the override map was undefined, so which
override "won" depended on Go's randomized map iteration.
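
To see why iteration order matters, the following sketch applies two overlapping overrides in both possible orders. It is an assumption-laden reconstruction of the loop described above, not the original code: the pair type and applyInOrder are hypothetical, and an explicit slice stands in for one particular map iteration order.

```go
package main

import (
	"fmt"
	"strings"
)

// pair is a hypothetical (source, target) override entry.
type pair struct{ source, target string }

// applyInOrder runs an unanchored replace for each pair in the given order.
// The slice order stands in for one possible map iteration order.
func applyInOrder(image string, order []pair) string {
	for _, p := range order {
		image = strings.Replace(image, p.source, p.target, 1)
	}
	return image
}

func main() {
	broad := pair{"quay.io", "mirror.example.com"}
	narrow := pair{"quay.io/openshift", "mirror.example.com/ocp"}
	image := "quay.io/openshift/image:tag"

	// Broad override first: the narrow one no longer finds its source.
	fmt.Println(applyInOrder(image, []pair{broad, narrow}))
	// Narrow override first: the broad one no longer finds its source.
	fmt.Println(applyInOrder(image, []pair{narrow, broad}))
}
```

The two calls return different references for the same input, which is the nondeterminism that a longest-prefix rule removes.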

Fix

Extract a shared registryoverride.Replace(image, overrides) helper at
support/util/registryoverride/ and use it from both:

  1. RegistryMirrorProviderDecorator.Lookup() (the actual bug fix), and
  2. imageprovider.NewWithRegistryOverrides() (defense-in-depth: callers that
    construct a provider directly still get the same semantics).

The helper:

  • matches strictly: image == source or image has prefix source + "/"
    (no substring/false-host matches),
  • picks the longest matching source prefix (deterministic across map
    iteration order),
  • is a no-op when overrides are empty or no source matches.
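
The three rules above can be sketched as follows. This is a minimal illustration, not the code merged in the PR: the signature Replace(image string, overrides map[string]string) string is assumed from the call shape registryoverride.Replace(image, overrides), and skipping empty source keys is an extra assumption made here.

```go
package main

import (
	"fmt"
	"strings"
)

// Replace returns image with its longest matching source prefix swapped for
// the corresponding target. A source matches only when image equals it or
// continues with "/" right after it. Hypothetical sketch, not the PR's code.
func Replace(image string, overrides map[string]string) string {
	best := ""
	found := false
	for source := range overrides {
		if source == "" {
			continue // assumption of this sketch: empty keys never match
		}
		if image != source && !strings.HasPrefix(image, source+"/") {
			continue
		}
		// Two distinct matching sources always differ in length, so the
		// result does not depend on map iteration order.
		if !found || len(source) > len(best) {
			best, found = source, true
		}
	}
	if !found {
		return image // no-op for nil/empty overrides or no match
	}
	return overrides[best] + strings.TrimPrefix(image, best)
}

func main() {
	overrides := map[string]string{
		"quay.io":           "mirror.example.com",
		"quay.io/openshift": "mirror.example.com/ocp",
	}
	fmt.Println(Replace("quay.io/openshift/image:tag", overrides))      // longest prefix wins
	fmt.Println(Replace("quay.io/other/image:tag", overrides))          // shorter prefix applies
	fmt.Println(Replace("quay.io.example.com/ns/image:tag", overrides)) // unchanged: different host
	fmt.Println(Replace("quay.io/other/image:tag", nil))                // unchanged: no overrides
}
```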

imageprovider.New(releaseImage) now delegates to
NewWithRegistryOverrides(releaseImage, nil), and the controller wiring at
hostedcontrolplane_controller.go:1106 keeps the explicit
NewWithRegistryOverrides(..., r.ReleaseProvider.GetRegistryOverrides())
call so the override path is exercised even when callers don't go through
Lookup() first.
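
The delegation can be pictured roughly as below. The releaseImage and provider types are hypothetical stand-ins for the real release image and SimpleReleaseImageProvider types, and replace is a condensed strict longest-prefix helper written only for this sketch; the actual field names and signatures in the repository may differ.

```go
package main

import (
	"fmt"
	"maps"
	"strings"
)

// releaseImage is a hypothetical stand-in for the real release image type.
type releaseImage struct{ images map[string]string }

func (r *releaseImage) ComponentImages() map[string]string { return r.images }

// provider is a hypothetical stand-in for SimpleReleaseImageProvider.
type provider struct{ componentsImages map[string]string }

func (p *provider) GetImage(key string) string { return p.componentsImages[key] }

// replace is a condensed strict longest-prefix rewrite (sketch only).
func replace(image string, overrides map[string]string) string {
	best := ""
	for source := range overrides {
		if source != "" && (image == source || strings.HasPrefix(image, source+"/")) && len(source) > len(best) {
			best = source
		}
	}
	if best == "" {
		return image
	}
	return overrides[best] + strings.TrimPrefix(image, best)
}

// NewWithRegistryOverrides copies the component images and remaps each one,
// leaving the release image's own map untouched.
func NewWithRegistryOverrides(r *releaseImage, overrides map[string]string) *provider {
	images := maps.Clone(r.ComponentImages())
	for key, image := range images {
		images[key] = replace(image, overrides)
	}
	return &provider{componentsImages: images}
}

// New delegates with nil overrides, so images come back unchanged.
func New(r *releaseImage) *provider { return NewWithRegistryOverrides(r, nil) }

func main() {
	r := &releaseImage{images: map[string]string{"availability-prober": "quay.io/openshift/prober:tag"}}
	overrides := map[string]string{"quay.io": "mirror.example.com"}

	fmt.Println(New(r).GetImage("availability-prober"))
	fmt.Println(NewWithRegistryOverrides(r, overrides).GetImage("availability-prober"))
	fmt.Println(r.ComponentImages()["availability-prober"]) // source map is not modified
}
```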

Impact

  • Component images in CPO sub-resources (init containers in capi-provider,
    ignition server, etc.) now reliably honor --registry-overrides.
  • VAPs in Deny mode no longer block HCP creation in air-gapped / mirrored
    environments.
  • Side effect: false-positive substring rewrites in the decorator (a
    pre-existing latent bug) are also fixed.

Test plan

  • Unit tests for registryoverride.Replace covering 12 cases incl. boundary,
    empty overrides, longest-prefix, no mutation of the overrides map.
  • imageprovider_test.go covers no-overrides, no-match, single-match,
    multi-override, longest-prefix-wins, mutation safety (release-image and
    overrides untouched), and idempotency (no double-apply when image was
    already mirrored).
  • hostedcontrolplane_controller_test.go (TestEventHandling,
    TestNonReadyInfraTriggersRequeueAfter) now exercise the override path
    with non-empty overrides through full reconcile.
  • E2E with --registry-overrides + VAP in Deny mode (manual).

Related

@openshift-merge-bot

Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@coderabbitai

coderabbitai Bot commented May 13, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

The PR adds a deterministic registry-prefix remapping utility and applies it when building release image providers. imageprovider.New now delegates to NewWithRegistryOverrides(releaseImage, nil), and NewWithRegistryOverrides(releaseImage, registryOverrides) clones the component image map and rewrites each reference using registryoverride.Replace. The registry mirror decorator was updated to use the same helper. Unit tests cover Replace and NewWithRegistryOverrides behavior, including idempotency. Two controller reconciliation tests were updated to mock non-empty registry overrides returned by GetRegistryOverrides().

Sequence Diagram(s)

sequenceDiagram
    participant C as Controller
    participant RP as ReleaseProvider
    participant IP as ImageProvider
    participant K as Kubernetes API

    C->>RP: GetReleaseImage()
    C->>RP: GetRegistryOverrides()
    RP-->>C: releaseImage, registryOverrides
    C->>IP: NewWithRegistryOverrides(releaseImage, registryOverrides)
    IP-->>C: SimpleReleaseImageProvider (componentsImages remapped)
    C->>K: Reconcile components using provider (image references)
    K-->>C: Reconciliation status
🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 16.67%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (11 passed)

  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Stable And Deterministic Test Names: ✅ Passed. No Ginkgo tests present in PR. All test files use standard Go testing framework. Custom check for Ginkgo test name stability is not applicable.
  • Test Structure And Quality: ✅ Passed. No Ginkgo tests found in this PR. All test code uses standard Go testing with *testing.T. Custom check requires Ginkgo test review, making it not applicable.
  • Microshift Test Compatibility: ✅ Passed. PR adds only standard Go unit tests, not Ginkgo e2e tests. No Ginkgo patterns detected. Custom check does not apply.
  • Single Node Openshift (Sno) Test Compatibility: ✅ Passed. No new Ginkgo e2e tests found in this PR. The changes include only standard Go unit tests and controller code updates. The check does not apply.
  • Topology-Aware Scheduling Compatibility: ✅ Passed. Changes add image registry override utilities. No scheduling constraints, pod affinity, node selectors, topology spread, replica logic, or topology-dependent scheduling is introduced.
  • Ote Binary Stdout Contract: ✅ Passed. No OTE Binary Stdout Contract violations detected. All modified/new files contain only utility functions and standard tests with no stdout writes at the process level.
  • Ipv6 And Disconnected Network Test Compatibility: ✅ Passed. This PR contains only standard Go unit tests (func Test*), not Ginkgo e2e tests. The custom check applies only to Ginkgo e2e tests and is not applicable here.
  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check: ✅ Passed. The title accurately summarizes the main change: tightening registry override matching and applying it across release-image consumers. It aligns with the PR's core objective of fixing registry override propagation to component images.


@openshift-ci openshift-ci Bot requested review from devguyio and muraee May 13, 2026 22:04
@openshift-ci openshift-ci Bot added area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release and removed do-not-merge/needs-area labels May 13, 2026

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@control-plane-operator/controllers/hostedcontrolplane/imageprovider/imageprovider_test.go`:
- Around line 200-277: Add a unit test covering the registry-prefix edge case
where an override key like "quay.io" should not match a hostname subdomain
"quay.io.example.com": add a new t.Run in imageprovider_test.go that constructs
a release image with a component image
"quay.io.example.com/namespace/image:tag", supplies an overrides map containing
"quay.io" -> "mirror.example.com", calls NewWithRegistryOverrides(releaseImage,
overrides) and asserts provider.GetImage("component") remains the original
"quay.io.example.com/namespace/image:tag". This ensures the matching logic used
by NewWithRegistryOverrides/GetImage treats registry keys as exact registry
hostnames (not substrings) and prevents incorrect remapping of subdomains.

In
`@control-plane-operator/controllers/hostedcontrolplane/imageprovider/imageprovider.go`:
- Line 70: The map returned by releaseImage.ComponentImages() (assigned to
images) is being modified directly, which can mutate the releaseImage internal
state; clone that map into a new map[string]string (copying all key/value pairs)
before performing any modifications in imageprovider.go (where images is
used/updated) so all changes occur on the cloned map and the original
releaseImage remains unmodified.
- Around line 74-76: The prefix check using strings.HasPrefix(image, source) can
match partial hostnames; update the condition in the loop that sets images[key]
so it only treats source as a match when it equals the registry component
exactly or is followed by a '/' (e.g. use a predicate like image == source or
strings.HasPrefix(image, source+"/") instead of plain HasPrefix). Modify the if
that currently reads strings.HasPrefix(image, source) to this stricter check
before performing images[key] = strings.Replace(image, source, target, 1) so
only full registry hosts are replaced.
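
The difference between the plain prefix check and the stricter predicate suggested in this comment can be shown with a small comparison. Both helper names below are hypothetical and exist only for this sketch; the predicate itself is the one quoted in the review comment.

```go
package main

import (
	"fmt"
	"strings"
)

// loosePrefix is the check the review comment flags: any string prefix counts.
func loosePrefix(image, source string) bool {
	return strings.HasPrefix(image, source)
}

// strictPrefix accepts source only as the whole reference or when it is
// followed by a path separator.
func strictPrefix(image, source string) bool {
	return image == source || strings.HasPrefix(image, source+"/")
}

func main() {
	images := []string{
		"quay.io/ns/image:tag",             // same registry: both checks match
		"quay.io.example.com/ns/image:tag", // different host: only the loose check matches
		"quay.io",                          // exact reference: both checks match
	}
	for _, image := range images {
		fmt.Printf("%s loose=%t strict=%t\n", image, loosePrefix(image, "quay.io"), strictPrefix(image, "quay.io"))
	}
}
```

The same boundary rule means an override key without a port would not match a reference whose host carries one, since the character after the key is ":" rather than "/".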

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: d098d99a-9102-4bfb-b08a-fda47dfc0431

📥 Commits

Reviewing files that changed from the base of the PR and between 674d92a and 196dae0.

📒 Files selected for processing (3)
  • control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go
  • control-plane-operator/controllers/hostedcontrolplane/imageprovider/imageprovider.go
  • control-plane-operator/controllers/hostedcontrolplane/imageprovider/imageprovider_test.go

@raelga

raelga commented May 13, 2026

Contributor Author

/area control-plane-operator

@codecov

codecov Bot commented May 13, 2026

Codecov Report

❌ Patch coverage is 93.33333% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 41.90%. Comparing base (dca6f75) to head (55b6fda).
⚠️ Report is 9 commits behind head on main.

Files with missing lines                                 Patch %   Lines
.../hostedcontrolplane/imageprovider/imageprovider.go    90.90%    1 Missing ⚠️
support/releaseinfo/registry_mirror_provider.go          0.00%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8509      +/-   ##
==========================================
+ Coverage   41.88%   41.90%   +0.01%     
==========================================
  Files         759      760       +1     
  Lines       94155    94177      +22     
==========================================
+ Hits        39434    39461      +27     
+ Misses      51961    51956       -5     
  Partials     2760     2760              
Files with missing lines                                 Coverage Δ
...ostedcontrolplane/hostedcontrolplane_controller.go    45.71% <100.00%> (ø)
support/util/registryoverride/registryoverride.go        100.00% <100.00%> (ø)
.../hostedcontrolplane/imageprovider/imageprovider.go    93.33% <90.90%> (+18.33%) ⬆️
support/releaseinfo/registry_mirror_provider.go          47.82% <0.00%> (+1.99%) ⬆️

Flag                      Coverage Δ
cmd-support               35.20% <94.44%> (+0.03%) ⬆️
cpo-hostedcontrolplane    44.21% <91.66%> (+0.04%) ⬆️
cpo-other                 43.52% <ø> (ø)
hypershift-operator       52.05% <ø> (ø)
other                     31.56% <ø> (ø)


@coderabbitai coderabbitai Bot left a comment

Contributor

🧹 Nitpick comments (1)
control-plane-operator/controllers/hostedcontrolplane/imageprovider/imageprovider_test.go (1)

200-294: ⚡ Quick win

Add test case for registry port number handling.

The test suite comprehensively covers main scenarios, including the subdomain non-match case. All tests follow best practices with proper "When ... should ..." naming, gomega assertions, and parallel execution.

One scenario not currently tested: registry hostnames with explicit port numbers. The implementation handles this correctly as long as the override key includes the port (e.g., "registry.example.com:5000": "mirror.example.com" will successfully match registry.example.com:5000/namespace/image). Consider adding a test to document this expected behavior:

t.Run("When registry has port number, override matching includes the port", func(t *testing.T) {
	t.Parallel()
	g := NewWithT(t)

	releaseImage := newTestReleaseImage(map[string]string{
		"component": "registry.example.com:5000/namespace/image:tag",
	})
	overrides := map[string]string{
		"registry.example.com:5000": "mirror.example.com",
	}

	provider := NewWithRegistryOverrides(releaseImage, overrides)

	g.Expect(provider.GetImage("component")).To(Equal(
		"mirror.example.com/namespace/image:tag"))
})
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@control-plane-operator/controllers/hostedcontrolplane/imageprovider/imageprovider_test.go`
around lines 200 - 294, Add a test case to TestNewWithRegistryOverrides that
verifies registry hostnames with ports are matched when the override key
includes the port: create a subtest using newTestReleaseImage with an image like
"registry.example.com:5000/namespace/image:tag", supply overrides map with key
"registry.example.com:5000" and value "mirror.example.com", construct provider
via NewWithRegistryOverrides and assert provider.GetImage("component") returns
"mirror.example.com/namespace/image:tag".

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 451a24c9-3b71-43db-a6dc-02b6393dbd55

📥 Commits

Reviewing files that changed from the base of the PR and between 196dae0 and 8fd79a4.

📒 Files selected for processing (2)
  • control-plane-operator/controllers/hostedcontrolplane/imageprovider/imageprovider.go
  • control-plane-operator/controllers/hostedcontrolplane/imageprovider/imageprovider_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • control-plane-operator/controllers/hostedcontrolplane/imageprovider/imageprovider.go

@raelga raelga force-pushed the raelga/fix-cpo-registry-overrides-init-containers branch 2 times, most recently from 009bb28 to 6f542db Compare May 14, 2026 11:41
@jparrill

Contributor

/retitle OCPBUGS-85585: apply registry overrides to component images in CPO sub-resources

@openshift-ci openshift-ci Bot changed the title from "fix: apply registry overrides to component images in CPO sub-resources (OCPBUGS-85585)" to "OCPBUGS-85585: apply registry overrides to component images in CPO sub-resources" May 14, 2026
@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels May 14, 2026
@openshift-ci-robot

@raelga: This pull request references Jira Issue OCPBUGS-85585, which is invalid:

  • expected the bug to target the "5.0.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Summary

CPO does not propagate --registry-overrides to init containers it injects in HCP sub-resources. When CPO creates deployments like capi-provider, it injects availability-prober init containers using the original quay.io image reference instead of the overridden ACR path.

Root cause

In hostedcontrolplane_controller.go:1106, the releaseImageProvider is created with imageprovider.New(releaseImage) which returns component images as-is from the release image stream. The registry overrides available via r.ReleaseProvider.GetRegistryOverrides() are not applied to these component images.

Fix

Add NewWithRegistryOverrides() to the imageprovider package that applies registry overrides to all component images when creating the provider. Use it in the HostedControlPlane reconciler instead of imageprovider.New().

This ensures when CPO calls GetImage("availability-prober") (or any other component), the returned image reference has the registry override applied.

Impact

Without this fix, any ValidatingAdmissionPolicy that restricts container images to approved registries will block CPO init containers, causing HCP cluster creation to hang indefinitely.

Test plan

  • 4 new unit tests covering: overrides match, no overrides, non-matching overrides, multiple overrides
  • Existing imageprovider tests pass
  • E2E with --registry-overrides and a VAP in Deny mode

Related

Summary by CodeRabbit

  • New Features

  • Registry overrides are now applied when reconciling hosted control plane components; image references are remapped using configured registry prefixes and only the first matching prefix is applied.

  • Remapping ignores non-matching prefixes and avoids accidental subdomain-only matches.

  • Tests

  • Added tests validating registry override behavior and edge cases.

  • Updated controller tests to reflect empty registry-overrides in mocked provider.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@jparrill

Contributor

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels May 14, 2026
@openshift-ci-robot

@jparrill: This pull request references Jira Issue OCPBUGS-85585, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)
Details

In response to this:

/jira refresh


@jparrill jparrill left a comment

Contributor

Dropped some comments. Thanks!

raelga added a commit to raelga/hypershift that referenced this pull request May 18, 2026
…lper

Three small refactors that align this provider with the decorator change
in the previous commit and address review feedback on openshift#8509:

  * New() now delegates to NewWithRegistryOverrides(releaseImage, nil)
    instead of duplicating the field initialisation. registryoverride.Replace
    is a no-op for a nil override map, so callers of New() see exactly the
    same component images as before.
  * NewWithRegistryOverrides() uses maps.Clone (in the standard library since
    Go 1.21; go.mod already requires Go 1.25+) to copy ComponentImages() and
    registryoverride.Replace to remap each image. The inline
    duplicate-detection loop is gone.
  * The matching semantics now match RegistryMirrorProviderDecorator:
    strict slash-boundary prefix matching with longest-prefix-wins, instead
    of first-match-in-map-iteration-order.

As a small defensive bonus, New() now also returns a SimpleReleaseImageProvider
that owns a private copy of ComponentImages(), rather than aliasing the
map embedded in the ReleaseImage.

Related: OCPBUGS-85585

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
raelga added a commit to raelga/hypershift that referenced this pull request May 18, 2026
… overrides

Cover the two properties highlighted in review feedback on openshift#8509:

  * NewWithRegistryOverrides must not mutate either its overrides argument
    or the source release image's ComponentImages map. This is the contract
    that lets multiple callers share the same release image and overrides
    map without surprises.
  * Applying the same overrides twice must be a no-op (idempotency).
    Regressions here would manifest as compounding rewrites, e.g.
    mirror.example.com/quay-cache/mirror.example.com/quay-cache/...

Also adds a focused longest-prefix-wins assertion at this layer (the
shared helper already has full coverage for the matching algorithm, but
exercising it through the imageprovider keeps the layers honest).

Related: OCPBUGS-85585

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
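
The idempotency and non-mutation properties described in this commit can be illustrated with a small check. This is a sketch under stated assumptions: replace is a hypothetical condensed strict longest-prefix helper, not the repository's implementation, and the double application is a no-op here because the rewritten reference no longer starts with any source key. An override whose target is itself matched by a source would not have that property.

```go
package main

import (
	"fmt"
	"maps"
	"strings"
)

// replace is a condensed strict longest-prefix rewrite (sketch only).
func replace(image string, overrides map[string]string) string {
	best := ""
	for source := range overrides {
		if source != "" && (image == source || strings.HasPrefix(image, source+"/")) && len(source) > len(best) {
			best = source
		}
	}
	if best == "" {
		return image
	}
	return overrides[best] + strings.TrimPrefix(image, best)
}

func main() {
	overrides := map[string]string{"quay.io": "mirror.example.com/quay-cache"}
	before := maps.Clone(overrides) // snapshot to detect mutation

	once := replace("quay.io/ns/image:tag", overrides)
	twice := replace(once, overrides)

	fmt.Println(once)
	fmt.Println(once == twice)                 // second pass changes nothing
	fmt.Println(maps.Equal(before, overrides)) // overrides map is untouched
}
```
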
raelga added a commit to raelga/hypershift that referenced this pull request May 18, 2026
…er tests

Flip the two registry-override mocks added by the previous commits from
map[string]string{} to a non-empty override so the full Reconcile path
actually walks the override-application logic at
hostedcontrolplane_controller.go:1106 (and the related propagation through
configoperatorv2.NewComponent and the ignitionserver --registry-overrides
flag).

The release image fixture has no images matching the override key, so the
existing semantic assertions of these tests are unchanged: this is purely
added coverage of the override code path through reconcile, matching the
convention already used at lines 176 and 1040 of the same file. Addresses
review feedback on openshift#8509.

Related: OCPBUGS-85585

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@openshift-ci openshift-ci Bot added the area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release label May 18, 2026
@openshift-ci-robot

@raelga: This pull request references Jira Issue OCPBUGS-85585, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
Details

In response to this:

Summary

CPO does not propagate --registry-overrides to init containers it injects in HCP sub-resources. When CPO creates deployments like capi-provider, it injects availability-prober init containers using the original quay.io image reference instead of the overridden ACR path.

Root cause

In hostedcontrolplane_controller.go:1106, the releaseImageProvider is created with imageprovider.New(releaseImage) which returns component images as-is from the release image stream. The registry overrides available via r.ReleaseProvider.GetRegistryOverrides() are not applied to these component images.

Fix

Add NewWithRegistryOverrides() to the imageprovider package that applies registry overrides to all component images when creating the provider. Use it in the HostedControlPlane reconciler instead of imageprovider.New().

This ensures when CPO calls GetImage("availability-prober") (or any other component), the returned image reference has the registry override applied.

Impact

Without this fix, any ValidatingAdmissionPolicy that restricts container images to approved registries will block CPO init containers, causing HCP cluster creation to hang indefinitely.

Test plan

  • 4 new unit tests covering: overrides match, no overrides, non-matching overrides, multiple overrides
  • Existing imageprovider tests pass
  • E2E with --registry-overrides and a VAP in Deny mode

Related

Summary by CodeRabbit

  • New Features

  • Registry overrides are now applied when reconciling hosted control plane components; image references are remapped using configured registry prefixes with longest-prefix wins.

  • Remapping ignores non-matching prefixes and avoids accidental subdomain-only matches.

  • Tests

  • Added comprehensive tests validating registry override behavior, edge cases, idempotency, and non-mutation of inputs.

  • Updated controller tests to reflect non-empty registry-overrides in the mocked provider.


@coderabbitai coderabbitai Bot left a comment

Contributor

🧹 Nitpick comments (1)
control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller_test.go (1)

730-731: ⚡ Quick win

Use a matching override key so this path validates real remapping.

On Line 731 and Line 816, the key "registry" is effectively a non-match for typical release image refs, so these tests still mostly cover the no-op path. Prefer a realistic key (for example quay.io) and assert at least one rewritten image path in the reconciled output.

Also applies to: 815-816

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller_test.go`
around lines 730 - 731, The test uses
mockedProviderWithOpenshiftImageRegistryOverrides.EXPECT().GetRegistryOverrides().Return(map[string]string{"registry":"override"})
which uses a non-matching key and therefore exercises the no-op path; change the
map key to a realistic registry host (e.g. "quay.io" or "registry.redhat.io") so
the remapping code is exercised, then update the test assertions in the same
test that call the reconciler to check that at least one image string in the
reconciled output has been rewritten (assert that an expected original prefix
like "quay.io" is replaced with "override" in the returned image list), leaving
the mock call name mockedProviderWithOpenshiftImageRegistryOverrides and its
GetRegistryOverrides() usage intact.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: dcb45b87-cb18-487c-9119-6692ce758ff6

📥 Commits

Reviewing files that changed from the base of the PR and between 6f542db and 477a9ac.

📒 Files selected for processing (6)
  • control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller_test.go
  • control-plane-operator/controllers/hostedcontrolplane/imageprovider/imageprovider.go
  • control-plane-operator/controllers/hostedcontrolplane/imageprovider/imageprovider_test.go
  • support/releaseinfo/registry_mirror_provider.go
  • support/util/registryoverride/registryoverride.go
  • support/util/registryoverride/registryoverride_test.go

@raelga raelga changed the title from "OCPBUGS-85585: apply registry overrides to component images in CPO sub-resources" to "OCPBUGS-85585: tighten registry override matching to strict longest-prefix across release-image consumers" May 18, 2026
@openshift-merge-bot

Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aks
/test e2e-aws
/test e2e-aws-upgrade-hypershift-operator
/test e2e-azure-v2-self-managed
/test e2e-kubevirt-aws-ovn-reduced
/test e2e-v2-aws
/test e2e-v2-gke

@muraee

muraee commented Jun 22, 2026

Contributor

/approve

@openshift-ci

openshift-ci Bot commented Jun 22, 2026

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: muraee, raelga


@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 22, 2026
@muraee

muraee commented Jun 22, 2026

Contributor

/verified by unit

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Jun 22, 2026
@openshift-ci-robot


@muraee: This PR has been marked as verified by unit.


@openshift-ci

openshift-ci Bot commented Jun 22, 2026

Contributor

@raelga: all tests passed!


@openshift-merge-bot openshift-merge-bot Bot merged commit c4fd7dc into openshift:main Jun 22, 2026
41 checks passed
@openshift-ci-robot


@raelga: Jira Issue Verification Checks: Jira Issue OCPBUGS-85585
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-85585 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓

Details

In response to this:

Summary

The RegistryMirrorProviderDecorator used by Lookup() rewrites every
ImageStream tag with a bare strings.Replace(image, source, target, 1). This
matches anywhere in the image reference, so:

  • it can incorrectly rewrite substrings inside hostnames (e.g. quay.io
    inside quay.io.example.com),
  • it does not reliably rewrite all component images, leading to CPO
    sub-resources (e.g. the availability-prober init container injected into
    capi-provider and other deployments) still pointing at the original
    registry.

Downstream, this breaks clusters where a ValidatingAdmissionPolicy in Deny
mode only allows images from approved registries: HCP creation hangs
indefinitely.

Root cause

support/releaseinfo/registry_mirror_provider.go ran an unbounded
strings.Replace per (source, target) pair across the entire image string.
Match boundaries were neither anchored to the start of the reference nor to a
path separator, and iteration order over the override map was undefined, so
which override "won" depended on Go's randomized map iteration.
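To make the root cause concrete, here is a minimal sketch of the unanchored replacement. The function name naiveRewrite and the mirror host mirror.local are illustrative assumptions, not identifiers from the repository; only strings.Replace with a count of 1 comes from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveRewrite (hypothetical name) mimics the unanchored behaviour described
// above: the first occurrence of source is replaced wherever it appears in
// the image reference, with no check for a registry or path boundary.
func naiveRewrite(image, source, target string) string {
	return strings.Replace(image, source, target, 1)
}

func main() {
	// Intended case: the reference really lives on quay.io.
	fmt.Println(naiveRewrite("quay.io/ns/app:v1", "quay.io", "mirror.local"))

	// Unintended case: quay.io is only a substring of a different hostname,
	// yet it is rewritten anyway.
	fmt.Println(naiveRewrite("quay.io.example.com/ns/app:v1", "quay.io", "mirror.local"))
}
```

The second call yields mirror.local.example.com/ns/app:v1, a host that was never meant to be mirrored, which is the false-host rewrite the description calls out.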

Fix

Extract a shared registryoverride.Replace(image, overrides) helper at
support/util/registryoverride/ and use it from both:

  1. RegistryMirrorProviderDecorator.Lookup() (the actual bug fix), and
  2. imageprovider.NewWithRegistryOverrides() (defense-in-depth: callers that
    construct a provider directly still get the same semantics).

The helper:

  • matches strictly: image == source or image has prefix source + "/"
    (no substring/false-host matches),
  • picks the longest matching source prefix (deterministic across map
    iteration order),
  • is a no-op when overrides are empty or no source matches.
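The three rules above can be sketched roughly as follows. This is not the code from support/util/registryoverride; replaceStrict is a hypothetical stand-in written only from the bullet points, and the registry names are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// replaceStrict is a hypothetical stand-in for the helper described above.
// A source matches only if the image equals it or starts with source + "/".
// Among all matching sources the longest one wins, so the result does not
// depend on map iteration order.
func replaceStrict(image string, overrides map[string]string) string {
	best := ""
	for source := range overrides {
		if source == "" {
			continue // an empty source would match nothing meaningful
		}
		if image != source && !strings.HasPrefix(image, source+"/") {
			continue
		}
		if len(source) > len(best) {
			best = source
		}
	}
	if best == "" {
		return image // no overrides or no match: no-op
	}
	return overrides[best] + strings.TrimPrefix(image, best)
}

func main() {
	overrides := map[string]string{
		"quay.io":           "mirror.local",
		"quay.io/openshift": "mirror.local/ocp",
	}
	fmt.Println(replaceStrict("quay.io/openshift/cli:v1", overrides))   // longest prefix wins
	fmt.Println(replaceStrict("quay.io/other/cli:v1", overrides))       // shorter prefix applies
	fmt.Println(replaceStrict("quay.io.example.com/cli:v1", overrides)) // different host: unchanged
	fmt.Println(replaceStrict("quay.io/other/cli:v1", nil))             // empty overrides: unchanged
}
```

Two distinct sources of equal length cannot both be prefixes of the same image, so picking by length alone is enough to make the choice deterministic in this sketch.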

imageprovider.New(releaseImage) now delegates to
NewWithRegistryOverrides(releaseImage, nil), and the controller wiring at
hostedcontrolplane_controller.go:1106 keeps the explicit
NewWithRegistryOverrides(..., r.ReleaseProvider.GetRegistryOverrides())
call so the override path is exercised even when callers don't go through
Lookup() first.

Impact

  • Component images in CPO sub-resources (init containers in capi-provider,
    ignition server, etc.) now reliably honor --registry-overrides.
  • VAPs in Deny mode no longer block HCP creation in air-gapped / mirrored
    environments.
  • Side effect: false-positive substring rewrites in the decorator (a
    pre-existing latent bug) are also fixed.

Test plan

  • Unit tests for registryoverride.Replace covering 12 cases incl. boundary,
    empty overrides, longest-prefix, no mutation of the overrides map.
  • imageprovider_test.go covers no-overrides, no-match, single-match,
    multi-override, longest-prefix-wins, mutation safety (release-image and
    overrides untouched), and idempotency (no double-apply when image was
    already mirrored).
  • hostedcontrolplane_controller_test.go (TestEventHandling,
    TestNonReadyInfraTriggersRequeueAfter) now exercise the override path
    with non-empty overrides through full reconcile.
  • E2E with --registry-overrides + VAP in Deny mode (manual).

Related


@openshift-merge-robot

Contributor

Fix included in release 5.0.0-0.nightly-2026-06-22-235733

openshift-merge-bot Bot pushed a commit to Azure/ARO-HCP that referenced this pull request Jun 25, 2026
…ROSLSRE-1318)

The latest HyperShift operator image includes a regression in
registryoverride.Replace (openshift/hypershift#8509) that breaks
repository-level --registry-overrides for digest-based images.
This causes all CAPI and component image rewrites to fail in
environments using ACR mirrors, blocking cluster creation.

Pin to the known-good build (a101e669, 2026-06-05) until the
upstream fix (openshift/hypershift#8824, OCPBUGS-92034) merges
and a new image is published.
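One plausible reading of this regression, given the matching rule quoted in the PR description (image == source, or image has prefix source + "/"), is that a repository-level source is followed by "@sha256:" rather than "/" in a digest reference, so it never matches. The sketch below illustrates that reading only. The actual change in openshift/hypershift#8824 is not shown in this thread; the function names, image names, and the choice of "/", "@" and ":" as boundaries are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesAt (hypothetical) reports whether source is a prefix of image that
// ends exactly at the end of the reference or at one of the boundary bytes.
func matchesAt(image, source, boundaries string) bool {
	if source == "" || !strings.HasPrefix(image, source) {
		return false
	}
	rest := image[len(source):]
	return rest == "" || strings.ContainsRune(boundaries, rune(rest[0]))
}

// rewrite (hypothetical) swaps source for target when matchesAt succeeds and
// returns the image unchanged otherwise.
func rewrite(image, source, target, boundaries string) string {
	if !matchesAt(image, source, boundaries) {
		return image
	}
	return target + image[len(source):]
}

func main() {
	image := "quay.io/ocp/release@sha256:abc123"
	source := "quay.io/ocp/release"
	target := "myacr.azurecr.io/ocp/release"

	// Slash-only boundary: the repository-level source is followed by "@",
	// so nothing is rewritten.
	fmt.Println(rewrite(image, source, target, "/"))

	// Boundary set extended with "@" and ":": the digest reference is rewritten.
	fmt.Println(rewrite(image, source, target, "/@:"))

	// The false-host case stays rejected under both boundary sets.
	fmt.Println(rewrite("quay.io.example.com/app", "quay.io", "mirror.local", "/@:"))
}
```

A colon can also separate a hostname from a port, so a production implementation would likely need to tell a tag separator apart from a port separator; this sketch does not attempt that.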
@avollmer-redhat


/cherrypick release-4.22 release-4.21 release-4.20

@openshift-cherrypick-robot


@avollmer-redhat: new pull request created: #8873


avollmer-redhat added a commit to avollmer-redhat/hypershift that referenced this pull request Jun 30, 2026
… containers

Backports openshift#8509 and openshift#8824 to release-4.22:

1. Add support/util/registryoverride package with strict longest-prefix
   matching that correctly handles digest (@sha256:) and tag (:) separators,
   preventing false substring matches (e.g. "quay.io" matching
   "quay.io.example.com").

2. Fix RegistryMirrorProviderDecorator.Lookup to use registryoverride.Replace
   instead of strings.Replace, eliminating the original substring-match bug.

3. Add imageprovider.NewWithRegistryOverrides to apply registry overrides to
   all component images at provider creation time, ensuring init containers
   (availability-prober) and other CPO sub-resources use overridden images.

4. Wire NewWithRegistryOverrides into the HCP controller reconcile loop so
   the control-plane release image provider applies overrides.

Without this fix, CPO-managed init containers (e.g. availability-prober)
retain original registry references, causing ValidatingAdmissionPolicies in
Deny mode to block HCP creation in environments that restrict image sources.
avollmer-redhat added a commit to avollmer-redhat/hypershift that referenced this pull request Jun 30, 2026
… containers

Backports openshift#8509 and openshift#8824 to release-4.21:

1. Add support/util/registryoverride package with strict longest-prefix
   matching that correctly handles digest (@sha256:) and tag (:) separators,
   preventing false substring matches (e.g. "quay.io" matching
   "quay.io.example.com").

2. Fix RegistryMirrorProviderDecorator.Lookup to use registryoverride.Replace
   instead of strings.Replace, eliminating the original substring-match bug.

3. Add imageprovider.NewWithRegistryOverrides to apply registry overrides to
   all component images at provider creation time, ensuring init containers
   (availability-prober) and other CPO sub-resources use overridden images.

4. Wire NewWithRegistryOverrides into the HCP controller reconcile loop so
   the control-plane release image provider applies overrides.

Without this fix, CPO-managed init containers (e.g. availability-prober)
retain original registry references, causing ValidatingAdmissionPolicies in
Deny mode to block HCP creation in environments that restrict image sources.
avollmer-redhat added a commit to avollmer-redhat/hypershift that referenced this pull request Jun 30, 2026
… containers

Backports openshift#8509 and openshift#8824 to release-4.20:

1. Add support/util/registryoverride package with strict longest-prefix
   matching that correctly handles digest (@sha256:) and tag (:) separators,
   preventing false substring matches (e.g. "quay.io" matching
   "quay.io.example.com").

2. Fix RegistryMirrorProviderDecorator.Lookup to use registryoverride.Replace
   instead of strings.Replace, eliminating the original substring-match bug.

3. Add imageprovider.NewWithRegistryOverrides to apply registry overrides to
   all component images at provider creation time, ensuring init containers
   (availability-prober) and other CPO sub-resources use overridden images.

4. Wire NewWithRegistryOverrides into the HCP controller reconcile loop so
   the control-plane release image provider applies overrides.

Without this fix, CPO-managed init containers (e.g. availability-prober)
retain original registry references, causing ValidatingAdmissionPolicies in
Deny mode to block HCP creation in environments that restrict image sources.
avollmer-redhat added a commit to avollmer-redhat/hypershift that referenced this pull request Jul 1, 2026
…t containers

Combined manual backport of openshift#8509 and openshift#8824 to release-4.21.
Introduces strict longest-prefix registry override matching and
wires overrides into CPO init container image resolution.
avollmer-redhat added a commit to avollmer-redhat/hypershift that referenced this pull request Jul 1, 2026
…t containers

Combined manual backport of openshift#8509 and openshift#8824 to release-4.20.
Introduces strict longest-prefix registry override matching and
wires overrides into CPO init container image resolution.
avollmer-redhat added a commit to avollmer-redhat/hypershift that referenced this pull request Jul 1, 2026
…t containers

Combined manual backport of openshift#8509 and openshift#8824 to release-4.22.
Introduces strict longest-prefix registry override matching and
wires overrides into CPO init container image resolution.

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/control-plane-operator: Indicates the PR includes changes for the control plane operator - in an OCP release
  • area/hypershift-operator: Indicates the PR includes changes for the hypershift operator and API - outside an OCP release
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • verified: Signifies that the PR passed pre-merge verification criteria
