CNTRLPLANE-3740: Add hypershift details for additional identity information sources #2050

Open
liouk wants to merge 1 commit into openshift:master from liouk:external-oidc-additional-identity-information-sources-hypershift

liouk commented Jun 29, 2026

Member

No description provided.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 29, 2026

openshift-ci-robot commented Jun 29, 2026

@liouk: This pull request references CNTRLPLANE-3740 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "5.0.0" version, but no target version was set.



openshift-ci Bot commented Jun 29, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign cybertron for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


openshift-ci Bot commented Jun 29, 2026

Contributor

@liouk: all tests passed!


@liouk liouk changed the title CNTRLPLANE-3740: Add hypershift details for additional identity information sources WIP: CNTRLPLANE-3740: Add hypershift details for additional identity information sources Jun 29, 2026
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 29, 2026
@liouk liouk changed the title WIP: CNTRLPLANE-3740: Add hypershift details for additional identity information sources CNTRLPLANE-3740: Add hypershift details for additional identity information sources Jun 29, 2026
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 29, 2026

liouk commented Jun 29, 2026

Member Author

This PR currently contains a set of open questions; my goal is to resolve these before merging and adjust the PR accordingly. Putting a hold until these are resolved.

/hold

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 29, 2026
with upstream `v1beta1`, there's no type mismatch on the kubeapiserver generator output. The oauth-apiserver generator produces the oauth-apiserver's own `AuthenticationConfiguration` type,
which both topologies would need to depend on regardless.

To make this work we would need to do some minor refactoring/abstractions to the generators before extracting to a lib, such as CA cert & secret resolution and feature gate checks.

This was my concern going through the review: feature gates. For feature gates to work on HyperShift, we need to propagate the CPO's feature gates into the hypershift-operator, which then propagates them through to the CPO itself. That's what I'm doing in this PR.

So, soon enough, we will also have this dependency on the CPO for feature gate propagation through the hypershift-operator. It's probably something we can fix; maybe we can abstract these feature gates away from here for this proposal. Just wanted to point that out. :)

@everettraven everettraven left a comment

Contributor

Overall, I think this looks pretty good. A handful of comments on the open questions.

Comment on lines +571 to +573
When configuring the `authentication.config.openshift.io/cluster` resource to use the External OIDC feature in Hypershift,
the goal is the same: configure the oauth-apiserver to use its new external OIDC operation mode and enable it on the kube-apiserver
as a webhook authenticator. The oauth-server is still unused in this new mode of operation.

Contributor

In hypershift, I think this configuration is done on the HostedCluster API. Let's make sure that is properly reflected here to prevent any confusion related to where things are configured on standalone vs hypershift.

Comment on lines +581 to +588
- Update the deployment predicate of the `openshift-oauth-apiserver` component to always create the oauth-apiserver deployment
(instead of only when authentication type is `IntegratedOAuth`).
- Update the deployment adapt function to configure the oauth-apiserver in the new external OIDC operation mode when authentication
type is `OIDC` in the `authentication.config.openshift.io/cluster` resource.
- When authentication type is `OIDC` in the `authentication.config.openshift.io/cluster` resource, the manifest adapter for the
`auth-config.yaml` manifest generates the required AuthenticationConfiguration object and serializes it into the manifest. With
the new mode, the manifest adapter will need to generate the updated AuthenticationConfiguration API introduced with this EP.
- Ensure that secrets required by the oauth-apiserver (e.g. `clientCredential.secret`, TLS CA bundles) are available on the management cluster and projected into the oauth-apiserver pod.
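As a rough illustration of the first two bullets, here is a minimal Go sketch of the predicate change and the adapt step. Every name in it (`Deployment`, its `Mode` and `AuthConfigFile` fields, `currentPredicate`, `proposedPredicate`, `adaptDeployment`) is hypothetical and not taken from the HyperShift codebase; it only models the decision logic under those assumptions.

```go
package main

import "fmt"

// AuthType mirrors the authentication types named in the bullets above.
type AuthType string

const (
	IntegratedOAuth AuthType = "IntegratedOAuth"
	OIDC            AuthType = "OIDC"
)

// Deployment is a hypothetical, heavily reduced stand-in for the
// oauth-apiserver Deployment managed by the CPO.
type Deployment struct {
	Mode           string // hypothetical field; "" means the default mode
	AuthConfigFile string // hypothetical name of the serialized config manifest
}

// currentPredicate models the behavior described as current:
// deploy only when the authentication type is IntegratedOAuth.
func currentPredicate(t AuthType) bool { return t == IntegratedOAuth }

// proposedPredicate models the proposed behavior: always deploy.
func proposedPredicate(AuthType) bool { return true }

// adaptDeployment switches the deployment into external OIDC mode
// only when the authentication type is OIDC; otherwise it is a no-op.
func adaptDeployment(d *Deployment, t AuthType) {
	if t != OIDC {
		return
	}
	d.Mode = "external-oidc"
	d.AuthConfigFile = "auth-config.yaml"
}

func main() {
	for _, t := range []AuthType{IntegratedOAuth, OIDC} {
		d := &Deployment{}
		adaptDeployment(d, t)
		fmt.Printf("type=%s current=%t proposed=%t mode=%q\n",
			t, currentPredicate(t), proposedPredicate(t), d.Mode)
	}
}
```

The point of the sketch is that the predicate stops depending on the authentication type, while the type-specific behavior moves into the adapt step.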

Contributor

While this isn't necessarily definitive yet, we have been considering having a net-new component instead of the oauth-apiserver in the future.

Does this approach account for the fact that we may end up needing to adopt the net-new component?

Maybe we should consider having this be a net-new component from the perspective of the CPO that, for now, just deploys the oauth-apiserver in the new mode? This might be a bit more work now, but it would probably be easier to modify if we do end up with a new component for this.

Member Author

I did consider this initially, but decided against this approach until we actually need the new component, mainly because the suggested approach naturally matches our current architecture, and because for the moment there's no concrete plan for whether or when we will go there.

In a nutshell, here's what a new component would involve:

  • the framework won't allow multiple components with the same name
  • currently, the openshift-oauth-apiserver component is hardwired in multiple places (service, certs, webhook)
  • this means that we'll end up needing multiple new components and boilerplate (deployment, service, cert, webhook config, PDB, ...) and at the same time change how dependencies use the oauth-apiserver component

This feels like we'll be doing additional work which won't make sense until we need a completely separate new component; given that there's no concrete plan for that yet, the current suggested approach matches more closely the architecture we have in place in my view.

If you feel strongly about it, I'd be on board to go down that path, although I believe we'll benefit little in the future from that change, while we're increasing the current burden.

Member Author

I am currently doing a POC with both options in order to get a more detailed feeling of what either one would look like and what we'd get -- I'll report back with my findings.

Comment on lines +618 to +622
##### Open Question: Compatibility across multiple Kubernetes API Servers

In the current state of HyperShift and the CPO v2, the CPO appears to be baked into the payload, which means that the kube-apiserver version should be on par with the CPO
within a payload. However, discussions have surfaced evidence that this is not always true, and that we might still need to maintain compatibility with multiple
kube-apiserver versions. We need to clarify this before proceeding with decisions on how to manage the API types going forward.

Contributor

I think some of the concern here may stem from the validation we added to prevent invalid KAS configurations where possible since HCP kind of eats KAS rollout failures IIRC.

More specifically: https://github.com/openshift/hypershift/blob/2d2b2d0805d36dcf401fdb5f3d913b9f7984ce42/support/validations/authentication.go#L51-L75

Maybe we ought to try to resolve that TODO in there to help have a more consistent validation pattern with the desired OCP installation version rather than always assuming the lowest possible version.

Additionally, the oauth-apiserver introduces its own `AuthenticationConfiguration` type (`externaloidc/apis/authentication/v1alpha1`) for its `external-oidc` mode config
file format, which extends the upstream structure with `externalClaimsSources` and other structural differences (pointer fields, `omitempty` on `JWT`).

##### Open Question: Replacing the internal copy with the upstream API

Contributor

If this isn't causing any problems with the existing approach, should we just ignore this for now?

I think what we really need to be concerned about is potentially being OCP-version aware in our generation logic to prevent generating the oauth-apiserver config when we should be generating the KAS config instead.
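One way to picture the version-aware generation mentioned here is a small selector that decides which config to generate from the desired OCP version. This is only a sketch under stated assumptions: `configTarget`, `parseMajorMinor`, and the minimum-version parameter are hypothetical, and the version numbers used in `main` are purely illustrative, not the actual version where the oauth-apiserver mode becomes available.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Target names the component whose config should be generated.
type Target string

const (
	TargetKAS            Target = "kube-apiserver"
	TargetOAuthAPIServer Target = "oauth-apiserver"
)

// parseMajorMinor extracts the major and minor numbers from a
// version string such as "4.20.3".
func parseMajorMinor(v string) (int, int, error) {
	parts := strings.SplitN(v, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("version %q is not in major.minor form", v)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("invalid major in %q: %w", v, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return 0, 0, fmt.Errorf("invalid minor in %q: %w", v, err)
	}
	return major, minor, nil
}

// configTarget returns the oauth-apiserver target when the desired
// version is at or above minSupported, and the KAS target otherwise.
// minSupported is an input here because the real threshold is not
// stated in this thread.
func configTarget(desired, minSupported string) (Target, error) {
	dMaj, dMin, err := parseMajorMinor(desired)
	if err != nil {
		return "", err
	}
	mMaj, mMin, err := parseMajorMinor(minSupported)
	if err != nil {
		return "", err
	}
	if dMaj > mMaj || (dMaj == mMaj && dMin >= mMin) {
		return TargetOAuthAPIServer, nil
	}
	return TargetKAS, nil
}

func main() {
	// Illustrative versions only.
	for _, v := range []string{"4.20.3", "4.21.0", "5.0.0"} {
		t, err := configTarget(v, "4.21")
		fmt.Println(v, t, err)
	}
}
```

Keeping the threshold as an input means validation and generation can share the same comparison instead of each assuming the lowest possible version.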

Comment on lines +631 to +634
To simplify code maintenance and feature parity between the two topologies, we could extract the generators used in CAO to a library and reuse them in HyperShift;
both consume the same input (`configv1.Authentication`), both would produce `runtime.Object` through the same interface, and if we were to replace the internal type
with upstream `v1beta1`, there's no type mismatch on the kubeapiserver generator output. The oauth-apiserver generator produces the oauth-apiserver's own `AuthenticationConfiguration` type,
which both topologies would need to depend on regardless.

Contributor

I'm +1 for re-use of existing logic and extracting to a library where we can.

which both topologies would need to depend on regardless.

To make this work we would need to do some minor refactoring/abstractions to the generators before extracting to a lib, such as CA cert & secret resolution and feature gate checks.
Feature gate checking also needs abstraction, as CAO and HyperShift use different feature gate interfaces and may not share the same gate names for equivalent functionality.
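To make the shared-library idea concrete, here is a minimal Go sketch of what such an abstraction could look like. It is an assumption-laden illustration, not the CAO or HyperShift code: `Authentication` and `Object` are reduced stand-ins for `configv1.Authentication` and `runtime.Object`, and `FeatureGates`, `Generator`, `staticGates`, and the gate name `ExampleExternalClaimsSources` are all made up for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// Authentication is a reduced stand-in for configv1.Authentication.
type Authentication struct {
	Type       string
	IssuerURLs []string
}

// Object is a stand-in for runtime.Object.
type Object interface {
	Kind() string
}

// FeatureGates hides the topology-specific feature gate interface,
// so CAO and HyperShift can each plug in their own implementation.
type FeatureGates interface {
	Enabled(name string) bool
}

// Generator is the shared interface both topologies would call.
type Generator interface {
	Generate(auth Authentication, gates FeatureGates) (Object, error)
}

// AuthenticationConfiguration is a simplified placeholder output type.
type AuthenticationConfiguration struct {
	IssuerURLs            []string
	ExternalClaimsSources bool // placeholder for a gated field
}

func (AuthenticationConfiguration) Kind() string { return "AuthenticationConfiguration" }

// staticGates is a trivial FeatureGates implementation for the example.
type staticGates map[string]bool

func (g staticGates) Enabled(name string) bool { return g[name] }

// exampleGate is a made-up gate name used only in this sketch.
const exampleGate = "ExampleExternalClaimsSources"

type oauthAPIServerGenerator struct{}

// Generate builds the config only for OIDC and sets the gated field
// according to whatever FeatureGates implementation the caller passes.
func (oauthAPIServerGenerator) Generate(auth Authentication, gates FeatureGates) (Object, error) {
	if auth.Type != "OIDC" {
		return nil, errors.New("authentication type is not OIDC")
	}
	return AuthenticationConfiguration{
		IssuerURLs:            auth.IssuerURLs,
		ExternalClaimsSources: gates.Enabled(exampleGate),
	}, nil
}

func main() {
	var gen Generator = oauthAPIServerGenerator{}
	auth := Authentication{Type: "OIDC", IssuerURLs: []string{"https://issuer.example.com"}}
	obj, err := gen.Generate(auth, staticGates{exampleGate: true})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(obj.Kind())
}
```

With this shape, the generator never imports either topology's feature gate package; each caller adapts its own gate source to the small `FeatureGates` interface.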

Contributor

It's true that gate names could differ, but in practice they shouldn't. I think there is a desire to eventually align HyperShift's feature-gating behavior with standalone and actually use the o/api defined gate states.

That being said, we can ensure we only ever use the exact same gate names here.

Member Author

I agree overall, and indeed currently there's no such divergence -- it just felt more correct from a design perspective to make no such assumption for specific feature gates that control generation of specific fields, especially given that there's no actual link between HCP's and OCP's feature gate definitions.

I'll transform this to a note more in the style of a requirement rather than something we'll need to bake in 👍

Comment on lines +638 to +640
There is also a difference in how HyperShift and Standalone validate the generated configuration. CAO does inline validation during generation (CEL expression compilation, `email_verified` enforcement
when `claims.email` is used in username, SA issuer URL overlap check, CA cert reachability via HTTP to OIDC discovery endpoint), while HyperShift defers to
upstream `ValidateAuthenticationConfiguration` which does not cover these checks. If generators move to a shared lib, these validations should follow so both topologies benefit.
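For illustration, two of the checks listed above could live in a shared package roughly as follows. This is a simplified sketch, not the CAO implementation: the function names are hypothetical, and the `email_verified` check uses plain substring matching on the expression text, whereas the text above describes CAO compiling the CEL expressions.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// usesEmailClaim reports whether expr references claims.email itself,
// ignoring references to claims.email_verified.
func usesEmailClaim(expr string) bool {
	withoutVerified := strings.ReplaceAll(expr, "claims.email_verified", "")
	return strings.Contains(withoutVerified, "claims.email")
}

// validateEmailVerified requires claims.email_verified to appear in the
// username expression or in a validation rule whenever the username
// expression uses claims.email.
func validateEmailVerified(usernameExpr string, validationRuleExprs []string) error {
	if !usesEmailClaim(usernameExpr) {
		return nil
	}
	for _, e := range append([]string{usernameExpr}, validationRuleExprs...) {
		if strings.Contains(e, "claims.email_verified") {
			return nil
		}
	}
	return errors.New("username uses claims.email but claims.email_verified is never checked")
}

// validateIssuerOverlap rejects an OIDC issuer URL that is identical to
// one of the service account issuers.
func validateIssuerOverlap(issuerURL string, serviceAccountIssuers []string) error {
	for _, sa := range serviceAccountIssuers {
		if issuerURL == sa {
			return fmt.Errorf("issuer URL %q overlaps with a service account issuer", issuerURL)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateEmailVerified("claims.email", nil))
	fmt.Println(validateEmailVerified("claims.email", []string{"claims.email_verified == true"}))
	fmt.Println(validateIssuerOverlap("https://issuer.example.com", []string{"https://kubernetes.default.svc"}))
}
```

Because both functions take plain strings and slices, either topology could call them after generation without pulling in the other's controller dependencies.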

Contributor

Where possible, I'd like to get the CAO aligned with HyperShift here and use the "upstream" validations (both in the KAS configuration and the oauth-apiserver configuration side of things - these are "upstream" of the $things configuring them).

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

4 participants