OCPBUGS-87889: fall back to kube-system/global-pull-secret for Insights token #1302

Merged
openshift-merge-bot[bot] merged 2 commits into openshift:master from judexzhu:feat/global-pull-secret-fallback on Jun 10, 2026
Conversation

@judexzhu

@judexzhu judexzhu commented Jun 7, 2026

Contributor

Summary

On ARO HCP clusters, openshift-config/pull-secret only contains the ACR registry credential — no cloud.openshift.com token. Customers add their Red Hat pull secret (including cloud.openshift.com) day-2 via kube-system/additional-pull-secret, which HCCO merges into kube-system/global-pull-secret. The Insights Operator currently only checks openshift-config/pull-secret, so it reports NoToken / GatheringDisabled even though the token exists on the cluster.
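To make the lookup concrete, here is a minimal sketch of reading a cloud.openshift.com entry from a pull secret's .dockerconfigjson payload. It relies only on the standard dockerconfigjson layout (auths, then registry, then auth). The helper name tokenFromDockerConfig, the registry example.azurecr.io, and the sample values are illustrative assumptions, not the operator's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dockerConfig mirrors the standard .dockerconfigjson layout.
type dockerConfig struct {
	Auths map[string]struct {
		Auth string `json:"auth"`
	} `json:"auths"`
}

// tokenFromDockerConfig is a hypothetical helper: it returns the auth value
// stored for registry, or "" when the payload has no entry for it.
func tokenFromDockerConfig(raw []byte, registry string) (string, error) {
	var cfg dockerConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return "", fmt.Errorf("unable to parse pull secret: %w", err)
	}
	return cfg.Auths[registry].Auth, nil
}

func main() {
	// Illustrative payloads only: an ACR-only secret and a merged secret.
	samples := []struct {
		name string
		raw  string
	}{
		{"acr-only", `{"auths":{"example.azurecr.io":{"auth":"YWNy"}}}`},
		{"merged", `{"auths":{"example.azurecr.io":{"auth":"YWNy"},"cloud.openshift.com":{"auth":"dG9rZW4="}}}`},
	}
	for _, s := range samples {
		token, err := tokenFromDockerConfig([]byte(s.raw), "cloud.openshift.com")
		fmt.Printf("%s: token present=%t err=%v\n", s.name, token != "", err)
	}
}
```

In this sketch the ACR-only payload yields an empty token, which corresponds to the NoToken situation described above.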

This change makes updateToken() check kube-system/global-pull-secret as a read-only fallback when openshift-config/pull-secret has no cloud.openshift.com token.

Changes:

  • Generalize fetchSecret() to accept a namespace parameter
  • Add fallback lookup to kube-system/global-pull-secret in updateToken()
  • Add read-only RBAC (get only, no update/patch) for global-pull-secret in kube-system
  • Include namespace in fetchSecret log/error messages for debuggability
  • Add tests for fallback and primary-wins-over-fallback precedence

Behavior:

  • On standard OpenShift clusters: no change — openshift-config/pull-secret has the token, fallback is never reached
  • On ARO HCP clusters: finds the token in kube-system/global-pull-secret when primary lacks cloud.openshift.com
  • global-pull-secret is read-only — the operator does not manage, write, or claim ownership of this secret
  • If global-pull-secret doesn't exist (NotFound) or is inaccessible (Forbidden), the fallback is silently skipped
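The behavior listed above can be sketched roughly as follows. This is a simplified, standard-library-only illustration and not the code in this PR: the sentinel errors stand in for the Kubernetes NotFound and Forbidden API errors, and the names secretKey, tokenFetcher, fetchOptional, and resolveToken are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Sentinel errors standing in for the Kubernetes NotFound and Forbidden
// API errors (an assumption made to keep the sketch self-contained).
var (
	errNotFound  = errors.New("not found")
	errForbidden = errors.New("forbidden")
)

// secretKey identifies a secret by namespace and name.
type secretKey struct{ namespace, name string }

var (
	primaryPullSecret = secretKey{"openshift-config", "pull-secret"}
	globalPullSecret  = secretKey{"kube-system", "global-pull-secret"}
)

// tokenFetcher is a hypothetical lookup that returns the cloud.openshift.com
// token stored in a secret, or "" when the secret has no such entry.
type tokenFetcher func(key secretKey) (string, error)

// fetchOptional treats a missing or inaccessible secret as "no token".
func fetchOptional(fetch tokenFetcher, key secretKey) (string, error) {
	token, err := fetch(key)
	if errors.Is(err, errNotFound) || errors.Is(err, errForbidden) {
		return "", nil
	}
	if err != nil {
		return "", fmt.Errorf("could not check %s/%s: %w", key.namespace, key.name, err)
	}
	return token, nil
}

// resolveToken consults the global pull secret only when the primary
// secret yields no token, so the primary always takes precedence.
func resolveToken(fetch tokenFetcher) (string, error) {
	token, err := fetchOptional(fetch, primaryPullSecret)
	if err != nil || token != "" {
		return token, err
	}
	return fetchOptional(fetch, globalPullSecret)
}

func main() {
	// Simulated ARO HCP cluster: only the global pull secret has a token.
	hcp := func(key secretKey) (string, error) {
		if key == globalPullSecret {
			return "hcp-token", nil
		}
		return "", nil
	}
	token, err := resolveToken(hcp)
	fmt.Println(token, err)
}
```

Because the fallback is reached only on an empty primary result, a standard cluster never issues the second lookup in this sketch.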

Test plan

  • make test — all unit tests pass
  • make lint — 0 issues
  • New test: fallback finds token from kube-system/global-pull-secret when primary lacks cloud.openshift.com
  • New test: primary openshift-config/pull-secret takes precedence when both have cloud.openshift.com
  • Verify on ARO HCP test cluster that operator picks up token from global-pull-secret

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Configuration
    • Added scoped permissions and a retrieval fallback to a global pull secret so reporting can use an alternate pull secret when the primary has no cloud.openshift.com token.
  • Tests
    • Added test cases to validate token selection and priority between the primary pull secret and the global fallback.

On ARO HCP clusters, openshift-config/pull-secret only contains the ACR
registry credential — no cloud.openshift.com token. Customers add their
Red Hat pull secret (including cloud.openshift.com) day-2 via the
additional-pull-secret method, which HCCO merges into
kube-system/global-pull-secret.

This change makes updateToken() check kube-system/global-pull-secret as
a fallback when openshift-config/pull-secret has no cloud.openshift.com
token, enabling Insights reporting on HCP clusters without requiring
platform-level changes.

Changes:
- Generalize fetchSecret() to accept a namespace parameter
- Add fallback lookup to kube-system/global-pull-secret in updateToken()
- Add read-only RBAC (Role+RoleBinding) for global-pull-secret in kube-system
- Include namespace in fetchSecret log/error messages for debuggability
- Add tests for fallback and primary-wins-over-fallback precedence

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 7, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: a203750b-e09c-461b-b138-6cdb5099fe8a

📥 Commits

Reviewing files that changed from the base of the PR and between 2af70f3 and dc50954.

📒 Files selected for processing (1)
  • pkg/config/configobserver/secretconfigobserver.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/config/configobserver/secretconfigobserver.go

📝 Walkthrough

Walkthrough

The PR adds fallback token resolution: the insights-operator reads the global pull secret from kube-system when the primary pull secret yields no cloud.openshift.com token. RBAC permissions are added, secret fetching is generalized to support namespace-aware lookups, token fallback logic is implemented, and token selection is tested for both fallback and precedence scenarios.

Changes

Global Pull Secret Fallback

Layer (File): Summary

  • RBAC role and binding for global pull secret access (manifests/03-clusterrole.yaml): New Role insights-operator-pull-secret in kube-system grants get permission on the global-pull-secret Secret, with a matching RoleBinding that binds it to the openshift-insights/operator ServiceAccount.
  • Secret lookup constants (pkg/config/configobserver/secretconfigobserver.go): Adds constants for pull/support secret namespaces and names, including kube-system/global-pull-secret.
  • Generalized secret fetching (pkg/config/configobserver/secretconfigobserver.go): fetchSecret refactored to accept namespace and name; logs and errors include namespace/name context.
  • Primary pull-secret lookup in updateToken (pkg/config/configobserver/secretconfigobserver.go): updateToken now fetches openshift-config/pull-secret via the new fetchSecret.
  • Global fallback in updateToken (pkg/config/configobserver/secretconfigobserver.go): If the primary pull-secret yields no token, updateToken fetches kube-system/global-pull-secret and sets nextConfig.Token and nextConfig.Report when a token is found.
  • Support secret lookup in updateConfig (pkg/config/configobserver/secretconfigobserver.go): updateConfig updated to fetch openshift-config/support using the generalized fetchSecret.
  • Token selection test coverage (pkg/config/configobserver/secretconfigobserver_test.go): Adds globalPullSecretKey constant and two test cases verifying fallback and precedence between primary and global pull secrets.

Sequence Diagram

sequenceDiagram
  participant updateToken
  participant fetchSecret
  participant nextConfig
  updateToken->>fetchSecret: fetch openshift-config/pull-secret
  fetchSecret-->>updateToken: token or empty
  alt Token found
    updateToken->>nextConfig: set Token, enable Report
  else Token empty
    updateToken->>fetchSecret: fetch kube-system/global-pull-secret
    fetchSecret-->>updateToken: token or empty
    alt Global token found
      updateToken->>nextConfig: set Token, enable Report
    else Global token empty
      updateToken->>nextConfig: leave Token empty
    end
  end

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5 | ❌ 10

❌ Failed checks (10 inconclusive)

All 10 checks below have status ❓ Inconclusive, with the same explanation and resolution.

Explanation: Repository clone failed, so this custom check could not run with code access.
Resolution: Retry the review run. If this persists, inspect pre-merge custom-check logs for infrastructure or agent runtime failures.

  • Stable And Deterministic Test Names
  • Test Structure And Quality
  • Microshift Test Compatibility
  • Single Node Openshift (Sno) Test Compatibility
  • Topology-Aware Scheduling Compatibility
  • Ote Binary Stdout Contract
  • Ipv6 And Disconnected Network Test Compatibility
  • No-Weak-Crypto
  • Container-Privileges
  • No-Sensitive-Data-In-Logs

✅ Passed checks (5 passed)
  • Description Check (✅ Passed): Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check (✅ Passed): The title accurately describes the main change: adding fallback logic to check kube-system/global-pull-secret for the Insights token when openshift-config/pull-secret is unavailable.
  • Docstring Coverage (✅ Passed): Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
  • Linked Issues check (✅ Passed): Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check (✅ Passed): Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Warning

Tools execution failed with the following error:

Failed to run tools: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error)



@openshift-ci openshift-ci Bot requested review from katushiik11 and ncaak June 7, 2026 21:24
@judexzhu

judexzhu commented Jun 8, 2026

Contributor Author

/retest-required

@opokornyy

Contributor

/retest

@opokornyy

Contributor

/cc

@openshift-ci openshift-ci Bot requested a review from opokornyy June 8, 2026 15:06
@judexzhu

judexzhu commented Jun 8, 2026

Contributor Author

It seems the failed job is not related to my PR change?

identical failure: Got an unexpected keyword argument 'watch' to method read_namespaced_pod_log and TypeError: 'NoneType' object is not iterable

@opokornyy

Contributor

It seems the failed job is not related to my PR change?

identical failure: Got an unexpected keyword argument 'watch' to method read_namespaced_pod_log and TypeError: 'NoneType' object is not iterable

Yeah, there is an issue in the Python Kubernetes client that our tests use (kubernetes-client/python#2610). I will override the test failure.

 func (c *Controller) updateToken(ctx context.Context) error {
 	klog.V(2).Infof("Refreshing configuration from cluster pull secret")
-	secret, err := c.fetchSecret(ctx, "pull-secret")
+	secret, err := c.fetchSecret(ctx, "openshift-config", "pull-secret")

Contributor

Could you make the namespace names and secret names constants?

@judexzhu judexzhu Jun 9, 2026

Contributor Author

Please take another look. Thank you.
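For readers following this thread, the request amounts to something like the sketch below. The constant names and the describeSecret helper are hypothetical (the actual identifiers chosen in the PR are not shown on this page); only the namespace and secret values come from the discussion above.

```go
package main

import "fmt"

// Hypothetical constant names; the values are the namespaces and secret
// names mentioned in this PR.
const (
	pullSecretNamespace       = "openshift-config"
	pullSecretName            = "pull-secret"
	globalPullSecretNamespace = "kube-system"
	globalPullSecretName      = "global-pull-secret"
)

// describeSecret formats a namespace/name pair for log and error messages,
// so a message says which of the two secrets it refers to.
func describeSecret(namespace, name string) string {
	return fmt.Sprintf("%s/%s", namespace, name)
}

func main() {
	fmt.Printf("could not check %s\n", describeSecret(pullSecretNamespace, pullSecretName))
	fmt.Printf("could not check %s\n", describeSecret(globalPullSecretNamespace, globalPullSecretName))
}
```

Named constants keep the primary and fallback lookups from drifting apart if a secret name is ever referenced in more than one place.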

@opokornyy

Contributor

/retitle OCPBUGS-87889: fall back to kube-system/global-pull-secret for Insights token

@openshift-ci openshift-ci Bot changed the title from "feat: fall back to kube-system/global-pull-secret for Insights token" to "OCPBUGS-87889: fall back to kube-system/global-pull-secret for Insights token" Jun 9, 2026
@openshift-ci-robot openshift-ci-robot added the labels jira/valid-reference (indicates that this PR references a valid Jira ticket of any type) and jira/valid-bug (indicates that a referenced Jira bug is valid for the branch this PR is targeting) Jun 9, 2026
@openshift-ci-robot

Contributor

@judexzhu: This pull request references Jira Issue OCPBUGS-87889, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Addresses review feedback to replace hardcoded string literals with
named constants for better readability and maintainability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@openshift-ci-robot

Contributor

@judexzhu: This pull request references Jira Issue OCPBUGS-87889, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

@judexzhu

judexzhu commented Jun 9, 2026

Contributor Author

/test insights-operator-serial-tests

@opokornyy

Contributor

/override ci/prow/insights-operator-e2e-tests

@openshift-ci

openshift-ci Bot commented Jun 10, 2026

@opokornyy: Overrode contexts on behalf of opokornyy: ci/prow/insights-operator-e2e-tests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci

openshift-ci Bot commented Jun 10, 2026

@judexzhu: all tests passed!

@opokornyy

Contributor

/lgtm

@opokornyy

Contributor

/approve

@openshift-ci openshift-ci Bot added the lgtm label (indicates that a PR is ready to be merged) Jun 10, 2026
@openshift-ci

openshift-ci Bot commented Jun 10, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: judexzhu, opokornyy

@openshift-ci openshift-ci Bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Jun 10, 2026
@opokornyy

Contributor

/verified by @opokornyy

@openshift-ci-robot openshift-ci-robot added the verified label (signifies that the PR passed pre-merge verification criteria) Jun 10, 2026
@openshift-ci-robot

Contributor

@opokornyy: This PR has been marked as verified by @opokornyy.

@openshift-merge-bot openshift-merge-bot Bot merged commit 314becc into openshift:master Jun 10, 2026
13 checks passed
@openshift-ci-robot

Contributor

@judexzhu: Jira Issue Verification Checks: Jira Issue OCPBUGS-87889
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-87889 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓

@opokornyy

Contributor

/cherry-pick release-4.22

@openshift-cherrypick-robot

@opokornyy: new pull request created: #1305

@openshift-merge-robot

Contributor

Fix included in release 5.0.0-0.nightly-2026-06-12-141614

@opokornyy

Contributor

/cherry-pick release-4.21

@openshift-cherrypick-robot

@opokornyy: new pull request created: #1310

@opokornyy

Contributor

/cherry-pick release-4.20

@openshift-cherrypick-robot

@opokornyy: new pull request created: #1311
