Bug OCPBUGS-94106: Fall back to TLSProfileIntermediateType ciphers when observedConfig is empty during bootstrap#1645
Conversation
…en observedConfig is empty during bootstrap During bootstrap, the config observation controller hasn't converged yet, so observedConfig.servingInfo.cipherSuites and minTLSVersion are empty. This caused getCipherSuites() and getObservedTLSMinVersion() to hard-error, preventing the etcd static pod from rendering. Fall back to TLSProfileIntermediateType defaults (matching the render path in pkg/cmd/render/env.go:getTLSCipherSuites) when the observed values are empty, and log a warning so the fallback is auditable. The fallback only triggers when the config is genuinely empty — observedConfig with unrecognized ciphers still errors as before. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
@redhat-chai-bot: This pull request references Jira Issue OCPBUGS-94106, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughAdds bootstrap-time fallback logic in Bootstrap TLS Fallback
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 14 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (14 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.12.2)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/etcdenvvar/etcd_env_test.go (1)
14-72: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd direct coverage for the empty-
minTLSVersionfallback.The PR also adds a fallback in
getObservedTLSMinVersion(empty → TLS 1.2), but no case assertsETCD_TLS_MIN_VERSION. The empty-observedConfigcase exercises it only indirectly viagetCipherSuitesand never checks the resulting min-version value. Consider a table case (or a sibling test ongetTLSMinVersion) asserting the TLS 1.2 fallback output.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/etcdenvvar/etcd_env_test.go` around lines 14 - 72, Add direct test coverage for the empty min-TLS-version fallback in getObservedTLSMinVersion / getTLSMinVersion, since the current TestGetCipherSuites only exercises it indirectly. Extend the existing table or add a sibling test that explicitly asserts the ETCD_TLS_MIN_VERSION output is TLS 1.2 when observedConfig has no minTLSVersion, while keeping the current cipher-suite cases unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@pkg/etcdenvvar/etcd_env_test.go`:
- Around line 14-72: Add direct test coverage for the empty min-TLS-version
fallback in getObservedTLSMinVersion / getTLSMinVersion, since the current
TestGetCipherSuites only exercises it indirectly. Extend the existing table or
add a sibling test that explicitly asserts the ETCD_TLS_MIN_VERSION output is
TLS 1.2 when observedConfig has no minTLSVersion, while keeping the current
cipher-suite cases unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: a98696eb-47b7-484e-8268-f78b7e1055c7
📒 Files selected for processing (2)
pkg/etcdenvvar/etcd_env.gopkg/etcdenvvar/etcd_env_test.go
|
@redhat-chai-bot: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| // During bootstrap the config observation controller hasn't converged yet, | ||
| // so observedConfig.servingInfo.minTLSVersion will be empty. Fall back to | ||
| // TLSProfileIntermediateType defaults (TLS 1.2). | ||
| if observedMinTLSVersion == "" { |
There was a problem hiding this comment.
This check isn't really necessary, crypto.TLSVersion returns TLSVersion12 if the input is empty: https://github.com/openshift/library-go/blob/257053230f0be3e7325f38dc00d98147c7f77e16/pkg/crypto/crypto.go#L86
Summary
During bootstrap,
observedConfigis initially empty because the config observation controller hasn't converged yet. ThegetCipherSuites()function inpkg/etcdenvvar/etcd_env.gohard-errors whenobservedConfig.servingInfo.cipherSuitesis empty, causing the EnvVarController to report Degraded. This blocks the InstallerController from creating the first etcd static pod revision, and if the config observer doesn't converge fast enough, bootstrap times out with zero control-plane etcd members started.This has been observed as an intermittent (~19%) bootstrap failure in OCP 5.0 nightly CI since June 25, 2026.
Fixes: https://redhat.atlassian.net/browse/OCPBUGS-94106
Changes
pkg/etcdenvvar/etcd_env.go(+20, -1):getCipherSuites(): WhenobservedCipherSuitesis empty (config observer hasn't converged), fall back toTLSProfileIntermediateTypecipher suites — the same defaults used by the render/bootstrap path inpkg/cmd/render/env.go:getTLSCipherSuites(). Logs a warning when falling back. If ciphers were present but none were supported by etcd, the function still errors (distinguishing bootstrap-empty from genuinely-bad-config).getObservedTLSMinVersion(): WhenobservedMinTLSVersionis empty, fall back to TLS 1.2 (theTLSProfileIntermediateTypedefault). Logs a warning.pkg/etcdenvvar/etcd_env_test.go(+98):TestGetCipherSuiteswith 3 table-driven test cases: (a) populated observedConfig returns expected ciphers, (b) empty observedConfig falls back to IntermediateType defaults, (c) observedConfig with unsupported ciphers still errors.Security Analysis
No security risk:
SupportedEtcdCiphers(same etcd cipher validation)observedConfigwould eventually containobservedConfigconverges, the normal path takes overEvidence
waiting on condition EtcdRunningInClusterbefore timeoutno supported cipherSuites not found in observedConfig0 nodes are active; 3 nodes are at revision 05.0.0-0.nightly-2026-06-25-122140Summary by CodeRabbit
Bug Fixes
Tests