[WIP] Add taints package#27
Conversation
This package provides shared taint utilities for remediation operators: - TaintExists: Check if a taint exists in a node's taint list - DeleteTaint: Remove taints by key and effect - CreateOutOfServiceTaint: Create out-of-service taint for node shutdown - InitOutOfServiceTaintFlagsWithRetry: Initialize out-of-service taint support flags based on K8s version Includes version detection for: - Out-of-service taint support (K8s 1.26+) - Out-of-service taint GA (K8s 1.28+) Tests cover version parsing with various formats and edge cases. Note: Tests not passing yet due to k8s version incompatibility. wait.PollUntilContextTimeout not available in k8s.io/apimachinery v0.26.3. Options to fix: 1. Use wait.PollWithContext instead (compatible with v0.26.3) 2. Upgrade common to k8s.io/apimachinery v0.33.x to match FAR/SNR
Upgraded k8s dependencies to support modern APIs: - k8s.io/apimachinery v0.26.3 -> v0.33.7 - k8s.io/api v0.26.3 -> v0.33.7 - k8s.io/client-go v0.26.3 -> v0.33.7 - sigs.k8s.io/controller-runtime v0.14.5 -> v0.21.0 This upgrade enables use of wait.PollUntilContextTimeout in taints package and aligns with FAR and SNR k8s versions. Tests status: - pkg/taints: PASS (11/11 tests) - pkg/etcd: PASS (7/7 tests) - pkg/events: PASS (9/9 tests) - pkg/lease: FAIL (10/11 tests) - 1 failing test due to fake client behavior change Known issue: lease test expects Kind field populated by fake client The lease test failure is unrelated to the taints package. Will fix in the next commit.
Remove Kind/APIVersion assertions on Lease objects in testCreateLease. The new fake client (k8s v0.33) doesn't auto-populate TypeMeta fields like the old one (v0.26) did (see kubernetes-sigs/controller-runtime#3229). These fields are not used by the manager logic and are not checked by compareLeases(). The assertions were testing fake client behavior rather than manager functionality. All lease tests now pass (11/11).
Clearer name for a pure function that filters a taint slice. Does not modify the original slice or delete from a node. Each operator can use this function differently based on their needs.
|
Skipping CI for Draft Pull Request. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughGo toolchain updated to version 1.24.0 with comprehensive dependency upgrades across direct and transitive modules. Taint management API refactored with renamed functions and new capabilities for out-of-service taint detection. Test suite added for taint utilities with version-aware feature gating logic. Minor test assertion cleanup in lease manager. ChangesTaints API Refactoring
Go Dependency Upgrade
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
pkg/taints/taints.go (3)
38-60: Nit: inconsistentMatchTaintreceiver direction betweenTaintExistsandFilterOutTaint.
TaintExistscallstaint.MatchTaint(taintToFind)(receiver = list element, arg = needle), whileFilterOutTaintcallstaintToDelete.MatchTaint(&taints[i])(receiver = needle, arg = list element). The behavior is equivalent becauseMatchTaintcomparesKey+Effectsymmetrically, but aligning them makes the code easier to scan.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/taints/taints.go` around lines 38 - 60, The two functions TaintExists and FilterOutTaint use MatchTaint with different receiver/argument order which is inconsistent; change FilterOutTaint to call the same direction as TaintExists (use taints[i].MatchTaint(taintToDelete) instead of taintToDelete.MatchTaint(&taints[i])) so both use the list element as the receiver and the needle as the argument, keeping the comparison logic identical and improving readability.
28-35: Recommended: add unit tests forTaintExists,FilterOutTaint, andCreateOutOfServiceTaint.Only
setOutOfTaintFlagsis currently covered bytaints_test.go. Since this package is meant to be the single shared implementation reused by FAR and SNR (per PR objectives), the public surface — taint matching, slice filtering with the deleted-flag return, and the constructedOutOfServicetaint shape — is exactly what consumers will rely on and what should be locked down by tests. Recommend table-driven cases covering: matching/non-matching keys, matching/non-matching effects, empty input slice, and the no-op delete (deleted == false) path.Want me to draft the additional test cases?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/taints/taints.go` around lines 28 - 35, Add unit tests for the public functions TaintExists, FilterOutTaint, and CreateOutOfServiceTaint: write table-driven tests that exercise matching and non-matching key scenarios and matching/non-matching Effect values for TaintExists (use several taint slices and assert boolean result), tests for FilterOutTaint that cover empty input slice, removal of a matching taint (assert returned slice and deleted==true), and the no-op delete path where no taint matches (assert deleted==false and original slice unchanged), and tests for CreateOutOfServiceTaint that assert the returned taint has the expected Key, Value, Effect and contains the intended leadingDigits behavior if relevant; reference functions TaintExists, FilterOutTaint, CreateOutOfServiceTaint and ensure edge cases (nil/empty slices, different Effect enums) are covered in table cases.
91-107: Optional: flattenif/else if/elsechain into early returns for readability.The nested
if/else if/elsewith multiple shadowederrdeclarations is harder to read than the equivalent early-return form, and avoids re-bindingerrinside if-init scopes.♻️ Suggested refactor
func initOutOfServiceTaintFlags(config *rest.Config) error { - if cs, err := kubernetes.NewForConfig(config); err != nil || cs == nil { - if cs == nil { - err = fmt.Errorf("k8s client set is nil") - } - loggerTaint.Error(err, "couldn't retrieve k8s client") - return err - } else if k8sVersion, err := cs.Discovery().ServerVersion(); err != nil || k8sVersion == nil { - if k8sVersion == nil { - err = fmt.Errorf("k8s server version is nil") - } - loggerTaint.Error(err, "couldn't retrieve k8s server version") - return err - } else { - return setOutOfTaintFlags(k8sVersion) - } + cs, err := kubernetes.NewForConfig(config) + if err != nil || cs == nil { + if err == nil { + err = fmt.Errorf("k8s client set is nil") + } + loggerTaint.Error(err, "couldn't retrieve k8s client") + return err + } + k8sVersion, err := cs.Discovery().ServerVersion() + if err != nil || k8sVersion == nil { + if err == nil { + err = fmt.Errorf("k8s server version is nil") + } + loggerTaint.Error(err, "couldn't retrieve k8s server version") + return err + } + return setOutOfTaintFlags(k8sVersion) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/taints/taints.go` around lines 91 - 107, The initOutOfServiceTaintFlags function uses a nested if/else-if/else chain with shadowed err variables; refactor it to use early returns to improve readability: first call kubernetes.NewForConfig(config) into cs, err, check if err != nil or cs == nil and log via loggerTaint.Error with a clear message (set err = fmt.Errorf("k8s client set is nil") if cs == nil) then return err; next call cs.Discovery().ServerVersion() into k8sVersion, err, check if err != nil or k8sVersion == nil and log similarly (set err = fmt.Errorf("k8s server version is nil") if nil) then return err; finally call and return setOutOfTaintFlags(k8sVersion). Ensure you do not rebind err in nested if-inits so the flow is linear and easier to follow.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/taints/taints.go`:
- Around line 80-89: Update the stale comment to reference the unexported helper
name initOutOfServiceTaintFlags (instead of InitOutOfServiceTaintFlags) and add
a short inline justification next to the PollUntilContextTimeout callback (or a
//nolint:nilerr directive) clarifying that returning (false, nil) is intentional
to keep retrying and that the outer err variable preserves the last failure;
this will fix searchability and silence the nilerr linter false positive for the
closure used by PollUntilContextTimeout.
---
Nitpick comments:
In `@pkg/taints/taints.go`:
- Around line 38-60: The two functions TaintExists and FilterOutTaint use
MatchTaint with different receiver/argument order which is inconsistent; change
FilterOutTaint to call the same direction as TaintExists (use
taints[i].MatchTaint(taintToDelete) instead of
taintToDelete.MatchTaint(&taints[i])) so both use the list element as the
receiver and the needle as the argument, keeping the comparison logic identical
and improving readability.
- Around line 28-35: Add unit tests for the public functions TaintExists,
FilterOutTaint, and CreateOutOfServiceTaint: write table-driven tests that
exercise matching and non-matching key scenarios and matching/non-matching
Effect values for TaintExists (use several taint slices and assert boolean
result), tests for FilterOutTaint that cover empty input slice, removal of a
matching taint (assert returned slice and deleted==true), and the no-op delete
path where no taint matches (assert deleted==false and original slice
unchanged), and tests for CreateOutOfServiceTaint that assert the returned taint
has the expected Key, Value, Effect and contains the intended leadingDigits
behavior if relevant; reference functions TaintExists, FilterOutTaint,
CreateOutOfServiceTaint and ensure edge cases (nil/empty slices, different
Effect enums) are covered in table cases.
- Around line 91-107: The initOutOfServiceTaintFlags function uses a nested
if/else-if/else chain with shadowed err variables; refactor it to use early
returns to improve readability: first call kubernetes.NewForConfig(config) into
cs, err, check if err != nil or cs == nil and log via loggerTaint.Error with a
clear message (set err = fmt.Errorf("k8s client set is nil") if cs == nil) then
return err; next call cs.Discovery().ServerVersion() into k8sVersion, err, check
if err != nil or k8sVersion == nil and log similarly (set err = fmt.Errorf("k8s
server version is nil") if nil) then return err; finally call and return
setOutOfTaintFlags(k8sVersion). Ensure you do not rebind err in nested if-inits
so the flow is linear and easier to follow.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c6bd1bdc-2caa-4e8e-90ee-367a63c023f1
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (4)
go.modpkg/lease/manager_test.gopkg/taints/taints.gopkg/taints/taints_test.go
💤 Files with no reviewable changes (1)
- pkg/lease/manager_test.go
|
we can also use PollWithContext instead of PollUntilContextTimeout to avoid all the k8s upgrades |
razo7
left a comment
There was a problem hiding this comment.
Code Review: Add taints package
Must fix before merge
1. Concurrency hazard — mutable package-level state
IsOutOfServiceTaintSupported and IsOutOfServiceTaintGA are exported mutable package-level vars written by InitOutOfServiceTaintFlagsWithRetry and read by any consumer, with no synchronization. If two controllers in the same process call this, or a reader checks the flag while the retry loop is running, there's a data race.
Use sync.Once or atomic.Bool:
type OutOfServiceTaintInfo struct {
Supported bool
GA bool
}Or at minimum protect with sync.Once since the values are write-once.
2. Missing test coverage
TaintExists, FilterOutTaint, and CreateOutOfServiceTaint are all untested. Since this is a shared library replacing duplicated code from FAR and SNR, comprehensive coverage is essential. Both FAR and SNR presumably had tests for their own implementations — those should migrate here.
3. Nested if/else with shadowed err in initOutOfServiceTaintFlags
The deep if/else if/else chain re-declares err in each if init statement, shadowing the outer variable. While the current code happens to work, it's fragile. Refactor to early returns.
Should fix
4. Discarded PollUntilContextTimeout error
_ = wait.PollUntilContextTimeout(ctx, interval, timeout, true, func(ctx context.Context) (bool, error) {Intentional but undocumented. If the parent context is cancelled, the poll returns ctx.Err(), but you return the last internal err. Add a comment explaining the intent.
5. Client-side TimeAdded in CreateOutOfServiceTaint
Setting TimeAdded at Go function call time, not when the taint is applied to the node. In Kubernetes, TimeAdded is typically set by the API server. Consider removing it and letting the API server handle it.
6. Stale comment
InitOutOfServiceTaintFlagsWithRetry comment says "retry for a limited number of times" but the implementation is time-bound (2s interval, 10s timeout), not count-bound. Update the comment.
7. Inconsistent MatchTaint call direction
TaintExists calls taint.MatchTaint(taintToFind) while FilterOutTaint calls taintToDelete.MatchTaint(&taints[i]). MatchTaint is symmetric on Key+Effect so it works, but the inconsistency hurts readability. Pick one direction.
Notes
- Tests use stdlib
testingwhile the rest of the repo (pkg/etcd,pkg/lease) uses Ginkgo/Gomega — worth aligning for consistency. - The dense test assertion in
setOutOfTaintFlagschecks three conditions in oneifstatement. Breaking into separate assertions would make failures easier to diagnose. - No
suite_test.gofile, unlike other packages in the repo.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/taints/taints.go`:
- Around line 88-104: The current InitOutOfServiceTaintFlagsWithRetry uses
initOnce.Do which permanently prevents retries if the first attempt fails;
change the logic so the "done" marker is only set when
initOutOfServiceTaintFlags succeeds: replace the initOnce.Do block with a
mutex-protected flow (or an atomic boolean) that runs the
wait.PollUntilContextTimeout retry loop and only sets the initialized flag and
initErr on success, leaving initErr unset/updated on transient failures so
subsequent calls to InitOutOfServiceTaintFlagsWithRetry will retry; keep
references to initOutOfServiceTaintFlags, initErr, and initOnce (or rename
initOnce to initializedFlag if you switch types) so the success-only
initialization semantics are clear.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 58a8decc-d29b-4375-bd40-ee84048d196d
📒 Files selected for processing (2)
pkg/taints/taints.gopkg/taints/taints_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/taints/taints_test.go
Replace exported mutable vars with OutOfServiceTaintInfo struct accessed via GetOutOfServiceTaintInfo() getter. Update tests to use struct and follow common repo naming (expectedInfo, expectedErr).
ca54e7b to
f41128c
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/taints/taints.go`:
- Around line 79-80: GetOutOfServiceTaintInfo reads the shared package variable
taintInfo while setOutOfTaintFlags mutates it, causing potential data races; add
a package-level sync.RWMutex (e.g. taintInfoMu) and use
taintInfoMu.RLock()/RUnlock() in GetOutOfServiceTaintInfo and
taintInfoMu.Lock()/Unlock() in setOutOfTaintFlags to synchronize access
(returning the struct copy under the read lock). Ensure all other accesses to
taintInfo are similarly guarded.
- Around line 91-97: Assign and check the error returned by
wait.PollUntilContextTimeout instead of discarding it: capture its return value
(e.g. pollErr := wait.PollUntilContextTimeout(...)), and return that error when
non-nil; if initOutOfServiceTaintFlags left a non-nil initialization error in
the outer err variable, wrap or combine both (e.g. include init error context)
before returning. Ensure you don’t shadow err inside the polling callback and
reference the existing initOutOfServiceTaintFlags and
wait.PollUntilContextTimeout symbols when making the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 544e21ca-fab7-4cec-87d8-72fa650ac40b
📒 Files selected for processing (2)
pkg/taints/taints.gopkg/taints/taints_test.go
✅ Files skipped from review due to trivial changes (1)
- pkg/taints/taints_test.go
Both FAR and SNR call it once in main.go at startup; Single-threaded initialization - no concurrent calls (and, each operator runs in its own pod/process with separate memory - FAR and SNR don't share package-level variables) |
Convert tests to Ginkgo/Gomega for consistency with other packages in common (pkg/etcd, pkg/lease). Add test coverage for TaintExists, FilterOutTaint, and CreateOutOfServiceTaint which were previously untested. Total test coverage: 22 test cases covering all public functions.
Add envtest setup to Makefile following FAR/SNR pattern. The test target now depends on envtest and sets KUBEBUILDER_ASSETS to enable integration testing. Add integration tests that verify InitOutOfServiceTaintFlagsWithRetry works with a real Kubernetes API server, testing both success and error cases. All tests pass (25 total: 22 unit + 3 integration).
Simplify error handling to avoid err shadowing.
Check poll error for context cancellation and return it so caller knows the operation was cancelled. On timeout, return the internal error which is more specific than the generic timeout error. While FAR and SNR currently use context.Background() (which never cancels), as a library we should not rely on that assumption.
TimeAdded should be set when the taint is applied to the node, not when the struct is created. Setting it at creation time results in inaccurate timestamps if there's any delay before patching the node. Both FAR and SNR set TimeAdded at patch time, making the creation-time value incorrect and misleading. Callers should set TimeAdded when they apply the taint.
Adds two new functions for node taint management: - AppendTaintToNode: adds taint to node and patches it, sets TimeAdded - RemoveTaintFromNode: removes taint from node and patches it Both use MergeFrom patch strategy to avoid update conflicts. Includes integration tests using envtest.
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@go.mod`:
- Line 43: The project currently pulls vulnerable github.com/moby/spdystream
v0.5.0 transitively; update the dependency graph by bumping
sigs.k8s.io/controller-runtime to a release that depends on k8s.io/client-go
v0.36.0+ (e.g., set sigs.k8s.io/controller-runtime to v0.17.0 or later in
go.mod), then run go get/sync (e.g., `go get
sigs.k8s.io/controller-runtime@v0.17.0`), followed by go mod tidy and verify
go.sum to ensure spdystream is now v0.5.1; alternatively, if you choose not to
upgrade, add a note and tests proving the spdystream code path is unreachable.
In `@Makefile`:
- Around line 70-79: The Makefile installs tools into a hardcoded
$(PROJECT_DIR)/bin while targets use the configurable LOCALBIN, causing
mismatched check vs install locations; update the tool install invocations so
they install into the same path referenced by the targets (use the per-tool path
variables SORT_IMPORTS and ENVTEST or LOCALBIN) by changing the go-install-tool
calls in the sort-imports and envtest targets to install to $(SORT_IMPORTS) and
$(ENVTEST) (or otherwise ensure the go-install-tool macro uses $(LOCALBIN)
instead of $(PROJECT_DIR)/bin) so checks and installs remain consistent.
- Around line 43-45: The check target uses a dockerized golang:1.20 image while
go.mod declares Go 1.24.0; update the Makefile's GO_VERSION variable (and any
references that construct the docker image tag) from 1.20 to 1.24.0 so the
`check` target (and the container run that executes `make lint test`) uses
golang:1.24.0; ensure any other occurrences of GO_VERSION in the Makefile refer
to the same updated variable so the test container and tooling match the module
toolchain.
In `@pkg/taints/taints_integration_test.go`:
- Around line 44-68: The tests currently call
InitOutOfServiceTaintFlagsWithRetry and then discard the result of
GetOutOfServiceTaintInfo(), which means regressions in taint detection can slip
by; update both specs so after calling GetOutOfServiceTaintInfo() you assert the
returned struct's key flags (e.g., Supported and GA) are true (since envtest is
pinned to k8s v1.33) and also assert the struct is not the zero value; reference
the functions InitOutOfServiceTaintFlagsWithRetry and GetOutOfServiceTaintInfo
to locate the assertions to tighten.
In `@pkg/taints/taints_test.go`:
- Around line 91-97: The test for CreateOutOfServiceTaint is missing an
assertion that the returned taint's TimeAdded is unset; update the spec in
Context("CreateOutOfServiceTaint") / It("should create out-of-service taint with
correct fields") to assert that taint.TimeAdded is nil (or zero/unset per the
type used) after calling CreateOutOfServiceTaint(), keeping the other Expect
checks for Key, Value, and Effect intact so regressions setting TimeAdded will
fail the test.
In `@pkg/taints/taints.go`:
- Around line 81-111: AppendTaintToNode and RemoveTaintFromNode currently build
merge patches from a potentially stale node and can clobber concurrent taint
updates; fix both by re-fetching the latest Node before mutating, capture a deep
copy as "stored" before changing node.Spec.Taints, then call client.Patch with
client.MergeFromWithOptions(stored, client.MergeFromWithOptimisticLock{}) so the
patch includes metadata.resourceVersion and will fail on concurrent
modifications; if Patch returns a conflict error, return that error so the
caller/controller can requeue and reconcile with a fresh fetch.
- Around line 87-89: The function that appends taints currently sets
taint.TimeAdded unconditionally; change AppendTaintToNode (or the code block
that does now := metav1.Now(); taint.TimeAdded = &now; node.Spec.Taints =
append(...)) to only set TimeAdded when taint.Effect ==
corev1.TaintEffectNoExecute, leaving TimeAdded nil for NoSchedule and
PreferNoSchedule; locate the taint append logic in pkg/taints/taints.go and wrap
the timestamp assignment in an if checking corev1.TaintEffectNoExecute, then
append the taint as before and adjust related tests that currently expect
TimeAdded for NoSchedule.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 90bff7ce-2a72-4752-b1f4-5566a0b515d5
📒 Files selected for processing (6)
Makefilego.modpkg/taints/taints.gopkg/taints/taints_integration_test.gopkg/taints/taints_suite_test.gopkg/taints/taints_test.go
| github.com/matttproud/golang_protobuf_extensions v1.0.2 // indirect | ||
| github.com/moby/spdystream v0.2.0 // indirect | ||
| github.com/mailru/easyjson v0.7.7 // indirect | ||
| github.com/moby/spdystream v0.5.0 // indirect |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Is github.com/moby/spdystream v0.5.0 affected by GHSA-pc3f-x583-g7j2, and which k8s.io/client-go or sigs.k8s.io/controller-runtime release removes or patches that transitive dependency?
💡 Result:
Yes, github.com/moby/spdystream v0.5.0 is affected by GHSA-pc3f-x583-g7j2, as affected versions are <= v0.5.0 and v0.5.1 patches it. k8s.io/client-go versions prior to v0.36.0 depend on spdystream v0.5.0 transitively (via k8s.io/apimachinery/pkg/util/httpstream/spdy or k8s.io/streaming), so they are affected. k8s.io/client-go v0.36.0 uses spdystream v0.5.1. sigs.k8s.io/controller-runtime depends on k8s.io/client-go; versions aligned with client-go v0.36.0+ (likely v0.17.0+, check compatibility table) will patch it transitively. No direct spdystream dep in controller-runtime go.mod; fix comes via client-go upgrade.
Citations:
- 1: GHSA-pc3f-x583-g7j2
- 2: https://github.com/moby/spdystream/releases/tag/v0.5.1
- 3: https://github.com/kubernetes/client-go/network/dependencies
- 4: kubernetes/client-go@v0.35.4...v0.36.0
- 5: https://proxy.golang.org/k8s.io/client-go/@v/v0.35.2.mod
- 6: kubernetes/apimachinery@d966e56
- 7: kubernetes/client-go@1d95f02
- 8: Update github.com/moby/spdystream from v0.5.0 to v0.5.1 kubernetes/kubernetes#138354
- 9: https://pkg.go.dev/k8s.io/client-go@v0.36.0
Address known-vulnerable moby/spdystream v0.5.0 in the transitive dependency graph.
This version is affected by GHSA-pc3f-x583-g7j2 (DoS advisory). The fix is available in spdystream v0.5.1, which is included in k8s.io/client-go v0.36.0+. Upgrade sigs.k8s.io/controller-runtime to a version aligned with k8s.io/client-go v0.36.0 or later (likely v0.17.0+) to transitively pull the patched version. Alternatively, verify that the affected code path in spdystream is unreachable in your application before merge.
🧰 Tools
🪛 OSV Scanner (2.3.5)
[HIGH] 43-43: github.com/moby/spdystream 0.5.0: SpdyStream: DOS on CRI
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@go.mod` at line 43, The project currently pulls vulnerable
github.com/moby/spdystream v0.5.0 transitively; update the dependency graph by
bumping sigs.k8s.io/controller-runtime to a release that depends on
k8s.io/client-go v0.36.0+ (e.g., set sigs.k8s.io/controller-runtime to v0.17.0
or later in go.mod), then run go get/sync (e.g., `go get
sigs.k8s.io/controller-runtime@v0.17.0`), followed by go mod tidy and verify
go.sum to ensure spdystream is now v0.5.1; alternatively, if you choose not to
upgrade, add a note and tests proving the spdystream code path is unreachable.
| test: envtest ## Run tests. | ||
| KUBEBUILDER_ASSETS="$(shell $(ENVTEST) use $(ENVTEST_K8S_VERSION) --bin-dir $(LOCALBIN)/k8s/$(ENVTEST_K8S_VERSION) -p path)" \ | ||
| go test ./... -coverprofile cover.out -v |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Find and read the go.mod file to check the Go version requirement
fd -t f 'go.mod$' -x cat {}
echo "---"
# Find and read the Makefile to check GO_VERSION and the check/test targets
fd -t f 'Makefile$' -x head -100 {}Repository: medik8s/common
Length of output: 6694
make check runs tests in Go 1.20 but go.mod requires Go 1.24.0.
The dockerized check target runs make lint test inside a golang:1.20 container, which will fail if the code uses Go 1.24.0 features. Update GO_VERSION to match the module's declared toolchain.
🛠️ Fix
-GO_VERSION = 1.20
+GO_VERSION = 1.24.0🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@Makefile` around lines 43 - 45, The check target uses a dockerized
golang:1.20 image while go.mod declares Go 1.24.0; update the Makefile's
GO_VERSION variable (and any references that construct the docker image tag)
from 1.20 to 1.24.0 so the `check` target (and the container run that executes
`make lint test`) uses golang:1.24.0; ensure any other occurrences of GO_VERSION
in the Makefile refer to the same updated variable so the test container and
tooling match the module toolchain.
| It("should successfully detect k8s version and set taint info", func() { | ||
| ctx := context.Background() | ||
| err := InitOutOfServiceTaintFlagsWithRetry(ctx, cfg) | ||
| Expect(err).NotTo(HaveOccurred()) | ||
|
|
||
| // envtest typically runs a recent k8s version (>= 1.26) | ||
| // so we expect out-of-service taint to be supported | ||
| info := GetOutOfServiceTaintInfo() | ||
| // We can't assert exact values since envtest version varies | ||
| // but we can verify the function executed without error | ||
| _ = info | ||
| }) | ||
|
|
||
| It("should populate taintInfo correctly", func() { | ||
| ctx := context.Background() | ||
| err := InitOutOfServiceTaintFlagsWithRetry(ctx, cfg) | ||
| Expect(err).NotTo(HaveOccurred()) | ||
|
|
||
| // Verify taintInfo was populated (not zero value) | ||
| info := GetOutOfServiceTaintInfo() | ||
| // Since we can't know the exact envtest version, just verify | ||
| // the function completed and populated something | ||
| By("verifying taintInfo was populated") | ||
| _ = info | ||
| }) |
There was a problem hiding this comment.
These specs don't currently validate the flag values.
Both cases only assert "no error" and then discard info, so a regression in GetOutOfServiceTaintInfo() would still pass. Since this PR’s Makefile pins envtest to Kubernetes 1.33, these tests can assert Supported == true and GA == true directly.
🔍 Tighten the assertions
- info := GetOutOfServiceTaintInfo()
- // We can't assert exact values since envtest version varies
- // but we can verify the function executed without error
- _ = info
+ info := GetOutOfServiceTaintInfo()
+ Expect(info.Supported).To(BeTrue())
+ Expect(info.GA).To(BeTrue())
@@
- info := GetOutOfServiceTaintInfo()
- // Since we can't know the exact envtest version, just verify
- // the function completed and populated something
- By("verifying taintInfo was populated")
- _ = info
+ info := GetOutOfServiceTaintInfo()
+ By("verifying taintInfo was populated")
+ Expect(info.Supported).To(BeTrue())
+ Expect(info.GA).To(BeTrue())📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| It("should successfully detect k8s version and set taint info", func() { | |
| ctx := context.Background() | |
| err := InitOutOfServiceTaintFlagsWithRetry(ctx, cfg) | |
| Expect(err).NotTo(HaveOccurred()) | |
| // envtest typically runs a recent k8s version (>= 1.26) | |
| // so we expect out-of-service taint to be supported | |
| info := GetOutOfServiceTaintInfo() | |
| // We can't assert exact values since envtest version varies | |
| // but we can verify the function executed without error | |
| _ = info | |
| }) | |
| It("should populate taintInfo correctly", func() { | |
| ctx := context.Background() | |
| err := InitOutOfServiceTaintFlagsWithRetry(ctx, cfg) | |
| Expect(err).NotTo(HaveOccurred()) | |
| // Verify taintInfo was populated (not zero value) | |
| info := GetOutOfServiceTaintInfo() | |
| // Since we can't know the exact envtest version, just verify | |
| // the function completed and populated something | |
| By("verifying taintInfo was populated") | |
| _ = info | |
| }) | |
| It("should successfully detect k8s version and set taint info", func() { | |
| ctx := context.Background() | |
| err := InitOutOfServiceTaintFlagsWithRetry(ctx, cfg) | |
| Expect(err).NotTo(HaveOccurred()) | |
| // envtest typically runs a recent k8s version (>= 1.26) | |
| // so we expect out-of-service taint to be supported | |
| info := GetOutOfServiceTaintInfo() | |
| Expect(info.Supported).To(BeTrue()) | |
| Expect(info.GA).To(BeTrue()) | |
| }) | |
| It("should populate taintInfo correctly", func() { | |
| ctx := context.Background() | |
| err := InitOutOfServiceTaintFlagsWithRetry(ctx, cfg) | |
| Expect(err).NotTo(HaveOccurred()) | |
| // Verify taintInfo was populated (not zero value) | |
| info := GetOutOfServiceTaintInfo() | |
| By("verifying taintInfo was populated") | |
| Expect(info.Supported).To(BeTrue()) | |
| Expect(info.GA).To(BeTrue()) | |
| }) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/taints/taints_integration_test.go` around lines 44 - 68, The tests
currently call InitOutOfServiceTaintFlagsWithRetry and then discard the result
of GetOutOfServiceTaintInfo(), which means regressions in taint detection can
slip by; update both specs so after calling GetOutOfServiceTaintInfo() you
assert the returned struct's key flags (e.g., Supported and GA) are true (since
envtest is pinned to k8s v1.33) and also assert the struct is not the zero
value; reference the functions InitOutOfServiceTaintFlagsWithRetry and
GetOutOfServiceTaintInfo to locate the assertions to tighten.
| Context("CreateOutOfServiceTaint", func() { | ||
| It("should create out-of-service taint with correct fields", func() { | ||
| taint := CreateOutOfServiceTaint() | ||
| Expect(taint.Key).To(Equal(corev1.TaintNodeOutOfService)) | ||
| Expect(taint.Value).To(Equal("nodeshutdown")) | ||
| Expect(taint.Effect).To(Equal(corev1.TaintEffectNoExecute)) | ||
| }) |
There was a problem hiding this comment.
Assert that TimeAdded stays unset here.
One of the new contract changes is that CreateOutOfServiceTaint() does not populate TimeAdded, but this test never checks that. A regression there would pass unnoticed.
✅ Add the missing assertion
taint := CreateOutOfServiceTaint()
Expect(taint.Key).To(Equal(corev1.TaintNodeOutOfService))
Expect(taint.Value).To(Equal("nodeshutdown"))
Expect(taint.Effect).To(Equal(corev1.TaintEffectNoExecute))
+ Expect(taint.TimeAdded).To(BeNil())
})
})📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| Context("CreateOutOfServiceTaint", func() { | |
| It("should create out-of-service taint with correct fields", func() { | |
| taint := CreateOutOfServiceTaint() | |
| Expect(taint.Key).To(Equal(corev1.TaintNodeOutOfService)) | |
| Expect(taint.Value).To(Equal("nodeshutdown")) | |
| Expect(taint.Effect).To(Equal(corev1.TaintEffectNoExecute)) | |
| }) | |
| Context("CreateOutOfServiceTaint", func() { | |
| It("should create out-of-service taint with correct fields", func() { | |
| taint := CreateOutOfServiceTaint() | |
| Expect(taint.Key).To(Equal(corev1.TaintNodeOutOfService)) | |
| Expect(taint.Value).To(Equal("nodeshutdown")) | |
| Expect(taint.Effect).To(Equal(corev1.TaintEffectNoExecute)) | |
| Expect(taint.TimeAdded).To(BeNil()) | |
| }) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/taints/taints_test.go` around lines 91 - 97, The test for
CreateOutOfServiceTaint is missing an assertion that the returned taint's
TimeAdded is unset; update the spec in Context("CreateOutOfServiceTaint") /
It("should create out-of-service taint with correct fields") to assert that
taint.TimeAdded is nil (or zero/unset per the type used) after calling
CreateOutOfServiceTaint(), keeping the other Expect checks for Key, Value, and
Effect intact so regressions setting TimeAdded will fail the test.
| func AppendTaintToNode(ctx context.Context, c client.Client, node *corev1.Node, taint corev1.Taint) (bool, error) { | ||
| if TaintExists(node.Spec.Taints, &taint) { | ||
| return false, nil | ||
| } | ||
|
|
||
| patch := client.MergeFrom(node.DeepCopy()) | ||
| now := metav1.Now() | ||
| taint.TimeAdded = &now | ||
| node.Spec.Taints = append(node.Spec.Taints, taint) | ||
|
|
||
| if err := c.Patch(ctx, node, patch); err != nil { | ||
| return false, err | ||
| } | ||
| return true, nil | ||
| } | ||
|
|
||
| // RemoveTaintFromNode removes the given taint from the node and patches it. | ||
| // Returns true if the taint was removed, false if it didn't exist. | ||
| func RemoveTaintFromNode(ctx context.Context, c client.Client, node *corev1.Node, taint corev1.Taint) (bool, error) { | ||
| newTaints, removed := FilterOutTaint(node.Spec.Taints, &taint) | ||
| if !removed { | ||
| return false, nil | ||
| } | ||
|
|
||
| patch := client.MergeFrom(node.DeepCopy()) | ||
| node.Spec.Taints = newTaints | ||
|
|
||
| if err := c.Patch(ctx, node, patch); err != nil { | ||
| return false, err | ||
| } | ||
| return true, nil |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In sigs.k8s.io/controller-runtime v0.21.0, what is the recommended way to patch corev1.Node.Spec.Taints without dropping concurrent taint updates? Does client.MergeFromWithOptimisticLock exist in this version?
💡 Result:
Recommended way: use MergeFrom (or StrategicMergeFrom) with optimistic locking so that the patch includes metadata.resourceVersion and will conflict/retry if another actor updated the Node (avoids clobbering concurrent taint changes). Concretely, in controller-runtime v0.21.0 patch the Node object using client.MergeFromWithOptions(stored, client.MergeFromWithOptimisticLock{}), where stored is node.DeepCopy taken immediately before you mutate node.Spec.Taints. Example: stored := node.DeepCopy // capture current state // ... add/remove taints on node.Spec.Taints ... if !equality.Semantic.DeepEqual(stored, node) { err := kubeClient.Patch(ctx, node, client.MergeFromWithOptions(stored, client.MergeFromWithOptimisticLock{})) // if err is a conflict, requeue/retry with fresh state } Does client.MergeFromWithOptimisticLock exist in this version? Yes. controller-runtime exposes client.MergeFromWithOptimisticLock (used via MergeFromWithOptions to set OptimisticLock=true for merge-from patches).
Citations:
- 1: Race condition in taint patching can clobber concurrent taint modifications kubernetes-sigs/node-readiness-controller#179
- 2: https://github.com/kubernetes-sigs/controller-runtime/blob/main/pkg/client/patch.go
- 3: ✨ Add Patch MergeFrom optimistic locking option kubernetes-sigs/controller-runtime#969
These helpers can clobber concurrent taint updates.
Both functions build a merge patch from a potentially stale Node object. If another actor modifies spec.taints between when the caller fetches the node and when the patch is applied, the whole slice can be overwritten, silently dropping the concurrent change.
Instead, re-fetch the latest Node inside the helper and patch with optimistic locking. Use client.MergeFromWithOptions(stored, client.MergeFromWithOptimisticLock{}), where stored is a deep copy captured before mutating node.Spec.Taints. On conflict errors, requeue with a fresh fetch. This ensures the patch includes metadata.resourceVersion and will fail if the node was modified concurrently, preventing silent data loss.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/taints/taints.go` around lines 81 - 111, AppendTaintToNode and
RemoveTaintFromNode currently build merge patches from a potentially stale node
and can clobber concurrent taint updates; fix both by re-fetching the latest
Node before mutating, capture a deep copy as "stored" before changing
node.Spec.Taints, then call client.Patch with
client.MergeFromWithOptions(stored, client.MergeFromWithOptimisticLock{}) so the
patch includes metadata.resourceVersion and will fail on concurrent
modifications; if Patch returns a conflict error, return that error so the
caller/controller can requeue and reconcile with a fresh fetch.
| now := metav1.Now() | ||
| taint.TimeAdded = &now | ||
| node.Spec.Taints = append(node.Spec.Taints, taint) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the taints.go file to see the full context
cat -n pkg/taints/taints.go | head -100Repository: medik8s/common
Length of output: 3848
🏁 Script executed:
# Look for the AppendTaintToNode function
rg -A 20 "func.*AppendTaintToNode" pkg/taints/Repository: medik8s/common
Length of output: 1262
🏁 Script executed:
# Search for TimeAdded usage in the codebase
rg "TimeAdded" --type-listRepository: medik8s/common
Length of output: 448
🏁 Script executed:
# Search for integration tests related to taints
fd -e "\.go$" -i test | xargs rg -l "TimeAdded|AppendTaint" 2>/dev/null | head -20Repository: medik8s/common
Length of output: 117
🏁 Script executed:
# Look for documentation or comments about taint effects
rg "TaintEffect" -B 2 -A 2 --type go | head -50Repository: medik8s/common
Length of output: 3067
🏁 Script executed:
cat -n pkg/taints/taints_integration_test.goRepository: medik8s/common
Length of output: 7828
🏁 Script executed:
# Search for TimeAdded assertions in tests
rg "TimeAdded" pkg/taints/Repository: medik8s/common
Length of output: 461
🌐 Web query:
Kubernetes taint TimeAdded NoExecute semantics
💡 Result:
In Kubernetes, the TimeAdded field in a taint specifies the timestamp when the taint was added to the node. This is particularly relevant for taints with the NoExecute effect. For NoExecute taints: - Pods without a matching toleration are evicted immediately upon taint addition. - Pods with a matching toleration but no tolerationSeconds remain bound indefinitely. - Pods with a matching toleration specifying tolerationSeconds remain bound for that duration after the TimeAdded timestamp. The node lifecycle controller evicts them after this grace period elapses, unless the taint is removed first. The eviction timer starts from the TimeAdded time. Kubernetes automatically adds TimeAdded when the node controller or kubelet applies taints like node.kubernetes.io/not-ready or node.kubernetes.io/unreachable with NoExecute effect (e.g., when a node becomes NotReady or unreachable). Example taint with TimeAdded (from node status): - effect: NoExecute key: node.kubernetes.io/unreachable timeAdded: "2023-09-21T06:32:46Z" Pods tolerating this with tolerationSeconds: 300 (default for auto-added tolerations) stay for 300 seconds post-TimeAdded. This behavior is unchanged in recent versions (docs updated 2026), with minor implementation details like occasional 5s delays in taint addition observed in issues but not affecting core semantics.
Citations:
- 1: http://k8s.io/docs/concepts/scheduling-eviction/taint-and-toleration/
- 2: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
- 3: NoExecute taint is added with extra 5s delay when a Node's ready condition becomes Unknown kubernetes/kubernetes#120815
- 4: Controller doesn't taint the node as NoExecute when in NotReady state kubernetes/kubernetes#77981
- 5: https://kubernetes.io/docs/reference/labels-annotations-taints
🏁 Script executed:
# Search for how TimeAdded is used in Kubernetes and related code
rg "TimeAdded" -B 3 -A 3 --type goRepository: medik8s/common
Length of output: 2377
🏁 Script executed:
# Look for any comments or documentation about taint effects
rg "TaintEffect|NoSchedule|NoExecute|PreferNoSchedule" -B 2 -A 2 --type go | head -100Repository: medik8s/common
Length of output: 3446
Only set TimeAdded for NoExecute taints.
corev1.Taint.TimeAdded is only meaningful for NoExecute effects. Kubernetes uses it to calculate the grace period for pod eviction based on tolerationSeconds. Setting it unconditionally makes AppendTaintToNode semantically incorrect for NoSchedule and PreferNoSchedule taints, and the integration test currently asserts incorrect behavior by expecting TimeAdded to be set on a NoSchedule taint.
🎯 Narrow the timestamping to the supported effect
now := metav1.Now()
- taint.TimeAdded = &now
+ if taint.Effect == corev1.TaintEffectNoExecute {
+ taint.TimeAdded = &now
+ }
node.Spec.Taints = append(node.Spec.Taints, taint)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| now := metav1.Now() | |
| taint.TimeAdded = &now | |
| node.Spec.Taints = append(node.Spec.Taints, taint) | |
| now := metav1.Now() | |
| if taint.Effect == corev1.TaintEffectNoExecute { | |
| taint.TimeAdded = &now | |
| } | |
| node.Spec.Taints = append(node.Spec.Taints, taint) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/taints/taints.go` around lines 87 - 89, The function that appends taints
currently sets taint.TimeAdded unconditionally; change AppendTaintToNode (or the
code block that does now := metav1.Now(); taint.TimeAdded = &now;
node.Spec.Taints = append(...)) to only set TimeAdded when taint.Effect ==
corev1.TaintEffectNoExecute, leaving TimeAdded nil for NoSchedule and
PreferNoSchedule; locate the taint append logic in pkg/taints/taints.go and wrap
the timestamp assignment in an if checking corev1.TaintEffectNoExecute, then
append the taint as before and adjust related tests that currently expect
TimeAdded for NoSchedule.
- Rename taintInfo variable to OutOfServiceInfo for better clarity - Remove GetOutOfServiceTaintInfo() getter function since variable is now exported - Update all references in tests to use OutOfServiceInfo directly - This allows consumers to access and mock the variable for testing
|
please add .DS_Store to your global |
f0384dc to
f250d22
Compare
slintes
left a comment
There was a problem hiding this comment.
I know this probably is a lot of copy paste of existing code. But this is the chance to improve things :)
Main issue is the use a Init(...) function and storing the result in a global public var. Please rethink this.
Some more comments inline.
- Rename FilterOutTaint to Filter (package name makes 'Out' redundant) - Use slices.DeleteFunc for cleaner implementation - Update documentation to explain Kubernetes taint identity semantics - Update all tests to use new function name
- Rename InitOutOfServiceTaintFlagsWithRetry to DetectOutOfServiceTaintInfoWithRetry - Function now returns OutOfServiceTaintInfo instead of storing in global variable - Rename initOutOfServiceTaintFlags to detectOutOfServiceTaintInfo - Rename setOutOfTaintFlags to getOutOfServiceTaintInfo - Update tests to check returned values instead of global variable
slintes
left a comment
There was a problem hiding this comment.
looks good, will add a hold in case someone else want to do a review
/hold
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
pkg/taints/taints_test.go (1)
96-103:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winStill missing the
TimeAddednil assertion onCreateOutOfServiceTaint.The contract is that
CreateOutOfServiceTaint()does not setTimeAdded(the caller does, inAddTaintToNode). Without an explicit assertion, a regression that pre-populatesTimeAddedhere would slip through.✅ Add the missing assertion
taint := CreateOutOfServiceTaint() Expect(taint.Key).To(Equal(corev1.TaintNodeOutOfService)) Expect(taint.Value).To(Equal("nodeshutdown")) Expect(taint.Effect).To(Equal(corev1.TaintEffectNoExecute)) + Expect(taint.TimeAdded).To(BeNil()) })🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/taints/taints_test.go` around lines 96 - 103, The test for CreateOutOfServiceTaint is missing an assertion that TimeAdded is nil; update the spec in Context("CreateOutOfServiceTaint")/It(...) to assert that the returned taint from CreateOutOfServiceTaint() has a nil TimeAdded (i.e., Expect(taint.TimeAdded).To(BeNil())), so the test enforces the contract that CreateOutOfServiceTaint does not pre-populate TimeAdded.go.mod (1)
42-42:⚠️ Potential issue | 🟠 Major | 💤 Low value
moby/spdystreamv0.5.0 (GHSA-pc3f-x583-g7j2) still in transitive graph.This is unchanged from the previous review —
client-gov0.33.7 still pulls v0.5.0 transitively, and the fix lands only inclient-gov0.36.0+. Since the explicit goal here is alignment with FAR/SNR on v0.33.7, a transitive upgrade isn't on the table for this PR; consider adding a//indirectcomment or a SECURITY note documenting that the affected SPDY code path is unreachable in this codebase, and/or tracking a follow-up issue to re-evaluate once FAR/SNR move past v0.33.x.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@go.mod` at line 42, The transitive dependency github.com/moby/spdystream v0.5.0 remains in the go.mod via client-go v0.33.7; update the PR by annotating that entry (github.com/moby/spdystream v0.5.0) with a comment documenting the GHSA and explaining that the vulnerable SPDY code path is unreachable in this codebase (or add a SECURITY.md note), and create or reference a follow-up issue to re-evaluate/upgrade once client-go moves to v0.36.0+ to eliminate the transitive vulnerability.
🧹 Nitpick comments (1)
pkg/taints/taints.go (1)
117-122: 💤 Low valueOptional: silence the recurring
nilerrlint hit.
golangci-lintkeeps reportingnilerron line 119 (also visible in the static-analysis hints). The(false, nil)return is the canonical "keep polling on transient error" pattern, so the warning is a false positive — but a one-line//nolint:nilerr(or a brief inline comment) on the return would prevent CI noise from recurring across future commits.♻️ Suggested tweak
pollErr := wait.PollUntilContextTimeout(ctx, interval, timeout, true, func(ctx context.Context) (bool, error) { if info, err = detectOutOfServiceTaintInfo(config); err != nil { - return false, nil // Keep retrying + // Returning nil keeps wait polling on transient errors; the latest err is preserved via closure. + return false, nil //nolint:nilerr } return true, nil // Success })🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/taints/taints.go` around lines 117 - 122, The linter flags the canonical retry return (false, nil) inside the wait.PollUntilContextTimeout callback in pollErr handling; update the anonymous func used by wait.PollUntilContextTimeout (the closure that calls detectOutOfServiceTaintInfo) to silence the false-positive by adding a one-line linter suppression comment (e.g., // nolint:nilerr or a short inline comment) next to the return false, nil so golangci-lint no longer reports nilerr for this deliberate retry path.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@go.mod`:
- Line 42: The transitive dependency github.com/moby/spdystream v0.5.0 remains
in the go.mod via client-go v0.33.7; update the PR by annotating that entry
(github.com/moby/spdystream v0.5.0) with a comment documenting the GHSA and
explaining that the vulnerable SPDY code path is unreachable in this codebase
(or add a SECURITY.md note), and create or reference a follow-up issue to
re-evaluate/upgrade once client-go moves to v0.36.0+ to eliminate the transitive
vulnerability.
In `@pkg/taints/taints_test.go`:
- Around line 96-103: The test for CreateOutOfServiceTaint is missing an
assertion that TimeAdded is nil; update the spec in
Context("CreateOutOfServiceTaint")/It(...) to assert that the returned taint
from CreateOutOfServiceTaint() has a nil TimeAdded (i.e.,
Expect(taint.TimeAdded).To(BeNil())), so the test enforces the contract that
CreateOutOfServiceTaint does not pre-populate TimeAdded.
---
Nitpick comments:
In `@pkg/taints/taints.go`:
- Around line 117-122: The linter flags the canonical retry return (false, nil)
inside the wait.PollUntilContextTimeout callback in pollErr handling; update the
anonymous func used by wait.PollUntilContextTimeout (the closure that calls
detectOutOfServiceTaintInfo) to silence the false-positive by adding a one-line
linter suppression comment (e.g., // nolint:nilerr or a short inline comment)
next to the return false, nil so golangci-lint no longer reports nilerr for this
deliberate retry path.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f8ad9f94-1590-4fc2-9642-ecee60206ef0
📒 Files selected for processing (3)
go.modpkg/taints/taints.gopkg/taints/taints_test.go
| pollErr := wait.PollUntilContextTimeout(ctx, interval, timeout, true, func(ctx context.Context) (bool, error) { | ||
| if info, err = detectOutOfServiceTaintInfo(config); err != nil { | ||
| return false, nil // Keep retrying | ||
| } | ||
| return true, nil // Success | ||
| }) | ||
|
|
||
| // Respect context cancellation - return poll error so caller knows operation was cancelled | ||
| if pollErr != nil && (errors.Is(pollErr, context.Canceled) || errors.Is(pollErr, context.DeadlineExceeded)) { | ||
| return info, pollErr | ||
| } | ||
| // Return info and internal error: nil on success, or last failure on timeout (more specific than generic timeout) | ||
| return info, err |
There was a problem hiding this comment.
Internal timeout discards the more informative err.
The trailing comment ("or last failure on timeout (more specific than generic timeout)") describes the desired behavior, but the branch on line 125 catches context.DeadlineExceeded and returns pollErr for both parent-context deadline expiry and the internal 10s timeout from PollUntilContextTimeout. In the inner-timeout case, the captured err from the last detectOutOfServiceTaintInfo call is strictly more useful than a bare "context deadline exceeded".
Distinguish the two by inspecting the parent ctx:
🔧 Suggested fix
- // Respect context cancellation - return poll error so caller knows operation was cancelled
- if pollErr != nil && (errors.Is(pollErr, context.Canceled) || errors.Is(pollErr, context.DeadlineExceeded)) {
- return info, pollErr
- }
- // Return info and internal error: nil on success, or last failure on timeout (more specific than generic timeout)
- return info, err
+ // Respect parent context cancellation/deadline so the caller knows the operation was aborted externally.
+ if ctxErr := ctx.Err(); ctxErr != nil {
+ return info, ctxErr
+ }
+ // On the internal 10s timeout (or success), prefer the last detection error (more specific than the generic
+ // "context deadline exceeded" returned by PollUntilContextTimeout). Returns nil on success.
+ return info, err🧰 Tools
🪛 golangci-lint (2.12.1)
[error] 119-119: error is not nil (line 118) but it returns nil
(nilerr)
|
New changes are detected. LGTM label has been removed. |
clobrano
left a comment
There was a problem hiding this comment.
I left a couple of comments. Moreover, also CodeRabbit gave interesting analysis, we should address its remarks :)
| interval := 2 * time.Second // retry every 2 seconds | ||
| timeout := 10 * time.Second // for a period of 10 seconds |
There was a problem hiding this comment.
I would appreciate if the operators could decide the timing independently.
We could extract these values as package-level constants, or (better) pass them as arguments
| Expect(removed).To(BeTrue()) | ||
| Expect(newTaints).To(HaveLen(2)) | ||
| Expect(newTaints).To(ContainElement(taint1)) | ||
| Expect(newTaints).To(ContainElement(taint3)) |
There was a problem hiding this comment.
Here we should also test that taint2 and duplicate have been removed
There was a problem hiding this comment.
Used ConsistOf instead. wdyt?
|
|
||
| Context("getOutOfServiceTaintInfo", func() { | ||
| When("version is supported but not GA", func() { | ||
| DescribeTable("should return Supported=true, GA=false", |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: Tzif-Morgen The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
| }) | ||
|
|
||
| // Respect context cancellation - return poll error so caller knows operation was cancelled | ||
| if pollErr != nil && (errors.Is(pollErr, context.Canceled) || errors.Is(pollErr, context.DeadlineExceeded)) { |
There was a problem hiding this comment.
IIUC this might be a bug.
I'll break down how I understand it and correct me if I'm wrong.
Desired action:
- In case the context was cancelled outside of this method (likely before the 10 sec window) than we want to return the cancelled context error as it's the most relevant one.
- in case we catch an error during our 10 sec retry window we want to return that (last one caught) error.
(For example if we kept retrying because we got an error X from the API Server than we should return error X)
Actual action:
- Working as designed
- We store the error we want to return in
errvar, but we never return it because we will exit here - our pollErr would be ofcontext.DeadlineExceededbecause our own 10 sec deadline context exceeded.
Why we need this PR
Both FAR and SNR have duplicated taint-related code, particularly:
Moving this code to common allows both projects to share tested, consistent implementations.
Changes made
pkg/taintspackageNew package providing shared taint utilities:
TaintExists()- Check if a taint exists in a taint sliceFilterOutTaint()- Return new slice without matching taintsCreateOutOfServiceTaint()- Create K8s out-of-service taintInitOutOfServiceTaintFlagsWithRetry()- Initialize version detection flags with retry logicIsOutOfServiceTaintSupported(K8s 1.26+),IsOutOfServiceTaintGA(K8s 1.28+)Tests include version parsing with various formats and edge cases (11 test cases).
Upgrade k8s dependencies to v0.33.7
Aligns common with FAR and SNR k8s versions, enables modern APIs like
wait.PollUntilContextTimeout.Fix lease test compatibility
Removed
Kind/APIVersionassertions from lease tests. The new fake client (k8s v0.33) doesn't auto-populate TypeMeta fields like the old version did. These fields aren't used by manager logic. Seekubernetes-sigs/controller-runtime#3229.
Rename
DeleteTainttoFilterOutTaintClearer name reflecting that it's a pure function returning a filtered slice, not modifying the original or deleting from a node. Each operator can use this function based on their specific needs.
Which issue(s) this PR fixes
needed for RHWA-690
Summary by CodeRabbit
New Features
Chores
Tests