Skip to content

Konnectivity retry proxy connection on timeout#8579

Draft
YamunadeviShanmugam wants to merge 3 commits into
openshift:mainfrom
YamunadeviShanmugam:fix-add-startup-probe-konnectivity-sock5-sidecar
Draft

Konnectivity retry proxy connection on timeout#8579
YamunadeviShanmugam wants to merge 3 commits into
openshift:mainfrom
YamunadeviShanmugam:fix-add-startup-probe-konnectivity-sock5-sidecar

Conversation

@YamunadeviShanmugam
Copy link
Copy Markdown
Contributor

@YamunadeviShanmugam YamunadeviShanmugam commented May 22, 2026

This PR fixes the intermittent E2E test failures where the konnectivity-proxy-socks5 sidecar container crashes and restarts during cluster bootstrap. Fix includes retry loop, native Kubernetes startup probes, and explicit component dependency ordering.
Proxy Internal Retry Logic

  • Added a new retry infrastructure with exponential backoff to handle transient connection errors safely. The container process now stays alive while waiting for the network tunnel to become available.

Native Startup Probes
konnectivity-container.go: Added a native TCP startup probe to all Konnectivity sidecar containers. This gives the sidecars a safe 60-second window to initialize during cluster bootstrap before any readiness or liveness checks trigger.

Component Dependency Ordering
Deployment graph is updated with dependency check. The openshift-oauth-apiserver pods will now explicitly wait for the konnectivity-agent to be completely Available and RolloutComplete before they begin scheduling.

Test Fixtures are updated accordingly

Summary by CodeRabbit

  • New Features

    • Enhanced Konnectivity SOCKS5 proxy guard with automatic retries/backoff and improved startup/shutdown flow.
    • Added TCP startup health check with a default serving-port fallback.
    • OAuth API server now declares the Konnectivity agent as a dependency.
  • Bug Fixes

    • Startup errors now propagate and exit cleanly instead of panicking.
  • Tests

    • Added extensive tests for guard/dialing behavior, transient-error detection, retries, CLI flags, container construction, probes, and graceful shutdown.

@openshift-merge-bot
Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 22, 2026
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 22, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 22, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: bf4a9784-e827-47b4-9a8d-2533271e6baa

📥 Commits

Reviewing files that changed from the base of the PR and between 922b57f and 47ddf7d.

📒 Files selected for processing (3)
  • konnectivity-socks5-proxy/guard_test.go
  • konnectivity-socks5-proxy/main_test.go
  • support/controlplane-component/konnectivity-container_test.go

📝 Walkthrough

Walkthrough

This PR adds a resilient guard for the Konnectivity SOCKS5 proxy that bootstraps a Konnectivity dialer, verifies TCP reachability, starts the SOCKS5 server, and retries on transient errors using exponential backoff. It moves proxy startup to a guarded signal-aware flow in main, replaces panic-based startup with error propagation, adds TCP StartupProbes for the konnectivity container (defaulting serving port to 8090 when unset), and declares the konnectivity agent as an explicit dependency of the OAuth API server component. Extensive unit and integration-style tests are included.

Sequence Diagram(s)

sequenceDiagram
  participant SignalHandler
  participant GuardLoop as KonnectivityGuard
  participant K8sAPI as KubernetesAPI
  participant Dialer as KonnectivityDialer
  participant Socks5 as SOCKS5Server

  SignalHandler->>GuardLoop: start(ctx)
  GuardLoop->>K8sAPI: load kube config
  GuardLoop->>Dialer: tryBootstrapKonnectivity()
  Dialer->>K8sAPI: use controller-runtime client
  Dialer->>KonnectivityServer: dial TCP host:port (verify)
  GuardLoop->>Socks5: start(server, Dialer)
  Socks5-->>GuardLoop: exit error
  GuardLoop->>GuardLoop: isTransientKonnectivityError?
  GuardLoop->>GuardLoop: backoff & retry (on transient)
Loading

Suggested reviewers

  • sdminonne
  • bryan-cox
  • clebs
🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 6.25% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (11 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding retry logic for konnectivity proxy connection timeouts to prevent container crashes during bootstrap.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed The PR adds three new test files using standard Go testing (testing.T and t.Run), not Ginkgo It/Describe patterns. The check does not apply to these files.
Test Structure And Quality ✅ Passed Test files are standard Go unit tests with proper structure: single responsibility per t.Run subtest, defer-based cleanup, timeouts on async operations, and follow existing codebase patterns.
Microshift Test Compatibility ✅ Passed No new Ginkgo e2e tests added in this PR—only standard Go unit tests with *testing.T.
Single Node Openshift (Sno) Test Compatibility ✅ Passed No Ginkgo e2e tests added in this PR. All three new test files use standard Go testing.T framework with unit-level scope and no multi-node cluster assumptions.
Topology-Aware Scheduling Compatibility ✅ Passed No topology-breaking scheduling constraints introduced. Multi-zone spread anti-affinity only applies when replicas > 1, protecting SNO.
Ote Binary Stdout Contract ✅ Passed PR uses controller-runtime's zap wrapper which defaults to os.Stderr. All logging outputs to stderr, not stdout. No non-JSON stdout writes found.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed PR adds only standard Go unit tests (not Ginkgo e2e tests), so IPv6 compatibility check is not applicable.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 22, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: YamunadeviShanmugam
Once this PR has been reviewed and has the lgtm label, please assign jparrill for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release and removed do-not-merge/needs-area labels May 22, 2026
The proxy was exiting hard on startup timeouts. This change adds
exponential backoff retry logic with startup probes.
@YamunadeviShanmugam YamunadeviShanmugam force-pushed the fix-add-startup-probe-konnectivity-sock5-sidecar branch from e81da62 to c1f6355 Compare May 22, 2026 17:40
@YamunadeviShanmugam
Copy link
Copy Markdown
Contributor Author

/test all

@codecov
Copy link
Copy Markdown

codecov Bot commented May 22, 2026

Codecov Report

❌ Patch coverage is 65.36313% with 62 lines in your changes missing coverage. Please review.
✅ Project coverage is 40.62%. Comparing base (1d34880) to head (47ddf7d).
⚠️ Report is 25 commits behind head on main.

Files with missing lines Patch % Lines
konnectivity-socks5-proxy/main.go 0.00% 32 Missing ⚠️
konnectivity-socks5-proxy/guard.go 76.03% 29 Missing ⚠️
...hostedcontrolplane/v2/oauth_apiserver/component.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8579      +/-   ##
==========================================
+ Coverage   40.37%   40.62%   +0.25%     
==========================================
  Files         755      756       +1     
  Lines       93227    93395     +168     
==========================================
+ Hits        37644    37946     +302     
+ Misses      52879    52736     -143     
- Partials     2704     2713       +9     
Files with missing lines Coverage Δ
...t/controlplane-component/konnectivity-container.go 69.43% <100.00%> (+69.43%) ⬆️
...hostedcontrolplane/v2/oauth_apiserver/component.go 27.58% <0.00%> (ø)
konnectivity-socks5-proxy/guard.go 76.03% <76.03%> (ø)
konnectivity-socks5-proxy/main.go 46.47% <0.00%> (+46.47%) ⬆️

... and 3 files with indirect coverage changes

Flag Coverage Δ
cmd-support 34.84% <100.00%> (+0.49%) ⬆️
cpo-hostedcontrolplane 41.76% <0.00%> (ø)
cpo-other 40.31% <ø> (ø)
hypershift-operator 50.72% <ø> (ø)
other 32.49% <60.13%> (+0.95%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (3)
konnectivity-socks5-proxy/guard_test.go (2)

28-58: ⚡ Quick win

Consider refactoring to a table-driven test.

The test cases all follow the same structure (create error, call isTransientKonnectivityError, assert result). Table-driven tests improve maintainability and make it easier to add new cases. As per coding guidelines, prefer table-driven tests where possible.

♻️ Proposed table-driven refactor
 func TestIsTransientKonnectivityError(t *testing.T) {
-	t.Run("When error is nil, it should not be transient", func(t *testing.T) {
-		g := NewGomegaWithT(t)
-		g.Expect(isTransientKonnectivityError(nil)).To(BeFalse())
-	})
-
-	t.Run("When error is context deadline exceeded, it should be transient", func(t *testing.T) {
-		g := NewGomegaWithT(t)
-		g.Expect(isTransientKonnectivityError(context.DeadlineExceeded)).To(BeTrue())
-	})
-
-	t.Run("When error is connection refused, it should be transient", func(t *testing.T) {
-		g := NewGomegaWithT(t)
-		g.Expect(isTransientKonnectivityError(fmt.Errorf("dial tcp: connection refused"))).To(BeTrue())
-	})
-
-	t.Run("When error is syscall ECONNREFUSED, it should be transient", func(t *testing.T) {
-		g := NewGomegaWithT(t)
-		g.Expect(isTransientKonnectivityError(syscall.ECONNREFUSED)).To(BeTrue())
-	})
-
-	t.Run("When error is validation failure, it should not be transient", func(t *testing.T) {
-		g := NewGomegaWithT(t)
-		g.Expect(isTransientKonnectivityError(errors.New("failed validation: KonnectivityHost is required"))).To(BeFalse())
-	})
-
-	t.Run("When error is file not found, it should not be transient", func(t *testing.T) {
-		g := NewGomegaWithT(t)
-		g.Expect(isTransientKonnectivityError(os.ErrNotExist)).To(BeFalse())
-	})
+	testCases := []struct {
+		name       string
+		err        error
+		isTransient bool
+	}{
+		{
+			name:       "When error is nil, it should not be transient",
+			err:        nil,
+			isTransient: false,
+		},
+		{
+			name:       "When error is context deadline exceeded, it should be transient",
+			err:        context.DeadlineExceeded,
+			isTransient: true,
+		},
+		{
+			name:       "When error is connection refused, it should be transient",
+			err:        fmt.Errorf("dial tcp: connection refused"),
+			isTransient: true,
+		},
+		{
+			name:       "When error is syscall ECONNREFUSED, it should be transient",
+			err:        syscall.ECONNREFUSED,
+			isTransient: true,
+		},
+		{
+			name:       "When error is validation failure, it should not be transient",
+			err:        errors.New("failed validation: KonnectivityHost is required"),
+			isTransient: false,
+		},
+		{
+			name:       "When error is file not found, it should not be transient",
+			err:        os.ErrNotExist,
+			isTransient: false,
+		},
+	}
+
+	for _, tc := range testCases {
+		t.Run(tc.name, func(t *testing.T) {
+			g := NewGomegaWithT(t)
+			if tc.isTransient {
+				g.Expect(isTransientKonnectivityError(tc.err)).To(BeTrue())
+			} else {
+				g.Expect(isTransientKonnectivityError(tc.err)).To(BeFalse())
+			}
+		})
+	}
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@konnectivity-socks5-proxy/guard_test.go` around lines 28 - 58, Refactor
TestIsTransientKonnectivityError into a table-driven test: create a slice of
cases with fields name, err and want (bool) containing the current scenarios
(nil, context.DeadlineExceeded, fmt.Errorf("dial tcp: connection refused"),
syscall.ECONNREFUSED, errors.New("failed validation..."), os.ErrNotExist), then
loop over cases calling t.Run(case.name, func(t *testing.T){ g :=
NewGomegaWithT(t);
g.Expect(isTransientKonnectivityError(case.err)).To(Equal(case.want)) }); keep
the test function name TestIsTransientKonnectivityError and reuse
isTransientKonnectivityError and NewGomegaWithT to preserve existing assertions.

95-103: ⚡ Quick win

Avoid mutating package-level state in tests.

Modifying the package-level dialKonnectivityServer variable violates the guideline to avoid global state in tests. While the defer restores the original value, this pattern is unsafe for parallel test execution and can cause race conditions. As per coding guidelines, tests should run with parallel execution enabled.

Consider refactoring the code under test to accept the dialer as a parameter (dependency injection) rather than relying on a package-level variable. This would make tests safer, support parallel execution, and improve testability.

Example approach:

// In guard.go, make dialKonnectivityServer a parameter or field
func retryDialKonnectivityServer(ctx context.Context, log logr.Logger, host string, port uint32, dialFunc func(string, uint32) error) error {
    // ... use dialFunc instead of global dialKonnectivityServer
}

// In test
func TestDialKonnectivityServerRetries(t *testing.T) {
    attempts := 0
    mockDial := func(host string, port uint32) error {
        attempts++
        if attempts < 2 {
            return fmt.Errorf("dial tcp: connection refused")
        }
        return dialKonnectivityServerTCP(host, port)
    }
    
    err := retryDialKonnectivityServer(ctx, log, "127.0.0.1", port, mockDial)
    // ... assertions
}

As per coding guidelines, avoid global state in tests and run tests with parallel execution.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@konnectivity-socks5-proxy/guard_test.go` around lines 95 - 103, The test
mutates the package-level dialKonnectivityServer which is unsafe for parallel
tests; refactor retryDialKonnectivityServer (in guard.go) to accept a dial
function parameter (e.g., dialFunc func(string,uint32) error or as a field on a
struct) and update the test to call retryDialKonnectivityServer with a local
mockDial closure that increments attempts and returns the simulated error on the
first call and delegates to the real dial (e.g., dialKonnectivityServerTCP) on
success; remove any package-level reassignment in tests and use dependency
injection to supply the mock instead.
support/controlplane-component/konnectivity-container.go (1)

184-198: ⚡ Quick win

Refactor to eliminate duplication and add defensive default case.

The HTTPS and Socks5 branches contain identical logic, violating DRY. Additionally, the switch lacks a default case—if opts.Mode is neither HTTPS nor Socks5, servingPort remains zero, creating an invalid probe configuration.

♻️ Proposed refactoring
-	var servingPort int32
-	switch opts.Mode {
-	case HTTPS:
-		if opts.HTTPSOptions.ServingPort != 0 {
-			servingPort = int32(opts.HTTPSOptions.ServingPort)
-		} else {
-			servingPort = 8090
-		}
-	case Socks5:
-		if opts.Socks5Options.ServingPort != 0 {
-			servingPort = int32(opts.Socks5Options.ServingPort)
-		} else {
-			servingPort = 8090
-		}
-	}
+	servingPort := int32(8090)
+	switch opts.Mode {
+	case HTTPS:
+		if opts.HTTPSOptions.ServingPort != 0 {
+			servingPort = int32(opts.HTTPSOptions.ServingPort)
+		}
+	case Socks5:
+		if opts.Socks5Options.ServingPort != 0 {
+			servingPort = int32(opts.Socks5Options.ServingPort)
+		}
+	default:
+		// Should never occur given validation at line 81-84, but defensive.
+		panic(fmt.Sprintf("unexpected konnectivity proxy mode: %s", opts.Mode))
+	}

As per coding guidelines, follow the 100-go-mistakes rules: prefer eliminating code duplication and using defensive programming patterns.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@support/controlplane-component/konnectivity-container.go` around lines 184 -
198, The servingPort selection duplicates identical logic for HTTPS and Socks5
and lacks a default branch, so update the code that sets servingPort (variable
servingPort using opts.Mode, HTTPS, Socks5, opts.HTTPSOptions.ServingPort,
opts.Socks5Options.ServingPort) to remove duplication by first selecting the
candidate port (e.g., from opts.HTTPSOptions.ServingPort or
opts.Socks5Options.ServingPort based on opts.Mode) and then applying a single
fallback to 8090 if the candidate is zero; also add a defensive default for the
switch (or handle unknown modes) to ensure servingPort is always set to a valid
default.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@konnectivity-socks5-proxy/guard.go`:
- Around line 59-60: The retry backoff currently uses time.Sleep(delay()) which
ignores ctx cancellation; update both retry sites in guard.go (where
time.Sleep(delay()) is called) to use a cancellable wait: replace
time.Sleep(delay()) with a select that waits on time.After(delay()) and
ctx.Done(), returning or breaking when ctx is cancelled. Ensure the surrounding
function (the loop that calls delay()) respects ctx cancellation (e.g., check
ctx.Err() or return on ctx.Done()) so retries exit immediately when the provided
ctx is cancelled.
- Around line 30-35: coreGuardBackoff currently has Steps==0 so DelayFunc
returns a constant delay and exponential growth never happens; update
coreGuardBackoff to set a positive Steps (e.g., 5 or 10) so Factor and Cap take
effect, and replace every time.Sleep(delay()) call (where delay is produced by
coreGuardBackoff.DelayFunc or similar) with a ctx-aware wait: compute delay :=
delay() then use select { case <-ctx.Done(): return ctx.Err() (or break out
appropriately); case <-time.After(delay): /* continue retry */ } so retries exit
promptly on cancellation; reference coreGuardBackoff, DelayFunc and the sites
using time.Sleep(delay()) when making these edits.

In `@konnectivity-socks5-proxy/main.go`:
- Around line 55-71: signals.SetupSignalHandler() cancels ctx but the current
runWithCoreGuard call blocks on server.ListenAndServe and never observes
ctx.Done(), so SIGTERM doesn't stop the proxy and context cancellation becomes a
fatal error; change the socks5 server start to create a net.Listener (net.Listen
with fmt.Sprintf(":%d", servingPort)), run server.Serve(listener) in a
goroutine, and in the main goroutine select on ctx.Done() vs serve result: on
ctx.Done() call listener.Close() and wait for Serve to return and then return
nil (treat context cancellation as clean shutdown), and if Serve returns a
non-context error propagate it as before; update the block that constructs
socks5.Config and calls server.ListenAndServe to use this listener/Serve + ctx
cancellation handling so signals.SetupSignalHandler(), runWithCoreGuard,
server.Serve, and listener.Close() coordinate graceful shutdown.

---

Nitpick comments:
In `@konnectivity-socks5-proxy/guard_test.go`:
- Around line 28-58: Refactor TestIsTransientKonnectivityError into a
table-driven test: create a slice of cases with fields name, err and want (bool)
containing the current scenarios (nil, context.DeadlineExceeded,
fmt.Errorf("dial tcp: connection refused"), syscall.ECONNREFUSED,
errors.New("failed validation..."), os.ErrNotExist), then loop over cases
calling t.Run(case.name, func(t *testing.T){ g := NewGomegaWithT(t);
g.Expect(isTransientKonnectivityError(case.err)).To(Equal(case.want)) }); keep
the test function name TestIsTransientKonnectivityError and reuse
isTransientKonnectivityError and NewGomegaWithT to preserve existing assertions.
- Around line 95-103: The test mutates the package-level dialKonnectivityServer
which is unsafe for parallel tests; refactor retryDialKonnectivityServer (in
guard.go) to accept a dial function parameter (e.g., dialFunc
func(string,uint32) error or as a field on a struct) and update the test to call
retryDialKonnectivityServer with a local mockDial closure that increments
attempts and returns the simulated error on the first call and delegates to the
real dial (e.g., dialKonnectivityServerTCP) on success; remove any package-level
reassignment in tests and use dependency injection to supply the mock instead.

In `@support/controlplane-component/konnectivity-container.go`:
- Around line 184-198: The servingPort selection duplicates identical logic for
HTTPS and Socks5 and lacks a default branch, so update the code that sets
servingPort (variable servingPort using opts.Mode, HTTPS, Socks5,
opts.HTTPSOptions.ServingPort, opts.Socks5Options.ServingPort) to remove
duplication by first selecting the candidate port (e.g., from
opts.HTTPSOptions.ServingPort or opts.Socks5Options.ServingPort based on
opts.Mode) and then applying a single fallback to 8090 if the candidate is zero;
also add a defensive default for the switch (or handle unknown modes) to ensure
servingPort is always set to a valid default.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 56b1509a-1ce4-4bfd-9308-e453f94028b4

📥 Commits

Reviewing files that changed from the base of the PR and between d24af10 and e81da62.

⛔ Files ignored due to path filters (40)
  • control-plane-operator/controllers/hostedcontrolplane/testdata/catalog-operator/AROSwift/zz_fixture_TestControlPlaneComponents_catalog_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/catalog-operator/GCP/zz_fixture_TestControlPlaneComponents_catalog_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/catalog-operator/IBMCloud/zz_fixture_TestControlPlaneComponents_catalog_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/catalog-operator/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_catalog_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/catalog-operator/zz_fixture_TestControlPlaneComponents_catalog_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/cluster-network-operator/AROSwift/zz_fixture_TestControlPlaneComponents_cluster_network_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/cluster-network-operator/GCP/zz_fixture_TestControlPlaneComponents_cluster_network_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/cluster-network-operator/IBMCloud/zz_fixture_TestControlPlaneComponents_cluster_network_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/cluster-network-operator/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_cluster_network_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/cluster-network-operator/zz_fixture_TestControlPlaneComponents_cluster_network_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/ingress-operator/AROSwift/zz_fixture_TestControlPlaneComponents_ingress_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/ingress-operator/GCP/zz_fixture_TestControlPlaneComponents_ingress_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/ingress-operator/IBMCloud/zz_fixture_TestControlPlaneComponents_ingress_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/ingress-operator/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_ingress_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/ingress-operator/zz_fixture_TestControlPlaneComponents_ingress_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/oauth-openshift/AROSwift/zz_fixture_TestControlPlaneComponents_oauth_openshift_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/oauth-openshift/GCP/zz_fixture_TestControlPlaneComponents_oauth_openshift_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/oauth-openshift/IBMCloud/zz_fixture_TestControlPlaneComponents_oauth_openshift_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/oauth-openshift/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_oauth_openshift_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/oauth-openshift/zz_fixture_TestControlPlaneComponents_oauth_openshift_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/olm-operator/AROSwift/zz_fixture_TestControlPlaneComponents_olm_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/olm-operator/GCP/zz_fixture_TestControlPlaneComponents_olm_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/olm-operator/IBMCloud/zz_fixture_TestControlPlaneComponents_olm_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/olm-operator/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_olm_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/olm-operator/zz_fixture_TestControlPlaneComponents_olm_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-apiserver/AROSwift/zz_fixture_TestControlPlaneComponents_openshift_apiserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-apiserver/GCP/zz_fixture_TestControlPlaneComponents_openshift_apiserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-apiserver/IBMCloud/zz_fixture_TestControlPlaneComponents_openshift_apiserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_openshift_apiserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-apiserver/zz_fixture_TestControlPlaneComponents_openshift_apiserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-oauth-apiserver/AROSwift/zz_fixture_TestControlPlaneComponents_openshift_oauth_apiserver_controlplanecomponent.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-oauth-apiserver/GCP/zz_fixture_TestControlPlaneComponents_openshift_oauth_apiserver_controlplanecomponent.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-oauth-apiserver/IBMCloud/zz_fixture_TestControlPlaneComponents_openshift_oauth_apiserver_controlplanecomponent.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-oauth-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_openshift_oauth_apiserver_controlplanecomponent.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-oauth-apiserver/zz_fixture_TestControlPlaneComponents_openshift_oauth_apiserver_controlplanecomponent.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/packageserver/AROSwift/zz_fixture_TestControlPlaneComponents_packageserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/packageserver/GCP/zz_fixture_TestControlPlaneComponents_packageserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/packageserver/IBMCloud/zz_fixture_TestControlPlaneComponents_packageserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/packageserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_packageserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/packageserver/zz_fixture_TestControlPlaneComponents_packageserver_deployment.yaml is excluded by !**/testdata/**
📒 Files selected for processing (5)
  • control-plane-operator/controllers/hostedcontrolplane/v2/oauth_apiserver/component.go
  • konnectivity-socks5-proxy/guard.go
  • konnectivity-socks5-proxy/guard_test.go
  • konnectivity-socks5-proxy/main.go
  • support/controlplane-component/konnectivity-container.go

Comment thread konnectivity-socks5-proxy/guard.go
Comment thread konnectivity-socks5-proxy/guard.go Outdated
Comment thread konnectivity-socks5-proxy/main.go
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 22, 2026

@YamunadeviShanmugam: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
konnectivity-socks5-proxy/guard_test.go (1)

79-110: 🏗️ Heavy lift

Test the production retry path instead of a local clone.

retryDialKonnectivityServer reimplements its own loop, so this test can stay green while bootstrapKonnectivity or runWithCoreGuard regress. Prefer injecting the dial/bootstrap dependency into the real code path and asserting retries there.

As per coding guidelines, "Always include unit tests when creating new functions or modifying existing ones" and "Unit test any code changes and additions."

Also applies to: 113-131

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@konnectivity-socks5-proxy/guard_test.go` around lines 79 - 110, The test
currently calls retryDialKonnectivityServer directly and stubs
dialKonnectivityServer, which doesn't exercise the real production path;
refactor the code to inject the dialing/bootstrap dependency (e.g., accept a
DialFunc or a KonnectivityBootstrap interface) into
bootstrapKonnectivity/runWithCoreGuard and update the test to call the real
entrypoint (bootstrapKonnectivity or runWithCoreGuard) with a test dial
implementation that fails once then succeeds, assert the entrypoint eventually
succeeds and that the injected dial was invoked multiple times; keep references
to dialKonnectivityServer, retryDialKonnectivityServer, bootstrapKonnectivity,
and runWithCoreGuard to locate and change signatures, and ensure tests restore
any global state and use context timeouts as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@konnectivity-socks5-proxy/guard.go`:
- Around line 36-37: Thread a context through bootstrapKonnectivity →
tryBootstrapKonnectivity → dialKonnectivityServerTCP so the TCP probe is
cancellable: add a ctx parameter to bootstrapKonnectivity and propagate it into
tryBootstrapKonnectivity and dialKonnectivityServerTCP, then replace the
net.DialTimeout(...) call inside dialKonnectivityServerTCP with
(&net.Dialer{Timeout: konnectivityDialTimeout}).DialContext(ctx, network, addr)
so the in-progress probe is aborted when ctx is cancelled; update all callers to
pass the ctx accordingly.

---

Nitpick comments:
In `@konnectivity-socks5-proxy/guard_test.go`:
- Around line 79-110: The test currently calls retryDialKonnectivityServer
directly and stubs dialKonnectivityServer, which doesn't exercise the real
production path; refactor the code to inject the dialing/bootstrap dependency
(e.g., accept a DialFunc or a KonnectivityBootstrap interface) into
bootstrapKonnectivity/runWithCoreGuard and update the test to call the real
entrypoint (bootstrapKonnectivity or runWithCoreGuard) with a test dial
implementation that fails once then succeeds, assert the entrypoint eventually
succeeds and that the injected dial was invoked multiple times; keep references
to dialKonnectivityServer, retryDialKonnectivityServer, bootstrapKonnectivity,
and runWithCoreGuard to locate and change signatures, and ensure tests restore
any global state and use context timeouts as before.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: a56c9242-6df5-4a69-85b4-450d2a131e1a

📥 Commits

Reviewing files that changed from the base of the PR and between e81da62 and c1f6355.

⛔ Files ignored due to path filters (40)
  • control-plane-operator/controllers/hostedcontrolplane/testdata/catalog-operator/AROSwift/zz_fixture_TestControlPlaneComponents_catalog_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/catalog-operator/GCP/zz_fixture_TestControlPlaneComponents_catalog_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/catalog-operator/IBMCloud/zz_fixture_TestControlPlaneComponents_catalog_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/catalog-operator/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_catalog_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/catalog-operator/zz_fixture_TestControlPlaneComponents_catalog_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/cluster-network-operator/AROSwift/zz_fixture_TestControlPlaneComponents_cluster_network_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/cluster-network-operator/GCP/zz_fixture_TestControlPlaneComponents_cluster_network_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/cluster-network-operator/IBMCloud/zz_fixture_TestControlPlaneComponents_cluster_network_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/cluster-network-operator/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_cluster_network_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/cluster-network-operator/zz_fixture_TestControlPlaneComponents_cluster_network_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/ingress-operator/AROSwift/zz_fixture_TestControlPlaneComponents_ingress_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/ingress-operator/GCP/zz_fixture_TestControlPlaneComponents_ingress_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/ingress-operator/IBMCloud/zz_fixture_TestControlPlaneComponents_ingress_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/ingress-operator/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_ingress_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/ingress-operator/zz_fixture_TestControlPlaneComponents_ingress_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/oauth-openshift/AROSwift/zz_fixture_TestControlPlaneComponents_oauth_openshift_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/oauth-openshift/GCP/zz_fixture_TestControlPlaneComponents_oauth_openshift_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/oauth-openshift/IBMCloud/zz_fixture_TestControlPlaneComponents_oauth_openshift_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/oauth-openshift/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_oauth_openshift_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/oauth-openshift/zz_fixture_TestControlPlaneComponents_oauth_openshift_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/olm-operator/AROSwift/zz_fixture_TestControlPlaneComponents_olm_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/olm-operator/GCP/zz_fixture_TestControlPlaneComponents_olm_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/olm-operator/IBMCloud/zz_fixture_TestControlPlaneComponents_olm_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/olm-operator/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_olm_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/olm-operator/zz_fixture_TestControlPlaneComponents_olm_operator_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-apiserver/AROSwift/zz_fixture_TestControlPlaneComponents_openshift_apiserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-apiserver/GCP/zz_fixture_TestControlPlaneComponents_openshift_apiserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-apiserver/IBMCloud/zz_fixture_TestControlPlaneComponents_openshift_apiserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_openshift_apiserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-apiserver/zz_fixture_TestControlPlaneComponents_openshift_apiserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-oauth-apiserver/AROSwift/zz_fixture_TestControlPlaneComponents_openshift_oauth_apiserver_controlplanecomponent.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-oauth-apiserver/GCP/zz_fixture_TestControlPlaneComponents_openshift_oauth_apiserver_controlplanecomponent.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-oauth-apiserver/IBMCloud/zz_fixture_TestControlPlaneComponents_openshift_oauth_apiserver_controlplanecomponent.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-oauth-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_openshift_oauth_apiserver_controlplanecomponent.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/openshift-oauth-apiserver/zz_fixture_TestControlPlaneComponents_openshift_oauth_apiserver_controlplanecomponent.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/packageserver/AROSwift/zz_fixture_TestControlPlaneComponents_packageserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/packageserver/GCP/zz_fixture_TestControlPlaneComponents_packageserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/packageserver/IBMCloud/zz_fixture_TestControlPlaneComponents_packageserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/packageserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_packageserver_deployment.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/packageserver/zz_fixture_TestControlPlaneComponents_packageserver_deployment.yaml is excluded by !**/testdata/**
📒 Files selected for processing (5)
  • control-plane-operator/controllers/hostedcontrolplane/v2/oauth_apiserver/component.go
  • konnectivity-socks5-proxy/guard.go
  • konnectivity-socks5-proxy/guard_test.go
  • konnectivity-socks5-proxy/main.go
  • support/controlplane-component/konnectivity-container.go

Comment thread konnectivity-socks5-proxy/guard.go Outdated
  The proxy was exiting hard on startup timeouts.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
konnectivity-socks5-proxy/main.go (1)

94-96: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Treat bootstrap-time context cancellation as a clean exit.

If runWithCoreGuard returns context.Canceled while still bootstrapping, Lines 94-96 still print an error and os.Exit(1). That makes an intentional SIGTERM during startup look like a crash.

Suggested change
 import (
+	"context"
+	"errors"
 	"fmt"
 	"net"
 	"os"
@@
 		})
 		if err != nil {
+			if errors.Is(err, context.Canceled) {
+				return
+			}
 			fmt.Fprintf(os.Stderr, "Error: %v\n", err)
 			os.Exit(1)
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@konnectivity-socks5-proxy/main.go` around lines 94 - 96, When handling the
error returned from runWithCoreGuard, treat bootstrap-time cancellation as a
clean exit by checking if errors.Is(err, context.Canceled) (or err ==
context.Canceled) and, if so, do not print an error and exit 0; otherwise keep
the existing fmt.Fprintf(os.Stderr, ...) and os.Exit(1). Update the error
handling around runWithCoreGuard to perform this conditional check (and add an
import for the errors package if needed) so a SIGTERM during startup is treated
as a normal shutdown.
🧹 Nitpick comments (1)
konnectivity-socks5-proxy/main_test.go (1)

100-134: ⚡ Quick win

Exercise the production shutdown path instead of a copied closure.

This subtest reimplements the listener/Serve/ctx.Done() flow locally, so it can stay green even if NewStartCommand drifts. Extract that block into a small package-private helper and call the helper from both main.go and this test.

As per coding guidelines, "Unit test any code changes and additions".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@konnectivity-socks5-proxy/main_test.go` around lines 100 - 134, The test
duplicates the listener/Serve/ctx.Done() shutdown logic (serveFunc) instead of
exercising the real production path; refactor that logic into a package-private
helper (e.g., create a function serveWithGracefulShutdown(ctx context.Context,
servingPort uint32, server *socks5.Server) error) and replace the test-local
closure and the implementation in main.go to call this helper; update
NewStartCommand (or the function in main.go that currently inlines the logic) to
call serveWithGracefulShutdown so the test can simply invoke the real code path
and verify graceful shutdown, and adjust tests to pass a server instance and
listener port to the new helper.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@konnectivity-socks5-proxy/guard_test.go`:
- Around line 213-219: The test currently dials a hardcoded port 54321 which can
be occupied on CI; instead create a short-lived listener on "127.0.0.1:0" to get
an ephemeral port, read its actual port from listener.Addr(), close the listener
to free the port, then call dialKonnectivityServerTCP(ctx, "127.0.0.1", port)
and assert error with isTransientKonnectivityError; update the anonymous t.Run
block to use this ephemeral-port pattern and keep the same assertions using
dialKonnectivityServerTCP and isTransientKonnectivityError.

---

Duplicate comments:
In `@konnectivity-socks5-proxy/main.go`:
- Around line 94-96: When handling the error returned from runWithCoreGuard,
treat bootstrap-time cancellation as a clean exit by checking if errors.Is(err,
context.Canceled) (or err == context.Canceled) and, if so, do not print an error
and exit 0; otherwise keep the existing fmt.Fprintf(os.Stderr, ...) and
os.Exit(1). Update the error handling around runWithCoreGuard to perform this
conditional check (and add an import for the errors package if needed) so a
SIGTERM during startup is treated as a normal shutdown.

---

Nitpick comments:
In `@konnectivity-socks5-proxy/main_test.go`:
- Around line 100-134: The test duplicates the listener/Serve/ctx.Done()
shutdown logic (serveFunc) instead of exercising the real production path;
refactor that logic into a package-private helper (e.g., create a function
serveWithGracefulShutdown(ctx context.Context, servingPort uint32, server
*socks5.Server) error) and replace the test-local closure and the implementation
in main.go to call this helper; update NewStartCommand (or the function in
main.go that currently inlines the logic) to call serveWithGracefulShutdown so
the test can simply invoke the real code path and verify graceful shutdown, and
adjust tests to pass a server instance and listener port to the new helper.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 099bd414-6684-4f69-8d67-a453ed5f21b3

📥 Commits

Reviewing files that changed from the base of the PR and between c1f6355 and 922b57f.

📒 Files selected for processing (5)
  • konnectivity-socks5-proxy/guard.go
  • konnectivity-socks5-proxy/guard_test.go
  • konnectivity-socks5-proxy/main.go
  • konnectivity-socks5-proxy/main_test.go
  • support/controlplane-component/konnectivity-container_test.go

Comment on lines +213 to +219
t.Run("When port is not listening, it should return connection refused", func(t *testing.T) {
g := NewGomegaWithT(t)
// Try to connect to a port that's not listening (port 1 requires root)
err := dialKonnectivityServerTCP(context.Background(), "127.0.0.1", 54321)
g.Expect(err).To(HaveOccurred())
g.Expect(isTransientKonnectivityError(err)).To(BeTrue())
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid a fixed localhost port in the refusal test.

Line 216 assumes 54321 is unused. If that port is bound on a CI worker, this turns into a false negative instead of a connection-refused path. Reserve an ephemeral port with 127.0.0.1:0, capture it, close the listener, then dial that port.

Suggested change
 func TestDialKonnectivityServerTCP_ConnectionRefused(t *testing.T) {
 	t.Run("When port is not listening, it should return connection refused", func(t *testing.T) {
 		g := NewGomegaWithT(t)
-		// Try to connect to a port that's not listening (port 1 requires root)
-		err := dialKonnectivityServerTCP(context.Background(), "127.0.0.1", 54321)
+		ln, err := net.Listen("tcp", "127.0.0.1:0")
+		g.Expect(err).ToNot(HaveOccurred())
+
+		_, portStr, err := net.SplitHostPort(ln.Addr().String())
+		g.Expect(err).ToNot(HaveOccurred())
+		g.Expect(ln.Close()).To(Succeed())
+
+		var port uint32
+		_, err = fmt.Sscanf(portStr, "%d", &port)
+		g.Expect(err).ToNot(HaveOccurred())
+
+		err = dialKonnectivityServerTCP(context.Background(), "127.0.0.1", port)
 		g.Expect(err).To(HaveOccurred())
 		g.Expect(isTransientKonnectivityError(err)).To(BeTrue())
 	})
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
t.Run("When port is not listening, it should return connection refused", func(t *testing.T) {
g := NewGomegaWithT(t)
// Try to connect to a port that's not listening (port 1 requires root)
err := dialKonnectivityServerTCP(context.Background(), "127.0.0.1", 54321)
g.Expect(err).To(HaveOccurred())
g.Expect(isTransientKonnectivityError(err)).To(BeTrue())
})
t.Run("When port is not listening, it should return connection refused", func(t *testing.T) {
g := NewGomegaWithT(t)
ln, err := net.Listen("tcp", "127.0.0.1:0")
g.Expect(err).ToNot(HaveOccurred())
_, portStr, err := net.SplitHostPort(ln.Addr().String())
g.Expect(err).ToNot(HaveOccurred())
g.Expect(ln.Close()).To(Succeed())
var port uint32
_, err = fmt.Sscanf(portStr, "%d", &port)
g.Expect(err).ToNot(HaveOccurred())
err = dialKonnectivityServerTCP(context.Background(), "127.0.0.1", port)
g.Expect(err).To(HaveOccurred())
g.Expect(isTransientKonnectivityError(err)).To(BeTrue())
})
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@konnectivity-socks5-proxy/guard_test.go` around lines 213 - 219, The test
currently dials a hardcoded port 54321 which can be occupied on CI; instead
create a short-lived listener on "127.0.0.1:0" to get an ephemeral port, read
its actual port from listener.Addr(), close the listener to free the port, then
call dialKonnectivityServerTCP(ctx, "127.0.0.1", port) and assert error with
isTransientKonnectivityError; update the anonymous t.Run block to use this
ephemeral-port pattern and keep the same assertions using
dialKonnectivityServerTCP and isTransientKonnectivityError.

  The proxy was exiting hard on startup timeouts.
@hypershift-jira-solve-ci
Copy link
Copy Markdown

Now I have all the information. Let me compile the final report.

Test Failure Analysis Complete

Job Information

  • Prow Job: pull-ci-openshift-hypershift (GitHub Actions)
  • Build ID: 26380396062 (Lint), 26380396027 (Verify)
  • PR: #8579 — Konnectivity retry proxy connection on timeout
  • Branch: fix-add-startup-probe-konnectivity-sock5-sidecar
  • Failed Jobs: lint / Lint (job 77648367137), verify / Verify (job 77648367059)

Test Failure Analysis

Error

# Lint job (5 errors):
konnectivity-socks5-proxy/main_test.go:10:1: File is not properly formatted (gci)
support/controlplane-component/konnectivity-container_test.go:6:1: File is not properly formatted (gci)
konnectivity-socks5-proxy/guard_test.go:221:25: `cancelled` is a misspelling of `canceled` (misspell)
konnectivity-socks5-proxy/guard_test.go:389:25: `cancelled` is a misspelling of `canceled` (misspell)
konnectivity-socks5-proxy/guard_test.go:498:25: `cancelled` is a misspelling of `canceled` (misspell)

# Verify job (1 error):
support/controlplane-component/konnectivity-container_test.go: needs update

Summary

Both CI failures are caused by code style issues in the three new test files introduced by PR #8579. The lint job fails with 5 linter violations: two gci (Go import ordering) violations in main_test.go and konnectivity-container_test.go, and three misspell violations in guard_test.go where "cancelled" (British English) is used instead of the American English "canceled". The verify job fails because go fmt modifies konnectivity-container_test.go (reformatting it), and the subsequent git diff --exit-code check detects the working tree is dirty. These are purely cosmetic/style issues — no product bugs, no test logic errors, no infrastructure problems.

Root Cause

The root cause is three new test files that were not run through the project's linter and formatter before being committed:

  1. konnectivity-socks5-proxy/main_test.go — The import block does not follow the gci (Go import grouping/ordering) convention required by the project's .golangci.yml. The github.com/armon/go-socks5 import on line 10 is not in the expected group order (stdlib → external → internal).

  2. support/controlplane-component/konnectivity-container_test.go — Two independent issues:

    • The import block on line 6 (hyperv1 "github.com/openshift/hypershift/api/hypershift/v1beta1") violates gci import ordering rules.
    • The file is not go fmt-compliant. When make fmt runs (go fmt ./...), it modifies this file, causing the verify step's git diff --exit-code HEAD check to fail with "needs update".
  3. konnectivity-socks5-proxy/guard_test.go — Three test description strings use "cancelled" (British English spelling) at lines 221, 389, and 498:

    • "When context is cancelled during dial, it should abort immediately"
    • "When context is cancelled during bootstrap, it should return context error"
    • "When context is cancelled, it should return context error"

    The misspell linter enforces American English, requiring "canceled" (one 'l').

Recommendations
  1. Fix import ordering in both konnectivity-socks5-proxy/main_test.go and support/controlplane-component/konnectivity-container_test.go:

    • Run gci write --skip-generated -s standard -s default -s "prefix(github.com/openshift/hypershift)" <file> or use the project's make target to reorder imports per the gci configuration.
  2. Fix formatting in support/controlplane-component/konnectivity-container_test.go:

    • Run go fmt ./support/controlplane-component/konnectivity-container_test.go
  3. Fix misspellings in konnectivity-socks5-proxy/guard_test.go at lines 221, 389, and 498:

    • Replace all three occurrences of cancelled with canceled
  4. Before pushing, run the project's full lint and verify locally:

    make lint
    make fmt
    make verify
Evidence
Evidence Detail
Lint job exit code 2 (linter found issues)
Verify job exit code 1 (git tree dirty after go fmt)
gci violation #1 konnectivity-socks5-proxy/main_test.go:10:1 — import github.com/armon/go-socks5 not in correct group
gci violation #2 support/controlplane-component/konnectivity-container_test.go:6:1 — import hyperv1 "github.com/openshift/hypershift/api/hypershift/v1beta1" not in correct group
misspell violation #1 guard_test.go:221:25"When context is cancelled during dial..."
misspell violation #2 guard_test.go:389:25"When context is cancelled during bootstrap..."
misspell violation #3 guard_test.go:498:25"When context is cancelled, it should..."
Verify dirty file go fmt ./... reformatted support/controlplane-component/konnectivity-container_test.go, detected by git diff --exit-code HEAD
Total linter issues (hidden) 2 additional cancelled misspellings were hidden by --max-same-issues (5 total, 3 shown)
Files affected All 3 are new test files added by this PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant