OCPBUGS-86025: Close HTTP response bodies to prevent goroutine and connection leaks#8560
OCPBUGS-86025: Close HTTP response bodies to prevent goroutine and connection leaks#8560vismishr wants to merge 1 commit into
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
Skipping CI for Draft Pull Request. |
|
@vismishr: This pull request references Jira Issue OCPBUGS-86025, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository YAML (base), Central YAML (inherited) Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughThis PR adds three explicit closes for HTTP response bodies: a deferred 🚥 Pre-merge checks | ✅ 11 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (11 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
5180fc9 to
70a9e42
Compare
|
@vismishr: This pull request references Jira Issue OCPBUGS-86025, which is invalid:
Comment DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/jira refresh |
|
@vismishr: This pull request references Jira Issue OCPBUGS-86025, which is invalid:
Comment DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/jira refresh |
|
@vismishr: This pull request references Jira Issue OCPBUGS-86025, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #8560 +/- ##
==========================================
- Coverage 40.34% 40.34% -0.01%
==========================================
Files 755 755
Lines 93167 93172 +5
==========================================
Hits 37587 37587
- Misses 52877 52882 +5
Partials 2703 2703
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
support/konnectivityproxy/proxy_dialer.go (1)
62-69:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winAvoid deferred body close when returning a successful CONNECT tunnel.
At line 62,
defer resp.Body.Close()executes before returning the connection on status 200, which can interfere with the tunnel. Instead, explicitly close the body only in the error path (non-200 status).💡 Proposed fix
resp, err := http.ReadResponse(br, connectReq) if err != nil { conn.Close() return nil, err } - defer resp.Body.Close() if resp.StatusCode != 200 { + _ = resp.Body.Close() conn.Close() f := strings.SplitN(resp.Status, " ", 2) return nil, errors.New(f[1]) } return conn, nil🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@support/konnectivityproxy/proxy_dialer.go` around lines 62 - 69, Remove the deferred resp.Body.Close() that runs before returning a successful CONNECT tunnel; instead only close resp.Body in the non-200 error path where you also call conn.Close(). In the function handling the CONNECT response (the code around defer resp.Body.Close(), resp.StatusCode check, conn.Close(), and return statements), delete the defer and explicitly call resp.Body.Close() inside the if resp.StatusCode != 200 block before returning the error so the body is not closed when returning conn on success.support/konnectivityproxy/dialer.go (1)
329-344:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winRemove unconditional defer of response body close on the CONNECT success path.
At line 335,
defer res.Body.Close()closes the tunnel connection before the function returnskonnectivityConnectionat line 351. In HTTP CONNECT tunnels, the response body is the tunnel itself. Close the body only on non-200 and error paths; on success, return the connection without deferring the close.💡 Proposed fix
res, err := http.ReadResponse(br, nil) if err != nil { _ = konnectivityConnection.Close() return nil, fmt.Errorf("reading HTTP response from CONNECT to %s via proxy %s failed: %v", requestAddress, konnectivityServerAddress, err) } - defer res.Body.Close() if res.StatusCode != 200 { + _ = res.Body.Close() log.Info("Status code was not 200", "statusCode", res.StatusCode) _ = konnectivityConnection.Close() return nil, fmt.Errorf("proxy error from %s while dialing %s: %v", konnectivityServerAddress, requestAddress, res.Status) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@support/konnectivityproxy/dialer.go` around lines 329 - 344, Remove the unconditional defer res.Body.Close() at the start of the CONNECT handling; the response body is the tunnel and must remain open on success. Instead, call res.Body.Close() explicitly on all error paths before returning (e.g., when res.StatusCode != 200 in the block that logs and returns an error referencing konnectivityServerAddress/requestAddress, and when br.Buffered() > 0 before returning the buffered-data error), and ensure the successful return of konnectivityConnection does not close res.Body. Keep references to konnectivityConnection, res, res.StatusCode, res.Status, br.Buffered(), requestAddress and konnectivityServerAddress so the fixes are applied to the correct places.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@support/konnectivityproxy/dialer.go`:
- Around line 329-344: Remove the unconditional defer res.Body.Close() at the
start of the CONNECT handling; the response body is the tunnel and must remain
open on success. Instead, call res.Body.Close() explicitly on all error paths
before returning (e.g., when res.StatusCode != 200 in the block that logs and
returns an error referencing konnectivityServerAddress/requestAddress, and when
br.Buffered() > 0 before returning the buffered-data error), and ensure the
successful return of konnectivityConnection does not close res.Body. Keep
references to konnectivityConnection, res, res.StatusCode, res.Status,
br.Buffered(), requestAddress and konnectivityServerAddress so the fixes are
applied to the correct places.
In `@support/konnectivityproxy/proxy_dialer.go`:
- Around line 62-69: Remove the deferred resp.Body.Close() that runs before
returning a successful CONNECT tunnel; instead only close resp.Body in the
non-200 error path where you also call conn.Close(). In the function handling
the CONNECT response (the code around defer resp.Body.Close(), resp.StatusCode
check, conn.Close(), and return statements), delete the defer and explicitly
call resp.Body.Close() inside the if resp.StatusCode != 200 block before
returning the error so the body is not closed when returning conn on success.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 3edace4b-06b4-4418-8092-d287468205b3
📒 Files selected for processing (3)
control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.gosupport/konnectivityproxy/dialer.gosupport/konnectivityproxy/proxy_dialer.go
…eaks Commit-Message-Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
70a9e42 to
0a73668
Compare
|
@vismishr: This pull request references Jira Issue OCPBUGS-86025, which is valid. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
|
@vismishr: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: muraee, vismishr The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
I now have all the information needed for a complete analysis. Here is the final report: Test Failure Analysis CompleteJob Information
Test Failure AnalysisErrorSummaryThe Root CauseThe
All four lines are in code paths that handle live network connections (HTTP CONNECT proxying, KAS health endpoints), which are inherently difficult to unit test without mocking the network layer. The PR is a correct resource-leak fix — the Recommendations
Evidence
|
What this PR does / why we need it:
Adds
defer resp.Body.Close()tohealthCheckKASEndpointand explicitresp.Body.Close()on error paths in the two CONNECT tunnel dialers, where HTTP response bodies are not closed after use. Without closing, each unclosed response leaks ~3 goroutines and holds an open TCP connection. InhealthCheckKASEndpoint(called from the HCP reconcile loop every ~10s), this accumulates ~26,000 leaked goroutines/day and ~200MB memory/day, potentially causing CPO pods to OOM on long-running management clusters.Files changed:
control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go—healthCheckKASEndpointsupport/konnectivityproxy/dialer.go—KonnectivityDialer.DialContextsupport/konnectivityproxy/proxy_dialer.go—httpProxyDialer.DialAll three were identified via the
bodycloselinter (golangci-lint).Which issue(s) this PR fixes:
Fixes https://redhat.atlassian.net/browse/OCPBUGS-86025
Special notes for your reviewer:
healthCheckKASEndpoint: usesdefer resp.Body.Close()— standard pattern for a regular HTTP GET.resp.Body.Close()on error paths only. These are CONNECT tunnels where the response body is the tunnel connection — adeferwould close the tunnel on the success return path.Checklist:
Summary by CodeRabbit