Skip to content

wrap inner chunk reader error with %w to preserve errors.Is identity#4929

Open
johnelliott wants to merge 3 commits intomainfrom
hackathon/handlers-default-errwrap
Open

wrap inner chunk reader error with %w to preserve errors.Is identity#4929
johnelliott wants to merge 3 commits intomainfrom
hackathon/handlers-default-errwrap

Conversation

@johnelliott
Copy link
Copy Markdown
Contributor

@johnelliott johnelliott commented Apr 29, 2026

Summary

pkg/handlers/default.go:137 wraps the inner chunk reader error with %v, which breaks errors.Is for the wrapped cause:

-dataOrErr.Err = fmt.Errorf("%w: error reading chunk: %v", ErrProcessingWarning, err)
+dataOrErr.Err = fmt.Errorf("%w: error reading chunk: %w", ErrProcessingWarning, err)

Go 1.20+ supports multiple %w verbs in a single Errorf, and this repo is on Go 1.24 (go.mod). The outer ErrProcessingWarning wrap is preserved; the inner cause becomes inspectable via errors.Is.

Why this matters

isFatal() in pkg/handlers/handlers.go:482-493 classifies errors using errors.Is(err, context.Canceled) / errors.Is(err, context.DeadlineExceeded) / errors.Is(err, ErrProcessingFatal). With the current %v wrap, a context.DeadlineExceeded raised by chunkResult.Error() loses its identity in the chain — isFatal returns false for it, and the error falls into the non-fatal path where the file-processing loop continues to read chunks from a reader whose context is already dead.

In recent production log samples, ~71 out of 200 entries at handlers/handlers.go:428 were context deadline exceeded errors emitted as non-fatal warnings. With this PR they would correctly classify as fatal and terminate per-file processing on real timeouts.

Behavior change

This is not purely a logging fix. Switching the wrap will cause isFatal to return true for context.DeadlineExceeded (and any other previously-shadowed sentinel errors) in this code path. That changes runtime behavior:

  • Before: per-chunk timeout -> log non-fatal warning -> keep reading from a dead context -> probably more wasted work and more "non-critical" log noise until something else terminates.
  • After: per-chunk timeout -> isFatal returns true -> handleChunksWithError returns the error -> per-file processing terminates immediately.

This is arguably the correct behavior — once the context is cancelled, continuing to read chunks is pointless. But reviewers should confirm there are no implicit dependencies on the current "swallow-and-continue" semantics in source-specific handlers (S3, Git, Bitbucket, etc.) that would surface as new fatal errors at higher levels.

Companion PR

#4928 (one-line severity downgrade at handlers.go:428) addresses the alert-volume problem at the call site. This PR addresses the underlying classification bug. They are independent and can land in any order:

  • Land severity-downgrade alone -> log volume drops, behavior unchanged, deadline-exceeded errors keep silently classifying as non-fatal warnings.
  • Land errwrap alone -> per-file processing correctly terminates on real timeouts; remaining non-fatal errors still log at Error.
  • Both -> intended end state.

Out-of-scope follow-ups (flagged for later)

  • pkg/handlers/default.go:139 has the same pattern wrapping ErrProcessingFatal with %v for writeErr. The fatal path returns immediately so the errors.Is identity is less load-bearing there, but consistency would argue for the same %w change. Left out of this PR to keep scope tight.
  • pkg/handlers/archive.go:openArchive and pkg/handlers/rpm.go:107 return raw errors that should be wrapped explicitly with ErrProcessingWarning so the intent is documented at the source. No runtime change, but improves the class story.

Test plan

  • CI passes
  • Verify errors.Is(err, context.DeadlineExceeded) returns true for a synthesized chunk-reader timeout (existing handler tests likely don't exercise this)
  • Manual: confirm per-file processing terminates on context deadline rather than emitting repeat non-fatal warnings

Note

Medium Risk
Changes error-wrapping semantics in the chunk-read loop, which can change runtime control flow by causing previously-misclassified per-chunk cancellations/timeouts to terminate file processing. Risk is limited in scope but affects core handler error classification.

Overview
Fixes defaultHandler chunk-read error wrapping to use %w for the inner cause, preserving errors.Is identity alongside the existing ErrProcessingWarning wrapper.

Adds a regression test ensuring wrapped chunk-reader errors remain inspectable (e.g., context.Canceled/context.DeadlineExceeded) and that isFatal classification follows the preserved inner cause.

Reviewed by Cursor Bugbot for commit abe0e68. Bugbot is set up for automated code reviews on this repo. Configure here.

@johnelliott johnelliott force-pushed the hackathon/handlers-default-errwrap branch from d38f111 to b0679ff Compare April 29, 2026 17:13
@johnelliott johnelliott marked this pull request as ready for review April 29, 2026 17:27
@johnelliott johnelliott requested a review from a team April 29, 2026 17:27
@johnelliott johnelliott requested review from a team as code owners April 29, 2026 17:27
johnelliott added a commit that referenced this pull request Apr 29, 2026
…entity

pkg/handlers/default.go:139 wrapped the CancellableWrite failure with %v,
which dropped context.DeadlineExceeded / context.Canceled from the
errors.Is chain on the returned ErrProcessingFatal. The downstream
measureLatencyAndHandleErrors classifier uses
errors.Is(err, context.DeadlineExceeded) to increment the timeout metric
and apply the "error processing chunk" framing; with %v that branch was
unreachable for this failure path, causing the timeout metric to
under-count and the chunk-processing framing to be missing.

Switching to %w preserves both the outer ErrProcessingFatal wrap (so
isFatal still classifies correctly) and the inner ctx.Err() identity
(so the timeout classifier fires). Multiple %w verbs in a single
fmt.Errorf are supported on Go 1.20+; this repo is on Go 1.24.

Same shape as #4929's fix at line 137 but feeds a different
classifier (metric correctness rather than control flow).

Adds a regression test that constructs a dead context, drives a
chunk-error through handleNonArchiveContent against an unbuffered
output channel to force CancellableWrite to fail, and asserts both
the outer ErrProcessingFatal wrap and the inner ctx.Err() identity
are preserved for both DeadlineExceeded and Canceled.
casey-tran
casey-tran previously approved these changes Apr 29, 2026
@casey-tran casey-tran dismissed their stale review April 29, 2026 18:50

Check assert

Comment on lines +169 to +172
assert.True(t, errors.Is(got[0].Err, ErrProcessingWarning),
"outer ErrProcessingWarning wrap should be preserved")
assert.True(t, errors.Is(got[0].Err, tc.innerErr),
"inner cause should be inspectable via errors.Is")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to use assert.ErrorIs for these?

@johnelliott johnelliott force-pushed the hackathon/handlers-default-errwrap branch from b0679ff to abe0e68 Compare May 4, 2026 20:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants