Skip to content

feat(tracing): skip OTLP error recording for selected errors#1317

Closed
sylr wants to merge 2 commits into
mainfrom
feat/discard-errors-from-traces
Closed

feat(tracing): skip OTLP error recording for selected errors#1317
sylr wants to merge 2 commits into
mainfrom
feat/discard-errors-from-traces

Conversation

@sylr

@sylr sylr commented Apr 1, 2026

Copy link
Copy Markdown
Contributor

Add TraceOption with SkipErrorRecordingIf and WithSpanStartOptions so Trace and TraceWithMetric can omit otlp.RecordError for expected cases (e.g. "not found" on ReadLogWithIdempotencyKey).

Migrate ImportLog span options to WithSpanStartOptions. Inline the InsertTransaction metric finalizer into the traced callback because TraceWithMetric now takes trace options as its trailing variadic.

Made-with: Cursor

Add TraceOption with SkipErrorRecordingIf and WithSpanStartOptions so
Trace and TraceWithMetric can omit otlp.RecordError for expected cases
(e.g. not found on ReadLogWithIdempotencyKey).

Migrate ImportLog span options to WithSpanStartOptions. Inline the
InsertTransaction metric finalizer into the traced callback because
TraceWithMetric now takes trace options as its trailing variadic.

Made-with: Cursor
Signed-off-by: Sylvain Rabot <sylvain@formance.com>
@sylr sylr requested a review from a team as a code owner April 1, 2026 14:08
@coderabbitai

coderabbitai Bot commented Apr 1, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Walkthrough

Refactors tracing APIs to accept TraceOption objects. Introduces WithSpanStartOptions and SkipErrorRecordingIf options, conditions error recording on predicates, and updates call sites to pass span start options and skip-not-found behavior; removes the finalizer callback pattern in TraceWithMetric.

Changes

Tracing changes

Layer / File(s) Summary
Tracing core API and options
internal/tracing/tracing.go
Introduced TraceOption interface and WithSpanStartOptions, SkipErrorRecordingIf. Trace and TraceWithMetric now accept ...TraceOption, build traceSettings, pass spanStartOptions into tracer.Start, and skip observe.RecordError when a predicate matches.
Transaction insertion tracing update
internal/storage/ledger/transactions.go
Moved span attribute setting for transaction.id, transaction.timestamp, and transaction.reference into the main TraceWithMetric callback and removed the trailing finalizer callback.
ReadLog skip-not-found option
internal/storage/ledger/logs.go
ReadLogWithIdempotencyKey adds tracing.SkipErrorRecordingIf(postgres.IsNotFoundError) so Postgres not-found errors are not recorded by observability.
Controller span start options usage
internal/controller/ledger/controller_default.go
Replaced inline trace.WithNewRoot() and trace.WithLinks(trace.LinkFromContext(ctx)) parameters with tracing.WithSpanStartOptions(...) when invoking tracing.Trace.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 In tunnels of code I hop and sing,
Options in pockets, no trailing string.
Spans start tidy, errors politely skipped,
Metrics still counted, logs tightly zipped.
A nibble of change—tracing freshly clipped.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title directly describes the main feature: adding ability to skip OTLP error recording for selected errors via TraceOption API, which is the primary change across all modified files.
Description check ✅ Passed The description accurately summarizes the key changes: introduces TraceOption API with SkipErrorRecordingIf and WithSpanStartOptions, migrates ImportLog to use these options, and inlines InsertTransaction finalizer due to signature changes.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/discard-errors-from-traces

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies"


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
internal/tracing/tracing.go (1)

127-163: Well-designed TraceOption API with clean extensibility.

The functional options pattern is idiomatic Go, the OR semantics for multiple predicates is clearly documented, and the implementation correctly returns errors to callers unchanged while only affecting OTLP recording.

One optional defensive improvement: consider guarding against nil predicates to prevent a panic if a caller accidentally passes nil.

🛡️ Optional: guard against nil predicate
 func SkipErrorRecordingIf(predicate func(error) bool) TraceOption {
+	if predicate == nil {
+		return traceOptionFunc(func(*traceSettings) {}) // no-op
+	}
 	return traceOptionFunc(func(s *traceSettings) {
 		s.skipRecordErrorIfs = append(s.skipRecordErrorIfs, predicate)
 	})
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/tracing/tracing.go` around lines 127 - 163, Guard against nil
predicates to avoid panics: in SkipErrorRecordingIf, ignore any nil predicate
arguments instead of appending them to traceSettings.skipRecordErrorIfs;
additionally make shouldSkipErrorRecording resilient by checking for nil before
calling predicate (i.e., skip nil entries). Update references in
SkipErrorRecordingIf and traceSettings.shouldSkipErrorRecording to perform these
nil checks so a mistakenly passed nil predicate won’t cause a runtime panic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@internal/tracing/tracing.go`:
- Around line 127-163: Guard against nil predicates to avoid panics: in
SkipErrorRecordingIf, ignore any nil predicate arguments instead of appending
them to traceSettings.skipRecordErrorIfs; additionally make
shouldSkipErrorRecording resilient by checking for nil before calling predicate
(i.e., skip nil entries). Update references in SkipErrorRecordingIf and
traceSettings.shouldSkipErrorRecording to perform these nil checks so a
mistakenly passed nil predicate won’t cause a runtime panic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ed6a493d-1ea4-451e-90d6-2e2a61ba2042

📥 Commits

Reviewing files that changed from the base of the PR and between 62ad458 and 229bc5e.

📒 Files selected for processing (4)
  • internal/controller/ledger/controller_default.go
  • internal/storage/ledger/logs.go
  • internal/storage/ledger/transactions.go
  • internal/tracing/tracing.go

@codecov

codecov Bot commented Apr 1, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 65.38462% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.93%. Comparing base (6f8d486) to head (a51e3de).

Files with missing lines Patch % Lines
internal/tracing/tracing.go 70.83% 4 Missing and 3 partials ⚠️
internal/controller/ledger/controller_default.go 0.00% 1 Missing ⚠️
internal/storage/ledger/transactions.go 0.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1317      +/-   ##
==========================================
- Coverage   80.81%   78.93%   -1.88%     
==========================================
  Files         205      205              
  Lines       11131    11265     +134     
==========================================
- Hits         8995     8892     -103     
- Misses       1594     1720     +126     
- Partials      542      653     +111     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

gfyrag
gfyrag previously approved these changes Apr 2, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
internal/storage/ledger/transactions.go (1)

140-143: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use the actual transaction ID value for trace attributes.

On Line 141, fmt.Sprint(tx.ID) serializes the pointer, not the ID value. This records incorrect trace data for transaction.id.

Suggested fix
+import "strconv"
...
-			trace.SpanFromContext(ctx).SetAttributes(
-				attribute.String("transaction.id", fmt.Sprint(tx.ID)),
+			txID := ""
+			if tx.ID != nil {
+				txID = strconv.FormatUint(*tx.ID, 10)
+			}
+			trace.SpanFromContext(ctx).SetAttributes(
+				attribute.String("transaction.id", txID),
 				attribute.String("transaction.timestamp", tx.Timestamp.Format(time.RFC3339Nano)),
 				attribute.String("transaction.reference", tx.Reference),
 			)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/storage/ledger/transactions.go` around lines 140 - 143, The trace
attribute for transaction ID is recording the pointer address because the code
uses fmt.Sprint(tx.ID); change it to serialize the actual ID value (e.g.,
dereference tx.ID or convert the underlying integer/string value) before passing
to attribute.String so trace.SpanFromContext(ctx).SetAttributes receives the
real ID; update the statement that builds attribute.String("transaction.id",
...) to use the concrete value (for example fmt.Sprint(*tx.ID) or
strconv.FormatInt(*tx.ID, 10) depending on tx.ID's type) alongside the existing
transaction.timestamp and transaction.reference fields.
internal/tracing/tracing.go (2)

157-158: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the exported comment to match the implementation.

SkipErrorRecordingIf now gates observe.RecordError, not otlp.RecordError, so the current doc text is stale and misleading.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/tracing/tracing.go` around lines 157 - 158, The exported comment for
SkipErrorRecordingIf is stale—update its docstring to reflect that the function
now gates observe.RecordError (not otlp.RecordError) and describe the behavior
accurately (i.e., it skips calling observe.RecordError when the predicate
returns true while still returning the original error; multiple options compose
with OR semantics). Locate the SkipErrorRecordingIf declaration and replace
references to otlp.RecordError with observe.RecordError, ensuring the
description matches the current implementation and mentions that the error is
returned unchanged.

101-125: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Record latency on failed executions too.

The early return on err skips both histogram.Record(...) and the latency span attribute, so failed operations disappear from this latency signal entirely.

Suggested fix
 func TraceWithMetric[RET any](
 	ctx context.Context,
 	operationName string,
 	tracer trace.Tracer,
 	histogram metric.Int64Histogram,
 	fn func(ctx context.Context) (RET, error),
 	opts ...TraceOption,
 ) (RET, error) {
 	var zeroRet RET
 
 	return Trace(ctx, tracer, operationName, func(ctx context.Context) (RET, error) {
-		now := time.Now()
+		startedAt := time.Now()
+		defer func() {
+			latency := time.Since(startedAt)
+			histogram.Record(ctx, latency.Milliseconds())
+			trace.SpanFromContext(ctx).SetAttributes(attribute.String("latency", latency.String()))
+		}()
+
 		ret, err := fn(ctx)
 		if err != nil {
 			// Error is already recorded by Trace(), no need to record it here
 			return zeroRet, err
 		}
-
-		latency := time.Since(now)
-		histogram.Record(ctx, latency.Milliseconds())
-		trace.SpanFromContext(ctx).SetAttributes(attribute.String("latency", latency.String()))
 
 		return ret, nil
 	}, opts...)
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/tracing/tracing.go` around lines 101 - 125, In TraceWithMetric, the
current early return on error prevents recording latency and span attributes for
failed executions; change the flow so latency is captured regardless of fn's
error by capturing start time before calling fn and using a post-call block
(e.g., defer or explicit after-call code) that computes latency, calls
histogram.Record(ctx, latency.Milliseconds()), and sets
trace.SpanFromContext(ctx).SetAttributes(attribute.String("latency",
latency.String())), then return the original ret and err; update the body of
TraceWithMetric (the anonymous wrapper passed to Trace) to perform recording
unconditionally while preserving the existing returned values.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@internal/storage/ledger/transactions.go`:
- Around line 140-143: The trace attribute for transaction ID is recording the
pointer address because the code uses fmt.Sprint(tx.ID); change it to serialize
the actual ID value (e.g., dereference tx.ID or convert the underlying
integer/string value) before passing to attribute.String so
trace.SpanFromContext(ctx).SetAttributes receives the real ID; update the
statement that builds attribute.String("transaction.id", ...) to use the
concrete value (for example fmt.Sprint(*tx.ID) or strconv.FormatInt(*tx.ID, 10)
depending on tx.ID's type) alongside the existing transaction.timestamp and
transaction.reference fields.

In `@internal/tracing/tracing.go`:
- Around line 157-158: The exported comment for SkipErrorRecordingIf is
stale—update its docstring to reflect that the function now gates
observe.RecordError (not otlp.RecordError) and describe the behavior accurately
(i.e., it skips calling observe.RecordError when the predicate returns true
while still returning the original error; multiple options compose with OR
semantics). Locate the SkipErrorRecordingIf declaration and replace references
to otlp.RecordError with observe.RecordError, ensuring the description matches
the current implementation and mentions that the error is returned unchanged.
- Around line 101-125: In TraceWithMetric, the current early return on error
prevents recording latency and span attributes for failed executions; change the
flow so latency is captured regardless of fn's error by capturing start time
before calling fn and using a post-call block (e.g., defer or explicit
after-call code) that computes latency, calls histogram.Record(ctx,
latency.Milliseconds()), and sets
trace.SpanFromContext(ctx).SetAttributes(attribute.String("latency",
latency.String())), then return the original ret and err; update the body of
TraceWithMetric (the anonymous wrapper passed to Trace) to perform recording
unconditionally while preserving the existing returned values.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4b86c749-b4df-49c0-9cb5-d2ceffaea76d

📥 Commits

Reviewing files that changed from the base of the PR and between 229bc5e and a51e3de.

📒 Files selected for processing (4)
  • internal/controller/ledger/controller_default.go
  • internal/storage/ledger/logs.go
  • internal/storage/ledger/transactions.go
  • internal/tracing/tracing.go

@flemzord flemzord closed this Jun 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants