feat(tracing): skip OTLP error recording for selected errors#1317
Conversation
Add TraceOption with SkipErrorRecordingIf and WithSpanStartOptions so Trace and TraceWithMetric can omit otlp.RecordError for expected cases (e.g. not found on ReadLogWithIdempotencyKey). Migrate ImportLog span options to WithSpanStartOptions. Inline the InsertTransaction metric finalizer into the traced callback because TraceWithMetric now takes trace options as its trailing variadic. Made-with: Cursor Signed-off-by: Sylvain Rabot <sylvain@formance.com>
WalkthroughRefactors tracing APIs to accept TraceOption objects. Introduces WithSpanStartOptions and SkipErrorRecordingIf options, conditions error recording on predicates, and updates call sites to pass span start options and skip-not-found behavior; removes the finalizer callback pattern in TraceWithMetric. ChangesTracing changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.12.2)level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies" Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
internal/tracing/tracing.go (1)
127-163: Well-designed TraceOption API with clean extensibility.The functional options pattern is idiomatic Go, the OR semantics for multiple predicates is clearly documented, and the implementation correctly returns errors to callers unchanged while only affecting OTLP recording.
One optional defensive improvement: consider guarding against nil predicates to prevent a panic if a caller accidentally passes
nil.🛡️ Optional: guard against nil predicate
func SkipErrorRecordingIf(predicate func(error) bool) TraceOption { + if predicate == nil { + return traceOptionFunc(func(*traceSettings) {}) // no-op + } return traceOptionFunc(func(s *traceSettings) { s.skipRecordErrorIfs = append(s.skipRecordErrorIfs, predicate) }) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/tracing/tracing.go` around lines 127 - 163, Guard against nil predicates to avoid panics: in SkipErrorRecordingIf, ignore any nil predicate arguments instead of appending them to traceSettings.skipRecordErrorIfs; additionally make shouldSkipErrorRecording resilient by checking for nil before calling predicate (i.e., skip nil entries). Update references in SkipErrorRecordingIf and traceSettings.shouldSkipErrorRecording to perform these nil checks so a mistakenly passed nil predicate won’t cause a runtime panic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@internal/tracing/tracing.go`:
- Around line 127-163: Guard against nil predicates to avoid panics: in
SkipErrorRecordingIf, ignore any nil predicate arguments instead of appending
them to traceSettings.skipRecordErrorIfs; additionally make
shouldSkipErrorRecording resilient by checking for nil before calling predicate
(i.e., skip nil entries). Update references in SkipErrorRecordingIf and
traceSettings.shouldSkipErrorRecording to perform these nil checks so a
mistakenly passed nil predicate won’t cause a runtime panic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ed6a493d-1ea4-451e-90d6-2e2a61ba2042
📒 Files selected for processing (4)
internal/controller/ledger/controller_default.gointernal/storage/ledger/logs.gointernal/storage/ledger/transactions.gointernal/tracing/tracing.go
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1317 +/- ##
==========================================
- Coverage 80.81% 78.93% -1.88%
==========================================
Files 205 205
Lines 11131 11265 +134
==========================================
- Hits 8995 8892 -103
- Misses 1594 1720 +126
- Partials 542 653 +111 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
internal/storage/ledger/transactions.go (1)
140-143:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUse the actual transaction ID value for trace attributes.
On Line 141,
fmt.Sprint(tx.ID)serializes the pointer, not the ID value. This records incorrect trace data fortransaction.id.Suggested fix
+import "strconv" ... - trace.SpanFromContext(ctx).SetAttributes( - attribute.String("transaction.id", fmt.Sprint(tx.ID)), + txID := "" + if tx.ID != nil { + txID = strconv.FormatUint(*tx.ID, 10) + } + trace.SpanFromContext(ctx).SetAttributes( + attribute.String("transaction.id", txID), attribute.String("transaction.timestamp", tx.Timestamp.Format(time.RFC3339Nano)), attribute.String("transaction.reference", tx.Reference), )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/storage/ledger/transactions.go` around lines 140 - 143, The trace attribute for transaction ID is recording the pointer address because the code uses fmt.Sprint(tx.ID); change it to serialize the actual ID value (e.g., dereference tx.ID or convert the underlying integer/string value) before passing to attribute.String so trace.SpanFromContext(ctx).SetAttributes receives the real ID; update the statement that builds attribute.String("transaction.id", ...) to use the concrete value (for example fmt.Sprint(*tx.ID) or strconv.FormatInt(*tx.ID, 10) depending on tx.ID's type) alongside the existing transaction.timestamp and transaction.reference fields.internal/tracing/tracing.go (2)
157-158:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUpdate the exported comment to match the implementation.
SkipErrorRecordingIfnow gatesobserve.RecordError, nototlp.RecordError, so the current doc text is stale and misleading.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/tracing/tracing.go` around lines 157 - 158, The exported comment for SkipErrorRecordingIf is stale—update its docstring to reflect that the function now gates observe.RecordError (not otlp.RecordError) and describe the behavior accurately (i.e., it skips calling observe.RecordError when the predicate returns true while still returning the original error; multiple options compose with OR semantics). Locate the SkipErrorRecordingIf declaration and replace references to otlp.RecordError with observe.RecordError, ensuring the description matches the current implementation and mentions that the error is returned unchanged.
101-125:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRecord latency on failed executions too.
The early return on
errskips bothhistogram.Record(...)and thelatencyspan attribute, so failed operations disappear from this latency signal entirely.Suggested fix
func TraceWithMetric[RET any]( ctx context.Context, operationName string, tracer trace.Tracer, histogram metric.Int64Histogram, fn func(ctx context.Context) (RET, error), opts ...TraceOption, ) (RET, error) { var zeroRet RET return Trace(ctx, tracer, operationName, func(ctx context.Context) (RET, error) { - now := time.Now() + startedAt := time.Now() + defer func() { + latency := time.Since(startedAt) + histogram.Record(ctx, latency.Milliseconds()) + trace.SpanFromContext(ctx).SetAttributes(attribute.String("latency", latency.String())) + }() + ret, err := fn(ctx) if err != nil { // Error is already recorded by Trace(), no need to record it here return zeroRet, err } - - latency := time.Since(now) - histogram.Record(ctx, latency.Milliseconds()) - trace.SpanFromContext(ctx).SetAttributes(attribute.String("latency", latency.String())) return ret, nil }, opts...) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/tracing/tracing.go` around lines 101 - 125, In TraceWithMetric, the current early return on error prevents recording latency and span attributes for failed executions; change the flow so latency is captured regardless of fn's error by capturing start time before calling fn and using a post-call block (e.g., defer or explicit after-call code) that computes latency, calls histogram.Record(ctx, latency.Milliseconds()), and sets trace.SpanFromContext(ctx).SetAttributes(attribute.String("latency", latency.String())), then return the original ret and err; update the body of TraceWithMetric (the anonymous wrapper passed to Trace) to perform recording unconditionally while preserving the existing returned values.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@internal/storage/ledger/transactions.go`:
- Around line 140-143: The trace attribute for transaction ID is recording the
pointer address because the code uses fmt.Sprint(tx.ID); change it to serialize
the actual ID value (e.g., dereference tx.ID or convert the underlying
integer/string value) before passing to attribute.String so
trace.SpanFromContext(ctx).SetAttributes receives the real ID; update the
statement that builds attribute.String("transaction.id", ...) to use the
concrete value (for example fmt.Sprint(*tx.ID) or strconv.FormatInt(*tx.ID, 10)
depending on tx.ID's type) alongside the existing transaction.timestamp and
transaction.reference fields.
In `@internal/tracing/tracing.go`:
- Around line 157-158: The exported comment for SkipErrorRecordingIf is
stale—update its docstring to reflect that the function now gates
observe.RecordError (not otlp.RecordError) and describe the behavior accurately
(i.e., it skips calling observe.RecordError when the predicate returns true
while still returning the original error; multiple options compose with OR
semantics). Locate the SkipErrorRecordingIf declaration and replace references
to otlp.RecordError with observe.RecordError, ensuring the description matches
the current implementation and mentions that the error is returned unchanged.
- Around line 101-125: In TraceWithMetric, the current early return on error
prevents recording latency and span attributes for failed executions; change the
flow so latency is captured regardless of fn's error by capturing start time
before calling fn and using a post-call block (e.g., defer or explicit
after-call code) that computes latency, calls histogram.Record(ctx,
latency.Milliseconds()), and sets
trace.SpanFromContext(ctx).SetAttributes(attribute.String("latency",
latency.String())), then return the original ret and err; update the body of
TraceWithMetric (the anonymous wrapper passed to Trace) to perform recording
unconditionally while preserving the existing returned values.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 4b86c749-b4df-49c0-9cb5-d2ceffaea76d
📒 Files selected for processing (4)
internal/controller/ledger/controller_default.gointernal/storage/ledger/logs.gointernal/storage/ledger/transactions.gointernal/tracing/tracing.go
Add TraceOption with SkipErrorRecordingIf and WithSpanStartOptions so Trace and TraceWithMetric can omit otlp.RecordError for expected cases (e.g. "not found" on ReadLogWithIdempotencyKey).
Migrate ImportLog span options to WithSpanStartOptions. Inline the InsertTransaction metric finalizer into the traced callback because TraceWithMetric now takes trace options as its trailing variadic.
Made-with: Cursor