Background
While designing observability for runkit I reviewed enginekit/telemetry in detail against what we need. Recorder.Info and Recorder.Error make one opinionated choice that is wrong for our use case -- and is probably wrong for persistencekit too.
Problem
Info and Error emit span events
Recorder.Info and Recorder.Error both call span.AddEvent(event) on the active span in addition to emitting an OTel log record.
This is unnecessary coupling. Log records already carry trace_id and span_id as first-class OTel log fields, which is sufficient for log-trace correlation in any compliant backend. Span events serve a different purpose -- annotating a span's timeline -- and that decision belongs to the caller, not the logger.
In practice, persistencekit calls Recorder.Info on every successful operation (e.g. journal.get.ok). Under load that's a large volume of span events that add no value.
Fix: remove span event emission from Info and Error entirely.
What to keep
- Auto-counters on
StartSpan (operations, operations_in_flight) -- these are sampling-independent, which means accurate operation throughput metrics at any trace sample rate. Worth keeping.
Error marking the active span (span.RecordError + span.SetStatus(codes.Error)) -- correct and desirable. If you're logging an error, the span should be marked as errored.
- Auto error counter on
Error -- same reasoning as operation counters.
- Everything else in the package (
Provider, Attr constructors, Instrument, NewSLogProvider, NewTestProvider) -- all fine as-is.
Impact on persistencekit
persistencekit uses Recorder.StartSpan, Info, Error, and metric instruments. The only behavioural change it would see is Info no longer emitting span events -- which is a net improvement.
Background
While designing observability for runkit I reviewed
enginekit/telemetryin detail against what we need.Recorder.InfoandRecorder.Errormake one opinionated choice that is wrong for our use case -- and is probably wrong forpersistencekittoo.Problem
InfoandErroremit span eventsRecorder.InfoandRecorder.Errorboth callspan.AddEvent(event)on the active span in addition to emitting an OTel log record.This is unnecessary coupling. Log records already carry
trace_idandspan_idas first-class OTel log fields, which is sufficient for log-trace correlation in any compliant backend. Span events serve a different purpose -- annotating a span's timeline -- and that decision belongs to the caller, not the logger.In practice,
persistencekitcallsRecorder.Infoon every successful operation (e.g.journal.get.ok). Under load that's a large volume of span events that add no value.Fix: remove span event emission from
InfoandErrorentirely.What to keep
StartSpan(operations,operations_in_flight) -- these are sampling-independent, which means accurate operation throughput metrics at any trace sample rate. Worth keeping.Errormarking the active span (span.RecordError+span.SetStatus(codes.Error)) -- correct and desirable. If you're logging an error, the span should be marked as errored.Error-- same reasoning as operation counters.Provider,Attrconstructors,Instrument,NewSLogProvider,NewTestProvider) -- all fine as-is.Impact on persistencekit
persistencekitusesRecorder.StartSpan,Info,Error, and metric instruments. The only behavioural change it would see isInfono longer emitting span events -- which is a net improvement.