NikiforovAll/otel-elastic-example

OTel + Elastic APM — Sampling Done Right (.NET)

A minimal, self-contained example showing how to correctly configure OpenTelemetry sampling in .NET so that Elastic APM computes accurate throughput and latency metrics. As a bonus, it also demonstrates dual-export to Azure Monitor (App Insights) alongside Elastic.

The Problem

Missing Sampling Probability in Tracestate

.NET's built-in TraceIdRatioBasedSampler drops traces correctly but does not propagate the sampling probability via tracestate. Elastic APM reads representative_count exclusively from the tracestate header (ot=p:<value>). Without that entry, every sampled trace is counted as a single trace, so throughput and latency metrics are wrong: throughput is underreported by a factor of 1/ratio.

For example, with a 25% sample rate (ratio=0.25), Elastic APM should report representative_count=4 (each sampled trace represents 4 actual traces). Without the ot=p:2 entry in tracestate, it reports representative_count=1, making throughput appear 4x lower than reality.
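The relationship between the sampling ratio, the p-value, and representative_count can be checked with a few lines of C#. This is a minimal sketch using the formulas stated under Verification (p = -log2(ratio), representative_count = 2^p); the helper names `PValue` and `RepresentativeCount` are hypothetical and not taken from the repository. The mapping is exact only for ratios that are powers of two, so the sketch rounds to the nearest integer p.

```csharp
using System;

// p = -log2(ratio). Exact only when ratio is a power of two (1, 0.5, 0.25, ...);
// other ratios are rounded to the nearest integer p (an assumption of this sketch).
static int PValue(double ratio)
{
    if (ratio <= 0 || ratio > 1)
        throw new ArgumentOutOfRangeException(nameof(ratio), "ratio must be in (0, 1]");
    return (int)Math.Round(-Math.Log2(ratio));
}

// Each sampled trace stands for 2^p original traces.
static long RepresentativeCount(int p) => 1L << p;

foreach (double ratio in new[] { 1.0, 0.5, 0.25 })
{
    int p = PValue(ratio);
    Console.WriteLine($"ratio={ratio} -> ot=p:{p} -> representative_count={RepresentativeCount(p)}");
}
```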

Bonus: Azure Monitor Silently Overrides Your Sampler

If you also export to App Insights, AddAzureMonitorTraceExporter() registers a deferred builder that calls SetSampler() with its own RateLimitedSampler. This silently replaces whatever sampler you configured — including the ConsistentProbabilitySampler wrapper needed for Elastic APM.
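Why the override goes unnoticed can be seen in a toy model. The sketch below does not use the OpenTelemetry or Azure Monitor APIs; it only assumes that deferred callbacks run in registration order at build time and that the last one to set the sampler wins. The function name `BuildSampler` and the string stand-ins for sampler types are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Toy model: the app sets its sampler eagerly, the exporter queues a deferred
// callback that sets its own, and callbacks run in registration order.
static string BuildSampler(bool reapplyAfterExporter)
{
    string sampler = "ConsistentProbabilitySampler";          // configured by the app
    var deferred = new List<Action>();

    // Stands in for the exporter's deferred SetSampler call.
    deferred.Add(() => sampler = "RateLimitedSampler");

    // The fix: queue another callback after the exporter's to restore the app's sampler.
    if (reapplyAfterExporter)
        deferred.Add(() => sampler = "ConsistentProbabilitySampler");

    foreach (Action step in deferred) step();                 // "build" time
    return sampler;
}

Console.WriteLine($"without re-apply: {BuildSampler(false)}");
Console.WriteLine($"with re-apply:    {BuildSampler(true)}");
```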

Solution

This example demonstrates two patterns:

  1. ConsistentProbabilitySampler — wraps the standard TraceIdRatioBasedSampler and injects ot=p:<value> into tracestate, enabling Elastic APM to compute accurate representative_count.

  2. Deferred sampler re-application (for dual-export scenarios) — uses IDeferredTracerProviderBuilder to re-apply the correct sampler after Azure Monitor's deferred SetSampler override.
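The tracestate injection in the first pattern comes down to string handling on the header value. The following is a minimal sketch of that step only, assuming the W3C tracestate layout (comma-separated `key=value` entries) and an `ot` entry made of semicolon-separated `name:value` fields. `WithPValue` is a hypothetical helper; the repository's ConsistentProbabilitySampler may handle this differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Returns a tracestate whose "ot" entry carries p:<p>, keeping other "ot"
// fields and all other vendors' entries. The updated "ot" entry is placed first.
static string WithPValue(string? traceState, int p)
{
    var otFields = new List<string>();   // existing "ot" fields, minus any stale p
    var others = new List<string>();     // entries owned by other vendors, in order

    foreach (string raw in (traceState ?? "").Split(',', StringSplitOptions.RemoveEmptyEntries))
    {
        string entry = raw.Trim();
        if (entry.StartsWith("ot=", StringComparison.Ordinal))
        {
            otFields.AddRange(entry.Substring(3)
                .Split(';', StringSplitOptions.RemoveEmptyEntries)
                .Where(f => !f.StartsWith("p:", StringComparison.Ordinal)));
        }
        else if (entry.Length > 0)
        {
            others.Add(entry);
        }
    }

    otFields.Insert(0, $"p:{p}");
    others.Insert(0, "ot=" + string.Join(";", otFields));
    return string.Join(",", others);
}

Console.WriteLine(WithPValue(null, 1));
Console.WriteLine(WithPValue("vendor=abc", 2));
```

In a real sampler, a value like this would be returned as the trace state of the sampling result for sampled spans, so that it travels with the trace context to downstream services and exporters.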

Architecture

---
config:
  theme: default
  flowchart:
    nodeSpacing: 20
    rankSpacing: 40
    padding: 10
    wrappingWidth: 150
---
graph TB
    PB[ParentBasedSampler] --> CP[ConsistentProbabilitySampler]
    CP -->|"ot=p:N → tracestate"| TR[TraceIdRatioBasedSampler]
    TR --> OTLP[OTLP Exporter]
    TR --> AzMon[Azure Monitor Exporter]
    OTLP --> Elastic[Elastic APM]
    AzMon --> AppInsights[App Insights]

    style PB fill:#f5f5f5,stroke:#333
    style CP fill:#ffe0b2,stroke:#e65100
    style TR fill:#f5f5f5,stroke:#333
    style Elastic fill:#00bfb3,color:#000
    style AppInsights fill:#0078d4,color:#fff

Quick Start

Prerequisites

Run

dotnet run apphost.cs

This starts the Aspire dashboard and the example API. Open the dashboard URL printed in the console.

Test

# Simple endpoint
curl http://localhost:5000/

# Endpoint with custom spans, random delays, occasional errors
curl http://localhost:5000/work

Configuration

All settings are configurable via environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| OTEL_SERVICE_NAME | otel-example-api | Service name in traces |
| OTEL_EXPORTER_OTLP_ENDPOINT | http://localhost:4318 | OTLP collector endpoint |
| OTEL_TRACES_SAMPLER | parentbased_traceidratio | Sampler type |
| OTEL_TRACES_SAMPLER_ARG | 0.5 | Sampling ratio (0.0–1.0) |
| APPLICATIONINSIGHTS_CONNECTION_STRING | (not set) | Set to enable dual-export to App Insights |
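As an illustration of how these variables and their defaults might be resolved at startup, here is a small sketch using only the .NET base class library. The helpers `EnvOr` and `ParseRatio` are hypothetical; the example application may instead rely on the OpenTelemetry SDK's own handling of these variables.

```csharp
using System;
using System.Globalization;

// Returns the environment variable's value, or the fallback when unset or blank.
static string EnvOr(string name, string fallback)
{
    string? value = Environment.GetEnvironmentVariable(name);
    return string.IsNullOrWhiteSpace(value) ? fallback : value;
}

// Parses the sampling ratio with the invariant culture so "0.5" means the same
// on every machine, and rejects values outside [0.0, 1.0].
static double ParseRatio(string text)
{
    if (!double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out double ratio)
        || double.IsNaN(ratio) || ratio < 0.0 || ratio > 1.0)
    {
        throw new FormatException($"OTEL_TRACES_SAMPLER_ARG must be a number in [0.0, 1.0], got '{text}'");
    }
    return ratio;
}

string serviceName = EnvOr("OTEL_SERVICE_NAME", "otel-example-api");
string otlpEndpoint = EnvOr("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318");
string samplerType = EnvOr("OTEL_TRACES_SAMPLER", "parentbased_traceidratio");
double samplerRatio = ParseRatio(EnvOr("OTEL_TRACES_SAMPLER_ARG", "0.5"));
// Dual-export is enabled only when a connection string is present.
bool dualExport = !string.IsNullOrWhiteSpace(
    Environment.GetEnvironmentVariable("APPLICATIONINSIGHTS_CONNECTION_STRING"));

Console.WriteLine($"{serviceName} -> {otlpEndpoint}, sampler={samplerType}, dual-export={dualExport}");
```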

Verification

  1. Aspire Dashboard: Check traces with nested spans after hitting /work
  2. Elastic APM (if OTLP collector running):
    • tracestate contains ot=p:1 (for 0.5 ratio, since p = -log2(0.5) = 1)
    • transaction.representative_count = 2 (since 2^1 = 2)
  3. App Insights (if connection string set): Verify traces appear in both destinations
  4. Startup logs: Active OTel sampler: ParentBasedSampler, root: ConsistentProbabilitySampler, ratio: ...
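When checking step 2 by hand, it can help to compute the expected representative_count directly from a tracestate value copied out of a span. The sketch below assumes the `ot=p:<value>` layout described above and the relation representative_count = 2^p; `RepresentativeCountFrom` is a hypothetical helper, not part of the example application.

```csharp
using System;
using System.Globalization;

// Extracts p from the "ot" entry of a tracestate header and returns 2^p,
// or null when there is no usable p field.
static long? RepresentativeCountFrom(string? traceState)
{
    foreach (string raw in (traceState ?? "").Split(',', StringSplitOptions.RemoveEmptyEntries))
    {
        string entry = raw.Trim();
        if (!entry.StartsWith("ot=", StringComparison.Ordinal)) continue;

        foreach (string field in entry.Substring(3).Split(';'))
        {
            if (field.StartsWith("p:", StringComparison.Ordinal)
                && int.TryParse(field.Substring(2), NumberStyles.None,
                                CultureInfo.InvariantCulture, out int p)
                && p >= 0 && p < 63)   // upper bound keeps the shift within a long
            {
                return 1L << p;
            }
        }
    }
    return null;
}

long? count = RepresentativeCountFrom("ot=p:1");
Console.WriteLine(count.HasValue
    ? $"representative_count={count.Value}"
    : "no ot=p entry found");
```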

References

Suggestions

This example can be extended with additional instrumentation as needed:

  • Hangfire background job tracing with context propagation
  • EntityFramework / Npgsql database instrumentation
  • FusionCache / Elasticsearch client instrumentation
  • Custom activity wrappers for domain-specific spans

About

Minimal .NET example showing correct OpenTelemetry sampling config for Elastic APM (EDOT) with accurate throughput/latency metrics, plus dual-export to Azure Monitor
