Update client-side stats to use light weight Hashtable by dougqh · Pull Request #11382 · DataDog/dd-trace-java

dougqh · 2026-05-15T19:11:06Z

What Does This Do

Replaces the MetricKey based HashMap with a new AggregateTable based on the light Hashtable

Motivation

By using the light Hashtable, I'm able to avoid the biggest source of allocation in client-side stats: MetricKey

Hashtable provides utilities for searching the entries without constructing a new composite key object

First, the components are hashed together to find the corresponding bucket
Then the bucket can be traversed to see if the entries match the key components

Additionally, any amount of data / metadata can be stored in the entry as well

The end result is that both MetricKey and AggregateMetrics can be merged into a single class AggregateEntry that is only constructed when there's no existing matching entry

Additional Notes

Stacked on top of #11409 (dougqh/util-hashtable) — review that first; the diff shown here is only the work that's new beyond that PR.

Restructures the consumer-side aggregate store. Three logical commits, intended to be reviewed in order:

1. Add `AggregateTable` + `AggregateEntry` backed by `Hashtable`

Introduces a multi-key hash table that lets the consumer thread look up the {labels → counters} entry directly from a SpanSnapshot's raw fields — no MetricKey allocation per snapshot, no per-snapshot UTF8 cache lookups, no CHM operations. Hot-path lookup is keyHash compute → Hashtable.Support.bucket → bucket walk → matches(keyHash, snapshot) → returned entry has the counters to mutate in place.

This commit is standalone — no call sites yet, only the new classes + unit tests for hit/miss/cap-overrun/expunge/clear behavior.

2. Swap `Aggregator` to use `AggregateTable` + route `disable()` clear through a `ClearSignal`

Replaces LRUCache<MetricKey, AggregateMetric> with AggregateTable in Aggregator. Drops the AggregateExpiry listener — drop reporting (onStatsAggregateDropped) moves to the cap-overrun path inside Drainer.accept.

Threading fix bundled here: ConflatingMetricsAggregator.disable() used to call aggregator.clearAggregates() and inbox.clear() directly from the Sink's IO callback thread, racing with the aggregator thread. That race was tolerable for LinkedHashMap (worst case = corrupted internal state right before everything got cleared anyway); it's not tolerable for Hashtable (chain corruption can NPE or loop). disable() now offers a ClearSignal to the inbox so the aggregator thread itself performs the clear — preserves the single-writer invariant for AggregateTable end-to-end. The offer is best-effort; the system self-heals on a subsequent downgrade cycle if the inbox happens to be full (commented at the call site).

Cap-overrun semantic change: the old LRUCache evicted least-recently-used in O(1). AggregateTable instead scans for a hitCount==0 entry to recycle (O(N) worst-case), and drops the new key if none exists. Practical impact: in steady state, an unrelated burst of new keys gets dropped (and reported via onStatsAggregateDropped) rather than evicting established keys. The cost trade-off is commented at the eviction site — eviction is expected rare because the cap is sized to the working set; cursor-caching is the future option if a workload runs persistently at cap. The existing test that asserted "service0 evicted in favor of service10" is updated to assert the new semantics; the other cap-related test ("evicted entry was already flushed") still passes unchanged.

3. Fold `MetricKey` + `AggregateMetric` into `AggregateEntry`

MetricKey existed for two reasons — being the LRUCache key (replaced by AggregateTable's Hashtable mechanics) and being the labels arg to MetricWriter.add (the only thing left). AggregateMetric was the counter/histogram counterpart. Folds both onto a single AggregateEntry (10 UTF8 label fields + 3 primitives + counters + histograms), changes MetricWriter.add(MetricKey, AggregateMetric) → add(AggregateEntry), and deletes MetricKey.java + MetricKeys.java + AggregateMetric.java.

The 12 UTF8 caches that used to be split between MetricKey (9) and ConflatingMetricsAggregator (3, with overlap) are consolidated on AggregateEntry. One cache per field type now.

Latent bug fix: the prior matches(SpanSnapshot) used Objects.equals on raw fields. If the same logical key was delivered once as String and once as UTF8BytesString (different CharSequence impls of identical content), Objects.equals returned false and the table would split into two entries for the same key. The new matches uses content-equality (UTF8BytesString.toString() returns the underlying String in O(1)), collapsing them correctly.

Test impact: AggregateEntry.of(...) mirrors the prior new MetricKey(...) positional args, so test diffs are mostly mechanical. About 56 test sites migrated across ConflatingMetricAggregatorTest, SerializingMetricWriterTest, and MetricsIntegrationTest.

Review polish

Follow-up commits address review feedback:

Use Hashtable.Support.create(maxAggregates, Support.MAX_RATIO) + Support.bucket + Support.insertHeadEntry(buckets, keyHash, entry) + Support.mutatingTableIterator to delegate to the helpers added on Add Hashtable and LongHashingUtils utilities #11409 — drops ~50 lines of bespoke bucket-array code.
Inline the report-time forEach lambda instead of a static BiConsumer constant (the JIT reuses non-capturing lambdas).
AggregateEntry.matches(long keyHash, SpanSnapshot) overload that pre-checks the hash, so chain walks read as one call.
@Nullable (javax.annotation) annotations on the four nullable label fields + their getters + of(...) parameters.
Objects.equals import in AggregateEntry.equals() (no more fully-qualified refs).
Design-trade-off comments on evictOneStale (O(N) scan rationale) and disable() (best-effort offer rationale).

Benchmarks

2 forks × 5 iter × 15s, producer publish() latency:

	Prior commit (stacked base)	This PR
SimpleSpan bench	3.116 µs/op	3.123 µs/op
DDSpan bench	2.506 µs/op	2.412 µs/op

All within noise — this PR is a consumer-side refactor, so producer publish() shouldn't move much. The win is structural (one less class, no per-miss MetricKey allocation, no double-cache lookups, smaller per-entry footprint) plus higher consumer throughput that lets the inbox keep up at higher sustained producer rates before onStatsInboxFull fires.

Net code delta: +1280 / −903 = +377 lines across 16 files. The growth is dominated by new test coverage (AggregateTableTest, AggregateEntryTest) plus the consolidated UTF8 caches landing on AggregateEntry; the production-code core (less MetricKey + MetricKeys + AggregateMetric minus AggregateEntry's additions) is roughly flat.

Test plan

./gradlew :dd-trace-core:test --tests 'datadog.trace.common.metrics.*' passes (incl. the new AggregateTableTest and AggregateEntryTest)
./gradlew :dd-trace-core:compileJava :dd-trace-core:compileTestGroovy :dd-trace-core:compileJmhJava :dd-trace-core:compileTraceAgentTestGroovy all green
./gradlew spotlessCheck clean
CI muzzle / integration suites
Validate stats.dropped_aggregates semantics at high cardinality (especially the new "drop new on cap overrun" path vs. the old "evict LRU" path)

🤖 Generated with Claude Code

Standalone classes for swapping the consumer-side LRUCache<MetricKey, AggregateMetric> with a multi-key Hashtable in the next commit. No call sites use them yet. - AggregateEntry extends Hashtable.Entry, holds the canonical MetricKey, the mutable AggregateMetric, and copies of the 13 raw SpanSnapshot fields for matches(). The 64-bit lookup hash is computed via chained LongHashingUtils.addToHash calls (no varargs, no boxing of short/boolean). - AggregateTable wraps a Hashtable.Entry[] from Hashtable.Support.create. findOrInsert(SpanSnapshot) walks the bucket comparing raw fields, falling back to MetricKeys.fromSnapshot on a true miss. On cap overrun, it scans for an entry with hitCount==0 and unlinks it; if none, it returns null and the caller drops the data point. - MetricKeys.fromSnapshot extracts the canonicalization logic (DDCache lookups + UTF8 encoding) from Aggregator.buildMetricKey, so the helper can be called from AggregateTable on miss. This also commits Hashtable and LongHashingUtils (added earlier, previously uncommitted) and lifts Hashtable.Entry / Hashtable.Support visibility so client code outside datadog.trace.util can build higher-arity tables -- the case the javadoc describes but the original visibility didn't actually support. Specifically: Entry is now public abstract with a protected ctor; keyHash, next(), and setNext() are public; Support's create / clear / bucketIndex / bucketIterator / mutatingBucketIterator methods are public. Tests: AggregateTableTest covers hit, miss, distinct-by-spanKind, peer-tag identity (including null vs non-null), cap overrun with stale victim, cap overrun with no victim (returns null), expungeStaleAggregates, forEach, clear, and that the canonical MetricKey is built at insert. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replace LRUCache<MetricKey, AggregateMetric> with the AggregateTable added in the prior commit. The hot path in Drainer.accept becomes: AggregateMetric aggregate = aggregates.findOrInsert(snapshot); if (aggregate != null) { aggregate.recordOneDuration(snapshot.tagAndDuration); dirty = true; } else { healthMetrics.onStatsAggregateDropped(); } On the steady-state hit path the lookup is a 64-bit hash compute + bucket walk + matches(snapshot) -- no MetricKey allocation, no SERVICE_NAMES / SPAN_KINDS / PEER_TAGS_CACHE lookups. The canonical MetricKey is now built once per unique key at insert time, in MetricKeys.fromSnapshot. Behavioral change in the cap-overrun path ----------------------------------------- The old LRUCache evicted least-recently-used: at cap, a new insert would push out the oldest entry regardless of whether it was live or stale. AggregateTable instead scans for a hitCount==0 entry to recycle, and drops the new key if none exists. Practical impact: in the common case where the table holds a stable set of recurring keys, an unrelated burst of new keys is dropped (and reported via onStatsAggregateDropped) rather than evicting the established keys. The existing test that asserted "service0 evicted in favor of service10" is updated to assert the new semantics. The other cap-related test ("should not report dropped aggregate when evicted entry was already flushed") still passes unchanged: after report() clears all entries to hitCount=0, the next wave of inserts recycles them. Threading fix ------------- ConflatingMetricsAggregator.disable() used to call aggregator.clearAggregates() and inbox.clear() directly from the Sink's IO event thread, racing with the aggregator thread mid-write. The race was tolerable for LinkedHashMap; it is not for AggregateTable (chain corruption can NPE or loop). disable() now offers a ClearSignal to the inbox so the aggregator thread itself performs the table clear and the inbox.clear(). Adds one SignalItem subclass + one branch in Drainer.accept; preserves the single-writer invariant for AggregateTable end-to-end. Removed: LRUCache import, AggregateExpiry inner class, the static buildMetricKey / materializePeerTags / encodePeerTag helpers (now in MetricKeys). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

MetricKey existed for two reasons -- the prior LRUCache key role (now handled by AggregateTable's Hashtable.Entry mechanics) and as the labels argument to MetricWriter.add. The first is gone; the second is the only thing keeping MetricKey alive. Fold its UTF8-encoded label fields onto AggregateEntry, change MetricWriter.add to take AggregateEntry directly, and delete MetricKey + MetricKeys. What AggregateEntry now holds ----------------------------- - 10 UTF8BytesString label fields (resource, service, operationName, serviceSource, type, spanKind, httpMethod, httpEndpoint, grpcStatusCode, and a List<UTF8BytesString> peerTags for serialization). - 3 primitives (httpStatusCode, synthetic, traceRoot). - AggregateMetric (the value being accumulated). - The raw String[] peerTagPairs is retained alongside the encoded peerTags -- matches() compares it positionally against the snapshot's pairs; the encoded form is only consumed by the writer. matches(SpanSnapshot) compares the entry's UTF8 forms to the snapshot's raw String / CharSequence fields via content-equality (UTF8BytesString.toString() returns the underlying String in O(1)). This closes a latent bug in the prior raw-vs-raw matches(): if one snapshot delivered a tag value as String and a later snapshot delivered the same content as UTF8BytesString, the old Objects.equals would return false and the table would split into two entries. Content-equality matching collapses them into one. Consolidated caches ------------------- The static UTF8 caches that used to live partly on MetricKey (RESOURCE_CACHE, OPERATION_CACHE, SERVICE_SOURCE_CACHE, TYPE_CACHE, KIND_CACHE, HTTP_METHOD_CACHE, HTTP_ENDPOINT_CACHE, GRPC_STATUS_CODE_CACHE, SERVICE_CACHE) and partly on ConflatingMetricsAggregator (SERVICE_NAMES, SPAN_KINDS, PEER_TAGS_CACHE) are all now on AggregateEntry. The split was duplicating work -- SERVICE_NAMES and SERVICE_CACHE both cached service-name to UTF8BytesString. One cache per field now. API change: MetricWriter.add ---------------------------- Was: add(MetricKey key, AggregateMetric aggregate) Now: add(AggregateEntry entry) The aggregate lives on the entry. Single-arg. SerializingMetricWriter reads the same UTF8 fields off AggregateEntry that it previously read off MetricKey; the wire format is byte-identical. Test impact ----------- AggregateEntry.of(...) takes the same 13 positional args new MetricKey(...) took, so test diffs are mostly mechanical: new MetricKey(args) -> AggregateEntry.of(args) writer.add(key, _) -> writer.add(entry) ValidatingSink in SerializingMetricWriterTest now iterates List<AggregateEntry> directly. ConflatingMetricAggregatorTest's Spock matchers (~36 sites) rely on AggregateEntry.equals comparing the 13 label fields (not the aggregate) so the mock matches by labels regardless of the aggregate state at call time; post-invocation closures verify aggregate state. Benchmarks (2 forks x 5 iter x 15s) ----------------------------------- The change is consumer-thread only; producer publish() is unchanged. SimpleSpan bench: 3.123 +- 0.025 us/op (prior: 3.119 +- 0.018) DDSpan bench: 2.412 +- 0.022 us/op (prior: 2.463 +- 0.041) Both within noise -- the win is structural (one less class, one less allocation per miss, one fewer cache layer) rather than benchmarked. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The label fields and the mutable counters/histograms are 1:1 with each entry; carrying them on a separate object meant one extra allocation per unique key plus an indirection on every hot-path update. Merging them puts the counters directly on AggregateEntry, drops the entry.aggregate hop, and consolidates ERROR_TAG / TOP_LEVEL_TAG onto the same class the consumer uses to decode them. AggregateTable.findOrInsert now returns AggregateEntry. Callers in Aggregator and SerializingMetricWriter updated. Migrated AggregateMetricTest.groovy to AggregateEntryTest.java per project policy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add a context-passing forEach(T, BiConsumer) overload to AggregateTable, mirroring TagMap's pattern. Aggregator.report now hands the writer in as context to a static BiConsumer so no fresh Consumer is allocated each report cycle. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

datadog-datadog-prod-us1-2 · 2026-05-19T17:52:05Z

✨ Fix all issues with BitsAI

⚠️ Warnings

🚦 3 Pipeline jobs failed

DataDog/apm-reliability/dd-trace-java | agent_integration_tests

🔧 Fix in code (Fix with Cursor).
4 failed tests due to MissingPropertyException at MetricsIntegrationTest.groovy:42.

Run system tests | main / End-to-end #4 / spring-boot-openliberty 4

🔄 Retry job. This looks flaky and may succeed on retry.
AssertionError: assert not ['[dd.trace 2026-05-19 21:33:19:721 +0000] [dd-task-scheduler] WARN datadog.trace.agent.core.monitor.TracerHealthMetri... com.ibm.ws.webcontainer.webapp.WebAppErrorReport: SRVE0295E: Error reported: 500']

Run system tests | Check system tests success

Useful? React with 👍 / 👎

_{This comment will be updated automatically if new data arrives.

🔗 Commit SHA: d2e4477 | Docs | Datadog PR Page | Give us feedback!}

Now that Hashtable.Support exposes the parameterized forEach helpers, AggregateTable's own forEach methods can drop their duplicated loop body and the (AggregateEntry) cast. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- findOrInsert: walks via Support.bucket(buckets, keyHash) instead of Hashtable.Entry + intermediate cast; bucketIndex is only computed on the miss path now. - evictOneStale / expungeStaleAggregates: chain variables typed as AggregateEntry from the head down, leveraging Entry.next()'s generic inference, so the per-iteration getHitCount() checks drop their (AggregateEntry) cast. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

dougqh · 2026-05-19T19:53:28Z

  private static final long DEFAULT_SLEEP_MILLIS = 10;

+  /** Non-capturing -- the writer arrives via the forEach context arg. */
+  private static final BiConsumer<MetricWriter, AggregateEntry> WRITE_AND_CLEAR =


The JVM is pretty good at reusing non-capturing lambdas, I think we can forego the static member until a profile proves it necessary.

dougqh · 2026-05-19T19:54:43Z

    @Override
    public void accept(InboxItem item) {
-      if (item instanceof SignalItem) {
+      if (item == ClearSignal.CLEAR) {


ClearSignal was introduced to avoid a thread-safety issue with the prior implementation

- Constructor sizing now uses Support.MAX_RATIO_NUMERATOR / _DENOMINATOR instead of an open-coded * 4 / 3. - findOrInsert delegates the chain-head splice to Support.insertHeadEntry. - evictOneStale and expungeStaleAggregates both rewritten in terms of Support.mutatingTableIterator. Drops the bespoke head-vs-mid-chain branching that read as more complicated than the operation actually is. Net -28 lines in AggregateTable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

dougqh · 2026-05-19T20:07:50Z

+
+  /** Unlink the first entry whose {@code getHitCount() == 0}. */
+  private boolean evictOneStale() {
+    for (MutatingTableIterator<AggregateEntry> it = Support.mutatingTableIterator(buckets);


I'd prefer to call this "iter" rather than "it"

- AggregateTable: switch to Support.create(maxAggregates, Support.MAX_RATIO) now that the load-factor scaling is a Support concern. - AggregateTable: replace open-coded "keyHash == X && matches(s)" with a new AggregateEntry.matches(long keyHash, SpanSnapshot) overload that bundles the hash gate. - AggregateTable: rename local iterator var "it" -> "iter". - Aggregator: drop WRITE_AND_CLEAR static field, inline as a non-capturing lambda; the JIT reuses non-capturing lambdas, no need for the static until a profile says otherwise. - Aggregator: comment the ClearSignal branch with the thread-safety rationale (single-writer invariant for AggregateTable). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Picks up the Support.insertHeadEntry(buckets, long keyHash, entry) overload added on the util-hashtable branch; saves the redundant Support.bucketIndex(buckets, keyHash) hop at the call site. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ntry Use javax.annotation.Nullable (the codebase's convention -- see DDSpan, TagInterceptor, ScopeContext, etc.) on the four nullable label fields (serviceSource, httpMethod, httpEndpoint, grpcStatusCode), their getters, and the corresponding parameters of AggregateEntry.of. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Support.MAX_RATIO and the scaled create(int, float) overload already convey the 75% load-factor intent at the call site -- the inline comment was duplicating their self-documentation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Style nit -- the equals() method had eight fully-qualified references to java.util.Objects.equals; add the import and drop the qualifier. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two design-review trade-offs that won't change in this PR but should be explicit at the call sites: - AggregateTable.evictOneStale: O(N) scan per call (vs LRUCache's O(1)), acceptable because the new policy drops the *new* key on cap-overrun rather than evicting an established one -- so eviction is expected to be rare. Cursor-caching is the future optimization if a workload runs persistently at cap. - ConflatingMetricsAggregator.disable: single inbox.offer(CLEAR) is best-effort. If the inbox is full the clear is dropped, but the system self-heals (supportsMetrics() is already false, the next report-sink-rejection retries disable). Worst case is one extra cycle of stale data, not a leak. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d2e4477f78

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-05-19T23:56:39Z

-      new AggregateMetric().recordDurations(10, new AtomicLongArray(1, 1, 200, 2, 3, 4, 5, 6, 7, 8, 9))
-      )
+    def entry1 = AggregateEntry.of("resource1", "service1", "operation1", null, "sql", 0, false, true, "xyzzy", [UTF8BytesString.create("grault:quux")], null, null, null)
+    entry1.aggregate.recordDurations(5, new AtomicLongArray(2, 1, 2, 250, 4, 5))


Record durations directly on AggregateEntry

AggregateEntry does not expose an aggregate property, so calling entry1.aggregate.recordDurations(...) raises a MissingPropertyException in Groovy at runtime. This causes the integration test to fail before writer.finishBucket() and stops it from validating the sink notification path. Call recordDurations(...) on the AggregateEntry itself instead.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-19T23:56:39Z

+      )) >> { AggregateEntry e ->
+        e.getHitCount() == 1 && e.getTopLevelCount() == 1 && e.getDuration() == 100


Use argument constraints instead of stub closures for add() checks

This >> { ... } closure is now a stubbed response, not an argument matcher, so the boolean expression is ignored for this void method and no assertion is enforced on hitCount/duration contents. As a result, these tests can pass even when aggregates are wrong. Use a Spock argument constraint (e.g. writer.add({ AggregateEntry e -> ... })) to keep the validation behavior.

Useful? React with 👍 / 👎.

dougqh added type: enhancement Enhancements and improvements comp: core Tracer core tag: performance Performance related changes tag: no release notes Changes to exclude from release notes comp: metrics Metrics tag: ai generated Largely based on code generated by an AI or LLM labels May 15, 2026

This was referenced May 15, 2026

Cap metric label cardinality + hoist peer-tag schema; rename to ClientStatsAggregator #11387

Draft

Add Hashtable and LongHashingUtils utilities #11409

Open

dougqh and others added 3 commits May 18, 2026 15:18

dougqh force-pushed the dougqh/optimize-metric-key branch from 050a998 to 3738c85 Compare May 18, 2026 19:20

dougqh changed the base branch from dougqh/conflating-metrics-background-work to dougqh/util-hashtable May 18, 2026 19:21