chore(agent-data-plane): failing repros for six discovered bugs #1769

Draft
blt wants to merge 1 commit into graphite-base/1769 from blt/antithesis-bug-tests

blt (Contributor) commented May 29, 2026

Summary

Six TDD-style tests, each asserting behavior that agent-data-plane should have. All six currently fail,
each demonstrating a real defect. No production code changes: these are the failing tests a fix would
turn green. Root-cause notes for each are in test/antithesis/scratchbook/bug-ledger.md.

  • aggregate: a sub-second aggregate_window_duration truncates to a 0-second bucket and panics on
    timestamp % 0 at the first insert.
  • aggregate: a forward wall-clock jump backfills zero-value points across the whole jump
    (O(jump) work and allocation), flooding output.
  • ddsketch: a single non-finite sample silently poisons sum/avg (no finiteness guard).
  • dogstatsd replay: a corrupt length prefix is read as a clean EOF, silently dropping the records
    after it.
  • context resolver: with the default heap fallback, a full interner never refuses, so resolution is
    unbounded under high cardinality.
  • config: ready() waits for the first dynamic snapshot with no timeout, so startup hangs forever if
    it never arrives.
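
A minimal sketch of the first item, assuming (as that bullet describes) that the window is reduced to whole seconds before being used as a modulus. `bucket_start` is a hypothetical helper written for illustration; it is not taken from the agent-data-plane source.

```rust
use std::time::Duration;

/// Hypothetical helper: aligns `timestamp` (in seconds) down to the start of its bucket.
/// Returns `None` when the window truncates to zero whole seconds.
fn bucket_start(timestamp: u64, window: Duration) -> Option<u64> {
    // `Duration::as_secs` drops the fractional part, so a 500 ms window becomes 0.
    let width_secs = window.as_secs();
    // `checked_rem` yields `None` for a zero divisor, where a bare `%` would panic.
    timestamp
        .checked_rem(width_secs)
        .map(|offset| timestamp - offset)
}
```

With a 10-second window, `bucket_start(1_700_000_007, Duration::from_secs(10))` is `Some(1_700_000_000)`; with `Duration::from_millis(500)` it is `None`, which is the case the repro is described as hitting with an unchecked `%`.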

Change Type

  • [ ] Bug fix
  • [ ] New feature
  • [x] Non-functional (chore, refactoring, docs)
  • [ ] Performance

How did you test this PR?

cargo nextest run on the six tests — all six fail, each for its intended reason (panic / NaN /
silent Ok(None) / point flood / unbounded resolution / ready() timeout). cargo fmt --check is
clean.
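
To make the NaN failure mode concrete, here is a small sketch of how one non-finite sample propagates through a running sum. `Summary` is a hypothetical stand-in, not the ddsketch type, and rejecting the sample is only one possible guard policy.

```rust
/// Hypothetical running summary; a stand-in for a sketch's `sum`/`avg`.
#[derive(Default)]
struct Summary {
    sum: f64,
    count: u64,
}

impl Summary {
    /// Unguarded insert: a NaN or infinity flows straight into `sum`.
    fn insert_unguarded(&mut self, v: f64) {
        self.sum += v;
        self.count += 1;
    }

    /// Guarded insert: drops non-finite samples and reports whether the sample was kept.
    fn insert_guarded(&mut self, v: f64) -> bool {
        if !v.is_finite() {
            return false;
        }
        self.sum += v;
        self.count += 1;
        true
    }

    /// Mean of the accepted samples (NaN when no samples were accepted).
    fn avg(&self) -> f64 {
        self.sum / self.count as f64
    }
}
```

Feeding `[1.0, f64::NAN, 3.0]` through `insert_unguarded` leaves both `sum` and `avg()` as NaN for every later read, while `insert_guarded` keeps the average at 2.0.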

CI note: because these are intentionally-failing repros, the unit-tests jobs (make test) will be
red until the underlying bugs are fixed. Decide how to land them — e.g. #[ignore] with a
tracking issue per bug, or keep them red as known-failing demonstrations.
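
A sketch of the `#[ignore]` option, using a hypothetical `window_secs` function as the code under test. The test name and the `TRACKING-ISSUE` placeholder are illustrative only and do not refer to a real issue.

```rust
/// Hypothetical stand-in for the code under test.
fn window_secs(window_ms: u64) -> u64 {
    window_ms / 1000 // integer division truncates, so 500 ms becomes 0
}

#[cfg(test)]
mod known_failures {
    use super::*;

    // Skipped by default so the suite stays green; the string records why.
    // "TRACKING-ISSUE" is a placeholder, not a real issue reference.
    #[test]
    #[ignore = "known failure, see TRACKING-ISSUE"]
    fn sub_second_window_is_not_truncated_to_zero() {
        assert!(window_secs(500) > 0);
    }
}
```

Ignored tests can still be run on demand with `cargo test -- --ignored`, so the repro remains executable while a fix is pending.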

References

  • Root-cause analysis and the full ledger: test/antithesis/scratchbook/bug-ledger.md (research PR).
  • The harness that can exercise some of these under fault injection: the harness PR in this stack.

dd-octo-sts (bot) added the area/core, area/config, area/components, source/dogstatsd, and transform/aggregate labels May 29, 2026
blt (Contributor, Author) commented May 29, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite. Learn more about stacking.

datadog-official (bot) commented May 29, 2026

Pipelines

⚠️ Warnings

🚦 10 Pipeline jobs failed

DataDog/saluki | check-clippy
Compilation error in mod.rs:1121:13: expected `NonZero<u64>`, found `Duration` while initializing `AggregationState`, plus multiple type mismatches in subsequent lines.

DataDog/saluki | check-features-linux-amd64
Compilation error in mod.rs:1121: arguments to `AggregationState::new` are incorrect; expected `NonZero<u64>`, found `Duration`.

DataDog/saluki | check-features-linux-arm64
Compilation errors due to type mismatches between `NonZero<u64>` and `Duration` in mod.rs.
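
The job summaries report only the type mismatch, not the intended call site. As a hedged sketch, if the `NonZero<u64>` argument is meant to be a width in whole seconds (an assumption; the logs do not say), a `Duration` could be converted as below. `window_secs_nonzero` is a hypothetical name.

```rust
use std::num::NonZeroU64;
use std::time::Duration;

/// Hypothetical conversion: the whole seconds in `window`,
/// or `None` if that count is zero (for example, any sub-second window).
fn window_secs_nonzero(window: Duration) -> Option<NonZeroU64> {
    // `NonZeroU64::new` returns `None` for 0, so a zero width is caught here
    // instead of reaching a later division or modulo.
    NonZeroU64::new(window.as_secs())
}
```

A 10-second window converts to `Some(10)`, and a 500 ms window converts to `None`, so the caller has to decide what to do with a sub-second value before any state is built.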

Commit SHA: 2d5932a

pr-commenter (bot) commented May 29, 2026

Binary Size Analysis (Agent Data Plane)

Baseline: 1bd1613 · Comparison: 2d5932a · diff
Analysis Configuration: stripped binaries · Pass/Fail Threshold: +5%
Sizes: 37.89 MiB (baseline) vs 37.90 MiB (comparison)
Size Change: +928 B (+0.00%)

✅ Binary size difference within threshold

Changes by Module
| Module | File Size | Symbols |
| --- | --- | --- |
| anon.7c74fb246a9e2e0f03e4c96295421832.111.llvm.13542198569310799513 | +3.91 KiB | 1 |
| anon.74ec343e7df4c5afe576d99fcc874370.67.llvm.13153431456681701250 | -3.90 KiB | 1 |
| anon.40280c81039b585976f7339b6e264845.13.llvm.14039283068297146168 | +2.83 KiB | 1 |
| anon.0acc57ca412442189c7678857beeb25f.1.llvm.18230564023335910389 | -2.83 KiB | 1 |
| anon.69220190dd54a7096abb5416a7fc5f5b.172.llvm.3001728827928409060 | -1.32 KiB | 1 |
| anon.74ec343e7df4c5afe576d99fcc874370.102.llvm.2566014890999717851 | +1.32 KiB | 1 |
| anon.25642966c2ceed9fdaa18b301a53af1b.16.llvm.14142020690586136053 | +1.17 KiB | 1 |
| anon.b4159849edfd01d766e0d420b2a0b147.50.llvm.6843852336371360966 | +1.17 KiB | 1 |
| anon.02b5255633f97219e3c81be4a33426c5.42.llvm.4649599560089562033 | -1.17 KiB | 1 |
| anon.4e5f44f61ef7fdbdae19dcf9defd016d.119.llvm.9067789454311435177 | -1.17 KiB | 1 |
| anon.4e5f44f61ef7fdbdae19dcf9defd016d.117.llvm.17419904333410818379 | +1.15 KiB | 1 |
| anon.4e5f44f61ef7fdbdae19dcf9defd016d.117.llvm.9067789454311435177 | -1.15 KiB | 1 |
| anon.31516bb465b7b56f2793137aa936837b.12.llvm.468775393026371356 | +1.15 KiB | 1 |
| anon.c5beeec9f796252eed86dd2c5032d81b.70.llvm.13773911667277771233 | -1.15 KiB | 1 |
| core | +1.08 KiB | 1379 |
| anon.1a6121dd844402061302c8edd1a32625.22.llvm.5047211621473384219 | +1.07 KiB | 1 |
| anon.0e8b1af6fad698973aa643c8c1a4212d.30.llvm.17693784736754346409 | -1.07 KiB | 1 |
| anon.83861495afd7ed2d7a5f7a2024dbca33.32.llvm.16744458082958744374 | +1.06 KiB | 1 |
| anon.423de405463045f5e6382613e58502c9.623.llvm.2135413985037036474 | -1.06 KiB | 1 |
| anon.02b5255633f97219e3c81be4a33426c5.448.llvm.3126449660902020086 | +995 B | 1 |
Detailed Symbol Changes
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  [NEW] +3.91Ki  [NEW]     +16    anon.7c74fb246a9e2e0f03e4c96295421832.111.llvm.13542198569310799513
  [NEW] +3.44Ki  [NEW]    +456    core::ptr::drop_in_place<core::iter::adapters::map::Map<std::collections::hash::map::IntoIter<axum::routing::RouteId,axum::routing::Endpoint<saluki_components::destinations::dsd_stats::DogStatsDAPIHandlerState>>,axum::routing::path_router::PathRouter<saluki_components::destinations::dsd_stats::DogStatsDAPIHandlerState,_>::with_state<$LP$$RP$>::{{closure}}>>::h5a147bb683008716
  [NEW] +2.83Ki  [NEW]      +2    anon.40280c81039b585976f7339b6e264845.13.llvm.14039283068297146168
  [NEW] +2.81Ki  [NEW]     +47    _<http_body_util::combinators::map_err::MapErr<B,F> as http_body::Body>::size_hint::ha191b168a1fa9981
  [NEW] +2.73Ki  [NEW]     +61    core::ptr::drop_in_place<std::sync::poison::PoisonError<std::sync::poison::rwlock::RwLockReadGuard<quick_cache::shard::CacheShard<saluki_context::hash::ContextKey,saluki_context::context::Context,saluki_common::cache::weight::WrappedWeighter<saluki_common::cache::weight::ItemCountWeighter>,saluki_common::hash::NoopU64BuildHasher,saluki_common::cache::expiry::ExpiryCapableLifecycle<saluki_context::hash::ContextKey>,alloc::sync::Arc<quick_cache::sync_placeholder::Placeholder<saluki_context::context::Context>>>>>>::hca7ee00be547c4fc
  [NEW] +2.17Ki  [NEW]    +129    core::ptr::drop_in_place<std::sync::poison::PoisonError<std::sync::poison::rwlock::RwLockWriteGuard<quick_cache::shard::CacheShard<alloc::string::String,saluki_components::sources::otlp::metrics::cache::Extrema,saluki_common::cache::weight::WrappedWeighter<saluki_common::cache::weight::ItemCountWeighter>,foldhash::quality::RandomState,saluki_common::cache::expiry::ExpiryCapableLifecycle<alloc::string::String>,alloc::sync::Arc<quick_cache::sync_placeholder::Placeholder<saluki_components::sources::otlp::metrics::cache::Extrema>>>>>>::h364cbbab64626c51
  [NEW] +1.56Ki  [NEW]    +703    _<http_body_util::combinators::map_err::MapErr<B,F> as http_body::Body>::poll_frame::h5c3105730e8a290e
  [NEW] +1.37Ki  [NEW]    +691    _<http_body_util::combinators::map_err::MapErr<B,F> as http_body::Body>::poll_frame::ha2f92b2804f0244b
  [NEW] +1.32Ki  [NEW]     +88    anon.74ec343e7df4c5afe576d99fcc874370.102.llvm.2566014890999717851
  [NEW] +1.29Ki  [NEW]    +315    core::ptr::drop_in_place<tokio::sync::mpsc::bounded::Permit<saluki_config::dynamic::event::ConfigUpdate>>::h06b7187777d7c744
  +0.0%    +969  [ = ]       0    [6891 Others]
  [DEL] -1.29Ki  [DEL]    -315    core::ptr::drop_in_place<tokio::sync::mpsc::bounded::Permit<saluki_env::workload::metadata::MetadataOperation>>::h2e1e385cf4a89858
  [DEL] -1.32Ki  [DEL]     -88    anon.69220190dd54a7096abb5416a7fc5f5b.172.llvm.3001728827928409060
  [DEL] -1.38Ki  [DEL]    -691    _<http_body_util::combinators::map_err::MapErr<B,F> as http_body::Body>::poll_frame::h8d3d490742ef5359
  [DEL] -1.57Ki  [DEL]    -703    _<http_body_util::combinators::map_err::MapErr<B,F> as http_body::Body>::poll_frame::h4c56918971715b8b
  [DEL] -2.17Ki  [DEL]    -129    core::ptr::drop_in_place<std::sync::poison::PoisonError<std::sync::poison::rwlock::RwLockWriteGuard<quick_cache::shard::CacheShard<stringtheory::MetaString,core::option::Option<saluki_components::transforms::dogstatsd_mapper::CachedMapResult>,saluki_common::cache::weight::WrappedWeighter<saluki_common::cache::weight::ItemCountWeighter>,foldhash::quality::RandomState,saluki_common::cache::expiry::ExpiryCapableLifecycle<stringtheory::MetaString>,alloc::sync::Arc<quick_cache::sync_placeholder::Placeholder<core::option::Option<saluki_components::transforms::dogstatsd_mapper::CachedMapResult>>>>>>>::hb84ff946adb48f71
  [DEL] -2.73Ki  [DEL]     -61    core::ptr::drop_in_place<std::sync::poison::PoisonError<std::sync::poison::rwlock::RwLockReadGuard<quick_cache::shard::CacheShard<stringtheory::MetaString,core::option::Option<saluki_components::transforms::dogstatsd_mapper::CachedMapResult>,saluki_common::cache::weight::WrappedWeighter<saluki_common::cache::weight::ItemCountWeighter>,foldhash::quality::RandomState,saluki_common::cache::expiry::ExpiryCapableLifecycle<stringtheory::MetaString>,alloc::sync::Arc<quick_cache::sync_placeholder::Placeholder<core::option::Option<saluki_components::transforms::dogstatsd_mapper::CachedMapResult>>>>>>>::hb0dd5f6df4367612
  [DEL] -2.83Ki  [DEL]      -2    anon.0acc57ca412442189c7678857beeb25f.1.llvm.18230564023335910389
  [DEL] -2.84Ki  [DEL]     -47    _<http_body_util::combinators::map_err::MapErr<B,F> as http_body::Body>::size_hint::h66dcc7212f74260d
  [DEL] -3.44Ki  [DEL]    -456    core::ptr::drop_in_place<core::iter::adapters::map::Map<std::collections::hash::map::IntoIter<axum::routing::RouteId,axum::routing::Endpoint<saluki_components::sources::dogstatsd::replay::replay_control::DogStatsDReplayControl>>,axum::routing::path_router::PathRouter<saluki_components::sources::dogstatsd::replay::replay_control::DogStatsDReplayControl,_>::with_state<$LP$$RP$>::{{closure}}>>::h9e288300660bcc12
  [DEL] -3.90Ki  [DEL]     -16    anon.74ec343e7df4c5afe576d99fcc874370.67.llvm.13153431456681701250
  +0.0%    +928  [ = ]       0    TOTAL

blt changed the title from "test(antithesis): failing repros for six discovered agent-data-plane bugs" to "chore(agent-data-plane): failing repros for six discovered bugs" May 29, 2026
pr-commenter (bot) commented May 29, 2026

Regression Detector (Agent Data Plane)

Run ID: 878f6312-4f00-4e2d-8a2c-2f8338b0e719
Baseline: 1bd16137 · Comparison: 2d5932a2 · diff

Optimization Goals: ✅ No significant changes detected

Fine details of change detection per experiment (35)

Experiments configured erratic: true are tagged (ignored) and skipped when determining which experiments regressed or improved. Experiments which are detected as erratic at runtime are tagged (erratic) to flag that the run's sample dispersion was high, but their regression / improvement signal still counts.

| experiment | goal | Δ mean % |
| --- | --- | --- |
| otlp_ingest_traces_ottl_filtering_5mb_cpu (erratic) | cpu | ⚪ +3.68 |
| dsd_uds_512kb_3k_contexts_cpu (erratic) | cpu | ⚪ +3.16 |
| dsd_uds_500mb_3k_contexts_throughput | throughput | ⚪ -2.63 |
| otlp_ingest_metrics_5mb_memory | memory | ⚪ +2.08 |
| dsd_uds_1mb_3k_contexts_memory | memory | ⚪ +0.53 |
| dsd_uds_10mb_3k_contexts_cpu (erratic) | cpu | ⚪ +0.28 |
| dsd_uds_500mb_3k_contexts_memory | memory | ⚪ +0.15 |
| otlp_ingest_traces_ottl_filtering_5mb_memory | memory | ⚪ +0.14 |
| quality_gates_rss_dsd_medium | memory | ⚪ +0.09 |
| quality_gates_rss_dsd_ultraheavy | memory | ⚪ +0.06 |
| otlp_ingest_traces_5mb_memory | memory | ⚪ +0.03 |
| dsd_uds_10mb_3k_contexts_throughput | throughput | ⚪ -0.01 |
| otlp_ingest_metrics_5mb_throughput | throughput | ⚪ -0.01 |
| dsd_uds_1mb_3k_contexts_throughput | throughput | ⚪ -0.00 |
| dsd_uds_100mb_3k_contexts_throughput | throughput | ⚪ +0.00 |
| dsd_uds_512kb_3k_contexts_throughput | throughput | ⚪ +0.01 |
| otlp_ingest_logs_5mb_throughput (ignored) | throughput | ⚪ +0.02 |
| dsd_uds_100mb_3k_contexts_memory | memory | ⚪ -0.03 |
| quality_gates_rss_dsd_heavy | memory | ⚪ -0.07 |
| otlp_ingest_traces_ottl_filtering_5mb_throughput | throughput | ⚪ +0.07 |
| dsd_uds_10mb_3k_contexts_memory | memory | ⚪ -0.09 |
| dsd_uds_512kb_3k_contexts_memory | memory | ⚪ -0.10 |
| quality_gates_rss_dsd_low | memory | ⚪ -0.19 |
| otlp_ingest_traces_ottl_transform_5mb_memory | memory | ⚪ -0.30 |
| quality_gates_rss_idle | memory | ⚪ -0.33 |
| otlp_ingest_traces_5mb_throughput | throughput | ⚪ +0.60 |
| otlp_ingest_logs_5mb_cpu (ignored) | cpu | ⚪ -0.83 |
| otlp_ingest_traces_ottl_transform_5mb_throughput | throughput | ⚪ +0.94 |
| otlp_ingest_metrics_5mb_cpu (erratic) | cpu | ⚪ -1.59 |
| dsd_uds_500mb_3k_contexts_cpu (erratic) | cpu | ⚪ -1.88 |
| dsd_uds_1mb_3k_contexts_cpu (erratic) | cpu | ⚪ -2.45 |
| otlp_ingest_traces_5mb_cpu (erratic) | cpu | ⚪ -3.90 |
| otlp_ingest_traces_ottl_transform_5mb_cpu (erratic) | cpu | ⚪ -3.98 |
| dsd_uds_100mb_3k_contexts_cpu (erratic) | cpu | ⚪ -4.12 |
| otlp_ingest_logs_5mb_memory (ignored) | memory | ⚪ -9.50 |

Bounds Checks: ✅ Passed (5)
experiment check replicates observed links
quality_gates_rss_dsd_heavy memory_usage 10/10 ✅ 125 MiB ≤ 140 MiB metrics profiles logs
quality_gates_rss_dsd_low memory_usage 10/10 ✅ 39.9 MiB ≤ 50 MiB metrics profiles logs
quality_gates_rss_dsd_medium memory_usage 10/10 ✅ 60.1 MiB ≤ 75 MiB metrics profiles logs
quality_gates_rss_dsd_ultraheavy memory_usage 10/10 ✅ 185 MiB ≤ 200 MiB metrics profiles logs
quality_gates_rss_idle memory_usage 10/10 ✅ 26.6 MiB ≤ 40 MiB metrics profiles logs
Explanation

A change is flagged as a regression when |Δ mean %| > 5.00% in the regressing direction for its optimization goal AND SMP marks the experiment as a regression (is_regression: true). Improvements use the matching criteria for the improving direction. Experiments configured erratic: true (tagged (ignored)) are skipped outright; experiments detected as erratic at runtime (tagged (erratic)) still count, since that flag describes sample dispersion rather than directional certainty. The Δ mean % cell is colored accordingly: 🟢 = improvement, 🔴 = regression, ⚪ = neutral. Reduction in CPU or memory is an improvement; reduction in ingress throughput is a regression.
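
The rule in this explanation can be sketched as a small classifier. The type and field names are hypothetical, `smp_is_improvement` is an assumed counterpart to the `is_regression` flag mentioned above, and skipped experiments are folded into `Neutral` here for simplicity.

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Goal {
    Cpu,
    Memory,
    Throughput,
}

#[derive(PartialEq, Debug)]
enum Verdict {
    Improvement,
    Regression,
    Neutral,
}

/// Hypothetical view of one experiment row.
struct Experiment {
    goal: Goal,
    delta_mean_pct: f64,
    configured_erratic: bool, // tagged "(ignored)" in the table
    smp_is_regression: bool,
    smp_is_improvement: bool, // assumed counterpart flag
}

const THRESHOLD_PCT: f64 = 5.00;

fn classify(e: &Experiment) -> Verdict {
    // Experiments configured as erratic are skipped outright.
    if e.configured_erratic || e.delta_mean_pct.abs() <= THRESHOLD_PCT {
        return Verdict::Neutral;
    }
    // More CPU or memory is worse; less ingress throughput is worse.
    let worse = match e.goal {
        Goal::Cpu | Goal::Memory => e.delta_mean_pct > 0.0,
        Goal::Throughput => e.delta_mean_pct < 0.0,
    };
    if worse && e.smp_is_regression {
        Verdict::Regression
    } else if !worse && e.smp_is_improvement {
        Verdict::Improvement
    } else {
        Verdict::Neutral
    }
}
```

Under this sketch, the otlp_ingest_logs_5mb_memory row (-9.50, tagged (ignored)) stays neutral despite exceeding the threshold, which matches the ⚪ shown for it.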

blt force-pushed the blt/antithesis-bug-tests branch from 62e8952 to f89f1cf, May 29, 2026 20:04
blt force-pushed the blt/antithesis-research branch 2 times, most recently from e840a73 to 8712b69, May 29, 2026 20:33
blt force-pushed the blt/antithesis-bug-tests branch 2 times, most recently from 13234ac to 1e30690, May 29, 2026 20:34
blt force-pushed the blt/antithesis-research branch 2 times, most recently from 3265657 to af46baf, May 29, 2026 20:38
blt force-pushed the blt/antithesis-bug-tests branch from 1e30690 to 41f2aa6, May 29, 2026 20:38
blt force-pushed the blt/antithesis-research branch from af46baf to 5cb8545, May 29, 2026 20:46
blt force-pushed the blt/antithesis-bug-tests branch 2 times, most recently from 12c29ca to ed965e3, May 29, 2026 20:55
blt force-pushed the blt/antithesis-research branch from 5cb8545 to d604067, May 29, 2026 20:55
blt force-pushed the blt/antithesis-bug-tests branch from ed965e3 to 3047a17, May 29, 2026 21:18
blt force-pushed the blt/antithesis-research branch from d604067 to a649f24, May 29, 2026 21:18
blt force-pushed the blt/antithesis-bug-tests branch from 3047a17 to b7a0f77, May 29, 2026 22:36
blt force-pushed the blt/antithesis-research branch from a649f24 to 249a646, May 29, 2026 22:37
blt force-pushed the blt/antithesis-bug-tests branch from b7a0f77 to 87b87d3, May 30, 2026 00:43
blt force-pushed the blt/antithesis-research branch from 249a646 to 05efef3, May 30, 2026 00:43
blt force-pushed the blt/antithesis-bug-tests branch from 87b87d3 to c9ff705, May 30, 2026 00:47
blt force-pushed the blt/antithesis-research branch from 05efef3 to 08ac10a, May 30, 2026 00:47
blt force-pushed the blt/antithesis-bug-tests branch from c9ff705 to 2f236aa, May 30, 2026 00:49
blt force-pushed the blt/antithesis-research branch from 08ac10a to 267fc94, May 30, 2026 00:49
blt force-pushed the blt/antithesis-bug-tests branch from 2f236aa to 890f347, May 30, 2026 00:52
blt force-pushed the blt/antithesis-research branch from 267fc94 to 982687b, May 30, 2026 00:52
blt force-pushed the blt/antithesis-research branch from 982687b to fc4bb29, May 30, 2026 00:56
blt force-pushed the blt/antithesis-bug-tests branch from 890f347 to 2d5932a, May 30, 2026 00:56
blt changed the base branch from blt/antithesis-research to graphite-base/1769, May 30, 2026 01:25