chore(agent-data-plane): failing repros for six discovered bugs #1769
blt wants to merge 1 commit into
Binary Size Analysis (Agent Data Plane)
Baseline: 1bd1613 · Comparison: 2d5932a
✅ Binary size difference within threshold

Regression Detector (Agent Data Plane)
Optimization Goals: ✅ No significant changes detected
Bounds Checks: ✅ Passed (5)
Explanation: A change is flagged as a regression when |Δ mean %| > 5.00% in the regressing direction for its optimization goal AND SMP marks the experiment as a regression.
## Summary

Six TDD-style tests that each assert the behavior agent-data-plane *should* have and currently fail, demonstrating a real defect. No production code changes: these are the failing tests a fix would turn green. Root-cause notes for each are in `test/antithesis/scratchbook/bug-ledger.md`.

- aggregate: a sub-second `aggregate_window_duration` truncates to a 0-second bucket and panics on `timestamp % 0` at the first insert.
- aggregate: a forward wall-clock jump backfills zero-value points across the whole jump (O(jump) work and allocation), flooding output.
- ddsketch: a single non-finite sample silently poisons `sum`/`avg` (no finiteness guard).
- dogstatsd replay: a corrupt length prefix is read as a clean EOF, silently dropping the records after it.
- context resolver: with the default heap fallback, a full interner never refuses, so resolution is unbounded under high cardinality.
- config: `ready()` waits for the first dynamic snapshot with no timeout, so startup hangs forever if it never arrives.

## Change Type

- [ ] Bug fix
- [ ] New feature
- [x] Non-functional (chore, refactoring, docs)
- [ ] Performance

## How did you test this PR?

`cargo nextest run` on the six tests: all six fail, each for its intended reason (panic / NaN / silent `Ok(None)` / point flood / unbounded resolution / `ready()` timeout). `cargo fmt --check` is clean.

CI note: because these are intentionally-failing repros, the `unit-tests` jobs (`make test`) will be **red** until the underlying bugs are fixed. Decide how to land them, e.g. `#[ignore]` with a tracking issue per bug, or keep them red as known-failing demonstrations.

## References

- Root-cause analysis and the full ledger: `test/antithesis/scratchbook/bug-ledger.md` (research PR).
- The harness that can exercise some of these under fault injection: the harness PR in this stack.
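
For readers who want the first aggregate item in miniature, here is a rough sketch of how a sub-second window can collapse to a zero-width bucket. It is not the agent-data-plane code: `bucket_width_secs` and `bucket_start` are hypothetical helpers, and the sketch assumes the window is converted to whole seconds with `Duration::as_secs`, which drops the fractional part.

```rust
use std::time::Duration;

/// Hypothetical helper (not from the repository): bucket width in whole seconds.
/// `Duration::as_secs` discards the sub-second part, so 500 ms becomes 0.
fn bucket_width_secs(window: Duration) -> u64 {
    window.as_secs()
}

/// Hypothetical helper: start of the bucket that contains `timestamp`.
/// A plain `timestamp % width_secs` panics when the width is zero;
/// `checked_rem` reports that case as `None` instead.
fn bucket_start(timestamp: u64, width_secs: u64) -> Option<u64> {
    timestamp
        .checked_rem(width_secs)
        .map(|offset| timestamp - offset)
}
```

With these helpers, `bucket_width_secs(Duration::from_millis(500))` is `0`, and `bucket_start(ts, 0)` yields `None` where the unchecked `%` would panic. Whether a real fix should reject such a window at config load or clamp it to a minimum is a design choice this sketch does not settle.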

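
The ddsketch item in the summary (a non-finite sample poisoning `sum`/`avg`) can be illustrated with a small stand-in. The `Summary` type below is hypothetical and much simpler than a real sketch; it only shows why a single NaN sticks, and what a finiteness guard might look like, assuming that rejecting the sample is the desired behavior.

```rust
/// Hypothetical running summary, standing in for the sketch's `sum`/`avg` state.
#[derive(Debug, Default)]
struct Summary {
    sum: f64,
    count: u64,
}

impl Summary {
    /// Unguarded insert: once a NaN or infinity is added, `sum` never recovers.
    fn insert_unchecked(&mut self, sample: f64) {
        self.sum += sample;
        self.count += 1;
    }

    /// Guarded insert: skips non-finite samples and returns whether the
    /// sample was accepted. Rejecting is an assumed policy, not the project's.
    fn insert(&mut self, sample: f64) -> bool {
        if !sample.is_finite() {
            return false;
        }
        self.insert_unchecked(sample);
        true
    }

    /// Mean of the accepted samples, or `None` when nothing was inserted.
    fn avg(&self) -> Option<f64> {
        if self.count == 0 {
            None
        } else {
            Some(self.sum / self.count as f64)
        }
    }
}
```

Feeding `1.0`, `NaN`, `3.0` through `insert_unchecked` leaves both `sum` and `avg` as NaN, while the guarded `insert` refuses the NaN and keeps the average finite.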