[Trino] Restore metadata-table read-amplification coverage in TestHudi*FileOperations lost to the span-leak de-flake #19037

@wombatu-kun

Task Description

What needs to be done:

PR #19004 de-flaked the three Trino-plugin file-operation tests (TestHudiNoCacheFileOperations, TestHudiMemoryCacheFileOperations, TestHudiAlluxioCacheFileOperations) by dropping all METADATA_TABLE operations from getFileOperations (and all Alluxio.* operations in the Alluxio class) before asserting the per-query multiset of filesystem-access spans. That removed the per-query flakiness but also removed the assertions' ability to detect metadata-table read amplification: a future change that, for example, doubles the number of metadata-table reads per query would now pass silently because no test counts those reads anymore.

Find a way to restore a regression signal on metadata-table read volume for these tests without re-introducing the span-leak flakiness that #19004 (and the earlier #18766 / #18995) fought.

Why this task is needed:

The metadata-table read counts were the main thing these FileOperations tests pinned down - how many low-level reads each query issues against the metadata table. After #19004 the metadata-table dimension is no longer asserted at all, so read-amplification regressions on the Trino read path are now invisible to CI. (The Alluxio cache-hit dimension is separately re-covered by the count-independent testReadsServedFromAlluxioCache added in the same PR, so only the metadata-table dimension is uncovered.)

Background: why the obvious fixes do not work

Trino resets the OpenTelemetry span exporter at the start of each executeWithPlan, so any span emitted by a Hudi background thread (the shared split-loader / split-manager / ForkJoinPool.commonPool pools that read the metadata table) after the synchronous query returns lands in the next measurement window. The result is a symmetric off-by-N: one query is counted long and the paired query short by almost the same amount.

  • An exact-count assertion on metadata-table spans flakes - this is the original failure.
  • A tolerance / lower-bound assertion on metadata-table spans still flakes, because the leak is bidirectional: a query can be counted short (its own spans leaked out) as well as long, and a lower bound is violated by the short case. This is the key difference from the Alluxio cache-hit check, where leaked spans only ever add hits (monotonic), so a lower bound there is safe.
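The two failure modes above can be illustrated with a toy model. This is a minimal sketch, not code from the Hudi or Trino test suites: the class `SpanLeakSimulation`, its methods, and the numbers (10 reads per query, 3 leaked spans) are all hypothetical. It assumes two back-to-back queries that each truly issue the same number of metadata-table reads, with some of the first query's spans emitted after its window closed.

```java
// Toy model of the span leak between two adjacent measurement windows.
// All names and numbers here are illustrative assumptions, not Hudi/Trino APIs.
class SpanLeakSimulation {

    /**
     * Each query truly emits readsPerQuery metadata-table spans. `leaked` spans of
     * the first query arrive after its window was reset, so they are attributed to
     * the second query. Returns {countInFirstWindow, countInSecondWindow}.
     */
    static int[] observe(int readsPerQuery, int leaked) {
        return new int[] {readsPerQuery - leaked, readsPerQuery + leaked};
    }

    /** Exact-count assertion: passes only when nothing leaked in or out. */
    static boolean exactCountHolds(int observed, int expected) {
        return observed == expected;
    }

    /** Lower-bound assertion: tolerates extra spans, but not missing ones. */
    static boolean lowerBoundHolds(int observed, int expected) {
        return observed >= expected;
    }

    public static void main(String[] args) {
        int expected = 10;
        int[] windows = observe(expected, 3); // first window sees 7, second sees 13

        // The exact check fails for both windows.
        System.out.println("exact, first:  " + exactCountHolds(windows[0], expected));
        System.out.println("exact, second: " + exactCountHolds(windows[1], expected));

        // The lower bound survives the long window but not the short one.
        System.out.println("lower, first:  " + lowerBoundHolds(windows[0], expected));
        System.out.println("lower, second: " + lowerBoundHolds(windows[1], expected));
    }
}
```

In this model the short window is what breaks the lower bound. A monotonic signal such as the Alluxio cache-hit count has no short case, which is why a lower bound is safe there.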

Candidate directions (to validate, not decided)

These are hypotheses for the follow-up, not a committed design:

  1. Aggregate / conservation assertion. The leak shifts spans between adjacent windows but does not create or destroy them, so the total metadata-table read count across the paired measurements (or across the whole test class) should be conserved even though the per-query split is not. Asserting that aggregate would still catch a 2x amplification (which doubles the total) while tolerating the attribution jitter. Needs validation that nothing leaks past the chosen aggregation boundary (for example the last query's late spans).
  2. Deterministic drain / quiesce of the background metadata-table reader pools before the measurement window closes, so the metadata-table spans are captured inside the synchronous query window and exact counts become deterministic again. The obstacle is that those pools are shared / global with no clean await hook exposed to the Trino test harness.
  3. (Recorded as rejected) A span-stability poll, the approach taken in #18766 ("test(trino): de-flake TestHudi*FileOperations by polling for span stability"), did not bound the race and is not a path to revisit.
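Directions 1 and 2 could look roughly like the sketch below. Everything in it is an assumption for illustration: the class `AggregateReadCheck`, its method names, and the representation of each measurement window as a plain `Map` from operation name to count are hypothetical and do not match the real test helpers. The only real API used is `ForkJoinPool.commonPool().awaitQuiescence(...)`, and it covers only the common pool, not the shared split-loader / split-manager pools named above.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;

// Hypothetical helpers sketching candidate directions 1 and 2; not the actual test code.
class AggregateReadCheck {

    /** Direction 1: sum metadata-table read counts over all measurement windows. */
    static int totalReads(List<Map<String, Integer>> windows) {
        int total = 0;
        for (Map<String, Integer> window : windows) {
            for (int count : window.values()) {
                total += count;
            }
        }
        return total;
    }

    /**
     * Asserts on the aggregate only. Spans that moved between windows do not change
     * the sum, while a 2x read amplification doubles it.
     */
    static void assertTotalReads(List<Map<String, Integer>> windows, int expectedTotal) {
        int actual = totalReads(windows);
        if (actual != expectedTotal) {
            throw new AssertionError(
                    "metadata-table reads: expected " + expectedTotal + " but saw " + actual);
        }
    }

    /**
     * Direction 2 (partial): wait for the common pool to go idle before the window
     * closes. Returns false on timeout. Other shared pools would need their own hook.
     */
    static boolean quiesceCommonPool(long timeoutMillis) {
        return ForkJoinPool.commonPool().awaitQuiescence(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}
```

With the per-window counts 7 and 13 from a leak of 3 spans, `assertTotalReads` still sees 20 and passes. As the issue notes for direction 1, this only holds if no spans escape past the last window included in the sum.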

Task Type

Test enhancement

Labels: area:tests (Testing-related), type:devtask (Development tasks and maintenance work)