test(trino): de-flake TestHudi*FileOperations by asserting only synchronous reads#19004
Conversation
hudi-agent
left a comment
There was a problem hiding this comment.
🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.
Thanks for working on this! This PR de-flakes the three Trino-plugin file-operation tests by filtering out metadata-table (and Alluxio) spans whose timing is driven by background threads, and keeps assertions only on the synchronous foreground reads — a reasonable trade-off given prior timing-based attempts proved fragile. No issues flagged from this automated pass — a Hudi committer or PMC member can take it from here for a final review.
cc @yihua
…ronous reads PR apache#18995 disabled the async table-statistics refresh to de-flake the three TestHudi*FileOperations classes, but they kept flaking on master with the same symmetric off-by-N mismatch in the metadata-table span counts. The stats refresh was only one of several asynchronous sources of filesystem-access spans. These tests assert the exact multiset of low-level filesystem spans a query emits. Trino resets the span exporter at the start of each executeWithPlan, so any span a background thread emits after the synchronous query returns lands in the next test's measurement window and scrambles the counts. The Hudi read path emits such spans from shared background pools that read the metadata table (split loading, partition listing, index support) and, for the Alluxio variant, from asynchronous cache population whose hit/miss outcome depends on whether an earlier cache write had completed. Rather than continue trying to time these background tasks, this change stops asserting the quantities they produce and keeps only the operations that happen synchronously on the foreground planning/scan path. getFileOperations now excludes METADATA_TABLE operations in all three classes, and additionally excludes all Alluxio.* operations in the Alluxio class; the corresponding expected entries are removed. hudi.table-statistics-enabled=false is kept because the stats executor also reads the index definition and table-property files on a background thread, which are part of the surviving asserted set. No production code is changed. Verified under JDK 23 / trino-root 472: the build succeeds with zero checkstyle violations and the three classes pass 20 consecutive runs.
939b5ce to
0f6b01c
Compare
hudi-agent
left a comment
There was a problem hiding this comment.
🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.
Thanks for working on this! The PR de-flakes three Trino-plugin file-operation tests by filtering out metadata-table operations (and Alluxio.* spans in the Alluxio variant) whose counts depend on background-thread timing, and trimming the expected multisets accordingly. The trade-off — reduced assertion coverage in exchange for determinism — is reasonable and the rationale is well-documented in the new comments. No issues flagged from this automated pass — a Hudi committer or PMC member can take it from here for a final review.
cc @yihua
| assertFileSystemAccesses( | ||
| query, | ||
| ImmutableMultiset.<FileOperation>builder() | ||
| .addCopies(new FileOperation("Alluxio.readCached", DATA), 2) |
There was a problem hiding this comment.
We need the test coverage on the caching functionality. Could we run the tests sequentially or independently to avoid race conditions that cause the flakiness?
There was a problem hiding this comment.
Done 189f4e0 - added testReadsServedFromAlluxioCache, which warms the cache and then asserts at least one Alluxio.readCached span. It is a lower bound rather than an exact count, so it is immune to the off-by-N leak, and it is wrapped in assertEventually to tolerate the asynchronous cache write. It passed 8 consecutive runs locally.
On running them sequentially or independently: these three classes already run isolated and single-threaded. Each is annotated @Execution(SAME_THREAD) and they share @ResourceLock("HUDI_CACHE_SYSTEM"), so they never run concurrently with one another and never interleave their own methods. The flake is not a race between tests. The spans leak because Trino resets the span exporter at the start of each executeWithPlan, while Hudi's background pools (split loading, partition listing, index support) and Alluxio's asynchronous cache population keep emitting spans after the synchronous query has returned, so those spans land in the next query's measurement window even on a single thread. That is why the earlier span-stability poll (#18766) and disabling the stats refresh (#18995) did not stop it, and why the remaining exact Alluxio.* counts cannot be made deterministic by ordering alone.
There was a problem hiding this comment.
@yihua could you please take a second look when you get chance.
It just fails too frequently, want to fix it asap.
The previous change stopped asserting the exact Alluxio.* span counts in TestHudiAlluxioCacheFileOperations because the cache write is asynchronous, so the per-query hit/miss counts flake. That left the Alluxio class with no assertion that the filesystem cache is actually engaged. This adds testReadsServedFromAlluxioCache, which warms the cache and then asserts that re-running the query produces at least one Alluxio.readCached span. The assertion is a lower bound rather than an exact count, so it is immune to the symmetric off-by-N leakage, and it is wrapped in assertEventually so it retries until the asynchronous cache write has landed and fails loudly at a 30s deadline if the cache never serves a read. Test-only change. Verified under JDK 23 / trino-root 472: the build succeeds with zero checkstyle violations and TestHudiAlluxioCacheFileOperations passes 8 consecutive runs.
|
Still recurring on unrelated PRs. #19014 (a ZK lock-provider change that touches no Trino code) just hit it: Failure log: https://github.com/apache/hudi/actions/runs/27592976334/job/81577578595 All three span types fall under the |
voonhous
left a comment
There was a problem hiding this comment.
Code looks correct to me.
There is a loss of regression signal. Dropping all METADATA_TABLE counts removes the test's ability to catch metadata-table read amplification (e.g., a future change that doubles metadata reads would now pass silently).
This is the real cost of the change, which we are sacrificing in favour of removing the flakiness, so i guess it's justified.
The symmetric leak goes in both directions (a test can read long or short), so even a tolerance/lower-bound assertion on metadata-table counts would still flake, unlike the cache-presence check which only ever accumulates.
Calling out this trade-off explicitly so we can circle back and think of how to bridge this gap later, rather than discovering it later.
|
Filed #19037 to track restoring the metadata-table read-amplification signal without bringing the flake back. The issue records the candidate bridging approaches - an aggregate/conservation assertion over the paired measurements, and a deterministic drain of the background metadata-table reader pools - so we can pick it up separately. Leaving it out of this PR to keep the de-flake clean. |
Describe the issue this Pull Request addresses
The three Trino-plugin tests
TestHudiNoCacheFileOperations,TestHudiMemoryCacheFileOperationsandTestHudiAlluxioCacheFileOperationsare still flaky after #18995. They intermittently fail in CI with a symmetric off-by-N mismatch in the metadata-table file-operation counts between the two paired measurements (for example on master CI run 27489081724,testJoinwas long by 49Alluxio.readCached/ 7lastModified/ 16lengthwhiletestSelectWithFilterwas short by almost exactly the same amounts). #18995 disabled the async table-statistics refresh, but that was only one of several asynchronous sources of these spans, so the flake remained.Summary and Changelog
These tests assert the exact multiset of low-level filesystem-access spans a single query emits. Trino resets the span exporter at the start of each
executeWithPlan, so any span emitted by a background thread after the synchronous query returns lands in the next test's measurement window and scrambles the counts (the symmetric off-by-N signature). The Hudi read path produces such spans from shared background pools that read the metadata table (split loading, partition listing, index support) and, in the Alluxio case, from asynchronous filesystem-cache population whose hit/miss outcome depends on whether an earlier cache write had already completed.Instead of continuing to try to time these background tasks (a fixed
Thread.sleep, then a span-stability poll in #18766, then disabling stats in #18995), this PR stops asserting the quantities they produce and keeps only the operations that happen synchronously on the foreground planning/scan path, which are well-defined per query.getFileOperationsnow excludesMETADATA_TABLEoperations in all three classes and additionally excludes allAlluxio.*operations in the Alluxio class, and the corresponding expected entries are removed.hudi.table-statistics-enabled=falseis kept because the stats executor also reads the index-definition and table-property files on a background thread, which are part of the surviving asserted set.No production code is changed, and no code was copied from third-party sources.
Impact
Test-only change, scoped to three test classes in
hudi-trino-plugin. No production code, public API, configuration default, or runtime behavior changes.Risk Level
none
Documentation Update
none
Contributor's checklist