[MINOR] Cap UT_FT_10 Azure install to -T 2 to avoid flaky compiler heap OOM #19008
Merged
danny0405 merged 1 commit on Jun 16, 2026
…ap OOM The UT_FT_10 Azure job builds the full reactor with `mvn clean install` under the shared MVN_OPTS_INSTALL `-T 3` inside a single `-Xmx8g` JVM, and intermittently OOMs during compilation (most recently hudi-kafka-connect-bundle) when several heavy concurrent builds align in time. Prepend `-T 2` to this job's install so Maven (first -T wins) caps its concurrency to 2 while every other job keeps the shared -T 3. Local measurement: the full install peaks at ~2.4 GB at -T 1 and -T 2 vs ~2.9 GB at -T 3, all well under the 8g ceiling, and -T 2 keeps essentially the same wall-clock as -T 3.
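The "first -T wins" ordering described above can be sketched in plain shell. This is only a model of the behaviour the PR reports, not Maven's own option parser: `effective_threads` is a hypothetical helper, the `-DskipTests` flag inside `MVN_OPTS_INSTALL` is an assumed placeholder for whatever else the shared options hold, and the Azure macro `$(MVN_OPTS_INSTALL)` is written as an ordinary shell variable here.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first -T option in an argument
# list, mirroring the "first -T wins" behaviour the PR reports for Maven.
effective_threads() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      -T)
        # Separate form: "-T 2"
        echo "$2"
        return 0
        ;;
      -T?*)
        # Attached form: "-T2"
        echo "${1#-T}"
        return 0
        ;;
    esac
    shift
  done
  # No -T at all: Maven builds single-threaded by default.
  echo 1
}

# Assumed contents of the shared options; only "-T 3" is taken from the PR.
MVN_OPTS_INSTALL="-T 3 -DskipTests"

# Job-specific install line: -T 2 is placed ahead of the shared options.
# Word splitting of the option string is intended here.
effective_threads clean install -T 2 $MVN_OPTS_INSTALL
```

Run as written, the last line prints `2`; swapping the order so the shared options come first would print `3`, which matches the precedence check quoted in the PR.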
hudi-agent (Contributor) reviewed on Jun 15, 2026 and left a comment:
🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.
No reviewable code files in this PR.
cc @yihua
wombatu-kun pushed a commit to wombatu-kun/hudi that referenced this pull request on Jun 15, 2026
Collaborator
danny0405 approved these changes on Jun 16, 2026
Describe the issue this Pull Request addresses
The Azure CI job "UT FT common & other modules" (UT_FT_10) intermittently fails its initial full-reactor `mvn clean install` with `maven-compiler-plugin:compile ... Bad service configuration file, or exception thrown while constructing Processor object: Java heap space`, most recently while compiling `hudi-kafka-connect-bundle` (for example the Azure run for PR #19004, buildId 14704, where the change itself is unrelated test-only code). The job builds the entire reactor with `-T 3` inside a single `-Xmx8g` JVM on a memory-constrained Azure agent, so when several heavy module builds align in time the shared heap occasionally exceeds the 8g ceiling and OOMs. The annotation-processor line in the message is only where the allocation tips over, not the root cause.
Summary and Changelog
This scopes a lower build parallelism to just the job that OOMs, without touching any source code or the shared install options used by the other jobs. UT_FT_10's `clean install` now prepends `-T 2` before `$(MVN_OPTS_INSTALL)` (which contains `-T 3`). Maven uses the first `-T` it sees on the command line (verified: `-T 2 -T 3` resolves to a thread count of 2, `-T 3 -T 2` to 3), so the effective thread count for this job's install becomes 2 while every other job keeps the shared `-T 3`. Lowering the concurrency from 3 to 2 reduces how many heavy compiles and shade operations can run at the same time, and therefore the peak heap.
The approach was chosen after measuring the heap profile of the full `clean install` locally: peak heap is about 2.4 GB at `-T 1`, about 2.4 GB at `-T 2`, and about 2.9 GB at `-T 3`, all far below the 8 GB ceiling, which shows the failure is a rare concurrency-driven tail spike rather than a systematic over-use of memory. `-T 2` keeps essentially the same wall-clock as `-T 3` (about half a minute slower in the local run) while bringing the measured peak heap back down to the `-T 1` level. An earlier idea to disable annotation processing on the bundle modules was measured and rejected, because it did not change the compile's heap requirement at all. No code was copied from third-party sources.
Impact
CI-only change scoped to the UT_FT_10 Azure job. No production code, public API, configuration default, or runtime behavior changes. The job's install phase becomes slightly slower (about half a minute in the local measurement) in exchange for lower peak heap; all other CI jobs are unaffected.
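To put the locally measured peaks in proportion to the heap ceiling, a rough back-of-the-envelope calculation follows. It is a sketch under stated assumptions: `heap_share_pct` is a hypothetical helper, the approximate GB figures from the PR are converted to decimal MB, and the `-Xmx8g` ceiling is rounded to 8000 MB for simplicity.

```shell
#!/bin/sh
# Hypothetical helper: share of the heap ceiling used by a measured peak,
# as a whole-number percentage (integer division truncates the fraction).
# Arguments: peak in MB, ceiling in MB.
heap_share_pct() {
  echo $(( $1 * 100 / $2 ))
}

# Assumption: -Xmx8g rounded to decimal MB for a rough comparison.
CEILING_MB=8000

# Peaks quoted in the PR: about 2.4 GB at -T 1 and -T 2, about 2.9 GB at -T 3.
for entry in "1:2400" "2:2400" "3:2900"; do
  threads=${entry%%:*}
  peak_mb=${entry##*:}
  echo "-T $threads: ${peak_mb} MB peak, $(heap_share_pct "$peak_mb" "$CEILING_MB")% of ceiling"
done
```

Under these assumptions the peaks come out at roughly 30% of the ceiling for `-T 1` and `-T 2` and roughly 36% for `-T 3`, which is consistent with the PR's reading that the CI failure is a rare tail spike and not steady over-use of memory.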
Risk Level
none
CI-only change that lowers build parallelism for a single job. It cannot affect build output or test results, and the verified `-T` precedence guarantees the intended thread count.
Documentation Update
none
Contributor's checklist