fix(flink): fix the mor small file record size estimation#18991
hudi-agent
left a comment
🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.
Thanks for the contribution! The change tightens MOR write profiling by normalizing log bytes via the existing log-to-parquet compression ratio and always refreshing recordsPerBucket from profiled record size. Both changes line up with what DeltaWriteProfile.getTotalFileSize already does for small-file sizing, so the downstream SmallFileAssignState arithmetic stays consistent. The overall change is clean and well-structured. The inline comments flag a couple of edge cases worth double-checking, plus small naming and literal-type nits, in addition to the points the other reviewers have already raised. Please take a look at them, and this should be ready for a Hudi committer or PMC member to take it from here.
@@ -228,10 +232,8 @@ private void cleanMetadataCache(Stream<HoodieInstant> instants) {
private void recordProfile() {
🤖 Removing the shouldAllowMultiWriteOnSameInstant gate means COPY_ON_WRITE_INSERT_SPLIT_SIZE is now silently ignored on the Flink side, while the Spark UpsertPartitioner still honors it when shouldAutoTuneInsertSplits() is false. Was the intent to fully deprecate that config for Flink, or should this be gated on shouldAutoTuneInsertSplits() (defaulting to true) to mirror Spark? @danny0405
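For illustration only, the gating asked about above could look roughly like the following. This is a minimal sketch, not the PR's code: SplitConfig is a hypothetical stand-in for the write config, the getter name getCopyOnWriteInsertSplitSize() is assumed, and the fixed(...) factory exists only to make the example runnable.

```java
/** Hypothetical sketch of gating bucket sizing on auto-tuning; not Hudi's actual code. */
final class InsertSplitGateSketch {

  /** Hypothetical minimal view of the write config used by this sketch. */
  interface SplitConfig {
    boolean shouldAutoTuneInsertSplits();

    long getCopyOnWriteInsertSplitSize(); // assumed getter for COPY_ON_WRITE_INSERT_SPLIT_SIZE

    long getParquetMaxFileSize();
  }

  /** Uses the profiled size when auto-tuning is on, else the configured split size. */
  static long recordsPerBucket(SplitConfig config, long avgSize) {
    if (config.shouldAutoTuneInsertSplits() && avgSize > 0) {
      return config.getParquetMaxFileSize() / avgSize;
    }
    return config.getCopyOnWriteInsertSplitSize();
  }

  /** Test helper: a config with fixed values. */
  static SplitConfig fixed(boolean autoTune, long splitSize, long maxFileSize) {
    return new SplitConfig() {
      @Override
      public boolean shouldAutoTuneInsertSplits() {
        return autoTune;
      }

      @Override
      public long getCopyOnWriteInsertSplitSize() {
        return splitSize;
      }

      @Override
      public long getParquetMaxFileSize() {
        return maxFileSize;
      }
    };
  }
}
```

With auto-tuning on, the profiled record size drives the bucket capacity; with it off, the configured split size is returned unchanged, which is the Spark-side behavior the comment describes.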
- AI-generated; verify before applying. React 👍/👎 to flag quality.
hudi-agent
left a comment
🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.
Thanks for the updates! The rename to fileSizeCalibrationRatio() and the 1D literal address the prior nit/readability comments, and the new logFileToParquetCompressionRatio() helper, together with its accompanying test, correctly skips the compression ratio when log blocks are already stored as parquet. The open items from the prior round are not addressed in this revision and are worth a committer's judgment: wombatu-kun's notes on recordsPerBucket being set unconditionally and on MOR commits that mix base-parquet and log bytes (e.g., bootstrap inserts), plus the shouldAllowMultiWriteOnSameInstant question. Please take a look at any inline comments, and this should be ready for a Hudi committer or PMC member to take it from here.
Flaky Trino CI should be fixed by #19004.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #18991 +/- ##
============================================
- Coverage 68.24% 67.64% -0.61%
- Complexity 29478 29795 +317
============================================
Files 2542 2562 +20
Lines 142541 145178 +2637
Branches 17798 18337 +539
============================================
+ Hits 97281 98204 +923
- Misses 37254 38753 +1499
- Partials 8006 8221 +215
}
}
}
log.info("Refresh average bytes per record => " + avgSize);
Nit:
log.info("Refresh average bytes per record => {}", avgSize);
log.info("Refresh insert records per bucket => " + recordsPerBucket);
}
this.recordsPerBucket = config.getParquetMaxFileSize() / avgSize;
log.info("Refresh insert records per bucket => " + recordsPerBucket);
Nit:
log.info("Refresh insert records per bucket => {}", recordsPerBucket);
Describe the issue this Pull Request addresses
Flink MERGE_ON_READ write profiling estimates how many records can fit into small-file and insert buckets based on average record size. For MOR tables, commit metadata may include log-file bytes, which need to be normalized to estimated parquet size using hoodie.logfile.to.parquet.compression.ratio. Without that adjustment, Flink can overestimate record size and under-fill buckets.
The write profile also only refreshed recordsPerBucket from profiled average record size when multi-write-on-same-instant was enabled. That made normal writes continue using the configured split-size fallback instead of the current profile-derived capacity.
Summary and Changelog
This PR updates Flink write profiling so MOR average record size uses the existing log-to-parquet compression ratio, and insert bucket capacity is refreshed from profiled record size consistently.
- Add a fileSizeParquetCompressionRatio() hook in WriteProfile, defaulting to 1.
- Update DeltaWriteProfile to use config.getLogFileToParquetCompressionRatio().
- Refresh recordsPerBucket from parquetMaxFileSize / avgSize unconditionally during profile construction/reload.
- Update TestBucketAssigner to cover profile-derived bucket sizing and MOR commit-metadata sizing with compression ratio.
Impact
This affects Flink sink bucket assignment for Hudi write profiles, especially MERGE_ON_READ tables. MOR small-file and insert bucket sizing should better reflect estimated compacted parquet size.
There is no public API change, storage format change, config rename, or compatibility break. Existing configuration is reused.
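The sizing arithmetic described above can be sketched as follows. This is a stand-alone illustration under stated assumptions, not the code in this PR: the class and method names are hypothetical, the 0.35 ratio and 120 MB max file size are example values, and the rounding choice is the sketch's own.

```java
/** Hypothetical, stand-alone sketch of the sizing arithmetic; not Hudi's actual classes. */
final class WriteProfileSizingSketch {

  /**
   * Average bytes per record after scaling the bytes reported in commit
   * metadata by a calibration ratio: 1.0 for parquet bytes, the
   * log-to-parquet compression ratio for log bytes.
   */
  static long averageBytesPerRecord(long bytesWritten, long recordsWritten,
                                    double calibrationRatio, long fallbackSize) {
    if (recordsWritten <= 0) {
      return fallbackSize; // nothing profiled yet, keep the configured estimate
    }
    long avg = Math.round(bytesWritten * calibrationRatio / recordsWritten);
    return Math.max(avg, 1L); // keeps the division in recordsPerBucket safe
  }

  /** Insert bucket capacity derived from the profiled record size. */
  static long recordsPerBucket(long parquetMaxFileSize, long avgSize) {
    return parquetMaxFileSize / avgSize;
  }

  public static void main(String[] args) {
    long maxFileSize = 120L * 1024 * 1024; // example value: 120 MB
    long raw = averageBytesPerRecord(1_000_000L, 1_000L, 1.0, 1024L);
    long normalized = averageBytesPerRecord(1_000_000L, 1_000L, 0.35, 1024L);
    // Raw log bytes: 1000 bytes/record, 125829 records per bucket.
    System.out.println(raw + " -> " + recordsPerBucket(maxFileSize, raw));
    // Normalized: 350 bytes/record, 359511 records per bucket.
    System.out.println(normalized + " -> " + recordsPerBucket(maxFileSize, normalized));
  }
}
```

In this example, treating 1,000,000 log bytes as parquet bytes yields a bucket capacity of 125829 records, while normalizing them first yields 359511, which is the under-filling effect the description refers to.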
Risk Level
medium
The change is in Flink write-path sizing logic and can affect how records are assigned to insert buckets and small files. The risk is mitigated by targeted coverage in TestBucketAssigner, including a real MOR write commit and validation that average record size and records-per-bucket use the compression-adjusted metadata estimate.
Validated with:
mvn -pl hudi-flink-datasource/hudi-flink -am -Dtest=TestBucketAssigner -Dsurefire.failIfNoSpecifiedTests=false test
Documentation Update
none
Contributor's checklist