docs(configs): START_COMMIT is exclusive, not inclusive #18955

Draft
yihua wants to merge 2 commits into apache:asf-site from yihua:fix-start-commit-doc-website

Conversation

@yihua yihua commented Jun 10, 2026

Contributor

Describe the issue this Pull Request addresses

The published configuration pages on the Hudi docs site describe hoodie.datasource.read.begin.instanttime with the sentence "New data written with completion_time >= START_COMMIT are fetched out". This contradicts the actual runtime behavior, which treats START_COMMIT as exclusive:

  • V1 relation: the timeline filter is findInstantsInRange(start, end), which selects the start-exclusive, end-inclusive range (start, end].
  • V2 relation: the range type defaults to RangeType.OPEN_CLOSED after the apache/hudi PR that made the start commit exclusive.
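
To make the difference concrete, here is a minimal sketch of the two range semantics. It is not Hudi code: the helper names and the timestamp values are hypothetical, and it assumes completion times are fixed-width timestamp strings that can be compared lexicographically.

```python
# Illustrative only: these helpers are hypothetical and do not exist in Hudi.
# Assumption: completion times are fixed-width timestamp strings, so plain
# string comparison orders them chronologically.

def in_open_closed_range(completion_time: str, start: str, end: str) -> bool:
    """(start, end]: start is exclusive, end is inclusive (runtime behavior)."""
    return start < completion_time <= end


def in_closed_closed_range(completion_time: str, start: str, end: str) -> bool:
    """[start, end]: start is inclusive (what the old docs implied)."""
    return start <= completion_time <= end


# Three made-up completion times on a timeline.
timeline = ["20260609200000000", "20260609200600000", "20260609201200000"]
start_commit, end_commit = timeline[0], timeline[2]

exclusive_start = [t for t in timeline
                   if in_open_closed_range(t, start_commit, end_commit)]
inclusive_start = [t for t in timeline
                   if in_closed_closed_range(t, start_commit, end_commit)]

# With an exclusive start, the instant equal to START_COMMIT is not returned.
print(exclusive_start)
print(inclusive_start)
```

Under these assumptions, the instant whose completion time equals START_COMMIT appears only in the inclusive result, which is the gap between the old wording and the runtime behavior.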

A companion PR in apache/hudi (master branch) updates the underlying DataSourceOptions.scala config description.

Summary and Changelog

Updates the latest, 1.1.x, and 1.2.x configuration pages to reflect that START_COMMIT is exclusive: ">" instead of ">=", and "strictly after" instead of "on or after". Six files are touched: configurations.md and basic_configurations.md under each of website/docs, website/versioned_docs/version-1.1.1, and website/versioned_docs/version-1.2.0.

Impact

Documentation only. No code change.

Risk Level

none

Documentation Update

This PR is the documentation update.

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

yihua added 2 commits June 9, 2026 20:06
Updates latest, 1.1.x, and 1.2.x configuration pages to reflect that
Spark's incremental query treats the START_COMMIT option as exclusive
(completion_time > START_COMMIT), matching the V1 relation's start-
exclusive findInstantsInRange and the V2 relation's RangeType.OPEN_CLOSED.
Also updates docs for hoodie.datasource.read.incr.table.version and
hoodie.datasource.read.streaming.table.version (they override the
detected source table version and thus the time-semantics).