8 changes: 4 additions & 4 deletions website/docs/basic_configurations.md
Expand Up @@ -93,10 +93,10 @@ Options useful for reading tables via `read.format.option(...)`

| Config Name | Default | Description |
| -------------------------------------------------------------------------------------------- | --------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| [hoodie.datasource.read.begin.instanttime](#hoodiedatasourcereadbegininstanttime) | (N/A) | Required when `hoodie.datasource.query.type` is set to `incremental`. Represents the completion time to start incrementally pulling data from. The completion time here need not necessarily correspond to an instant on the timeline. New data written with completion_time &gt;= START_COMMIT are fetched out. For e.g: ‘20170901080000’ will get all new data written on or after Sep 1, 2017 08:00AM.<br />`Config Param: START_COMMIT`<br />`Since Version: 0.9.0` |
-| [hoodie.datasource.read.end.instanttime](#hoodiedatasourcereadendinstanttime) | (N/A) | Used when `hoodie.datasource.query.type` is set to `incremental`. Represents the completion time to limit incrementally fetched data to. When not specified latest commit completion time from timeline is assumed by default. When specified, new data written with completion_time &lt;= END_COMMIT are fetched out. Point in time type queries make more sense with begin and end completion times specified.<br />`Config Param: END_COMMIT`<br />`Since Version: 0.9.0` |
-| [hoodie.datasource.read.incr.table.version](#hoodiedatasourcereadincrtableversion) | (N/A) | The table version assumed for incremental read<br />`Config Param: INCREMENTAL_READ_TABLE_VERSION`<br />`Since Version: 1.0.0` |
-| [hoodie.datasource.read.streaming.table.version](#hoodiedatasourcereadstreamingtableversion) | (N/A) | The table version assumed for streaming read<br />`Config Param: STREAMING_READ_TABLE_VERSION`<br />`Since Version: 1.0.0` |
+| [hoodie.datasource.read.begin.instanttime](#hoodiedatasourcereadbegininstanttime) | (N/A) | Required when `hoodie.datasource.query.type` is set to `incremental`. The start point (exclusive) to begin incrementally pulling data from. The semantics depend on the effective table version (overridable via `hoodie.datasource.read.incr.table.version` for incremental reads or `hoodie.datasource.read.streaming.table.version` for streaming reads; otherwise the source table's actual version): version 8 or later treats this as a completion time, earlier versions (e.g., version 6) treat it as a requested time (instant time). The value need not necessarily correspond to an instant on the timeline. New data written strictly after START_COMMIT are fetched out. For e.g. ‘20170901080000’ will get all new data written strictly after Sep 1, 2017 08:00AM.<br />`Config Param: START_COMMIT`<br />`Since Version: 0.9.0` |
+| [hoodie.datasource.read.end.instanttime](#hoodiedatasourcereadendinstanttime) | (N/A) | Used when `hoodie.datasource.query.type` is set to `incremental`. The end point (inclusive) to limit incrementally fetched data to. Same time-semantics rules as START_COMMIT: version 8 or later treats this as a completion time, earlier versions (e.g., version 6) treat it as a requested time (overridable via `hoodie.datasource.read.incr.table.version` or `hoodie.datasource.read.streaming.table.version`). When not specified, the latest committed instant from the timeline is used. Point in time type queries make more sense with both begin and end specified.<br />`Config Param: END_COMMIT`<br />`Since Version: 0.9.0` |
+| [hoodie.datasource.read.incr.table.version](#hoodiedatasourcereadincrtableversion) | (N/A) | Overrides the table version assumed for incremental reads. Version 8+ selects the V2 incremental relation (completion-time based START_COMMIT/END_COMMIT); earlier versions select the V1 relation (requested-time based). If unset, the source table's actual version is used.<br />`Config Param: INCREMENTAL_READ_TABLE_VERSION`<br />`Since Version: 1.0.0` |
+| [hoodie.datasource.read.streaming.table.version](#hoodiedatasourcereadstreamingtableversion) | (N/A) | Overrides the table version assumed for streaming reads. Version 8+ selects HoodieStreamSourceV2 (completion-time based START_COMMIT/END_COMMIT); earlier versions select HoodieStreamSourceV1 (requested-time based). If unset, the source table's actual version is used.<br />`Config Param: STREAMING_READ_TABLE_VERSION`<br />`Since Version: 1.0.0` |
| [hoodie.datasource.write.precombine.field](#hoodiedatasourcewriteprecombinefield) | (N/A) | Comma separated list of fields used in preCombining before actual write. When two records have the same key value, we will pick the one with the largest value for the precombine field, determined by Object.compareTo(..). For multiple fields if first key comparison is same, second key comparison is made and so on. This config is used for combining records within the same batch and also for merging using event time merge mode<br />`Config Param: READ_PRE_COMBINE_FIELD` |
| [hoodie.datasource.query.type](#hoodiedatasourcequerytype) | snapshot | Whether data needs to be read, in `incremental` mode (new data since an instantTime) (or) `read_optimized` mode (obtain latest view, based on base files) (or) `snapshot` mode (obtain latest view, by merging base and (if any) log files)<br />`Config Param: QUERY_TYPE`<br />`Since Version: 0.9.0` |
---
8 changes: 4 additions & 4 deletions website/docs/configurations.md
Expand Up @@ -123,10 +123,10 @@ Options useful for reading tables via `read.format.option(...)`

| Config Name | Default | Description |
| -------------------------------------------------------------------------------------------- | --------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| [hoodie.datasource.read.begin.instanttime](#hoodiedatasourcereadbegininstanttime) | (N/A) | Required when `hoodie.datasource.query.type` is set to `incremental`. Represents the completion time to start incrementally pulling data from. The completion time here need not necessarily correspond to an instant on the timeline. New data written with completion_time &gt;= START_COMMIT are fetched out. For e.g: ‘20170901080000’ will get all new data written on or after Sep 1, 2017 08:00AM.<br />`Config Param: START_COMMIT`<br />`Since Version: 0.9.0` |
-| [hoodie.datasource.read.end.instanttime](#hoodiedatasourcereadendinstanttime) | (N/A) | Used when `hoodie.datasource.query.type` is set to `incremental`. Represents the completion time to limit incrementally fetched data to. When not specified latest commit completion time from timeline is assumed by default. When specified, new data written with completion_time &lt;= END_COMMIT are fetched out. Point in time type queries make more sense with begin and end completion times specified.<br />`Config Param: END_COMMIT`<br />`Since Version: 0.9.0` |
-| [hoodie.datasource.read.incr.table.version](#hoodiedatasourcereadincrtableversion) | (N/A) | The table version assumed for incremental read<br />`Config Param: INCREMENTAL_READ_TABLE_VERSION`<br />`Since Version: 1.0.0` |
-| [hoodie.datasource.read.streaming.table.version](#hoodiedatasourcereadstreamingtableversion) | (N/A) | The table version assumed for streaming read<br />`Config Param: STREAMING_READ_TABLE_VERSION`<br />`Since Version: 1.0.0` |
+| [hoodie.datasource.read.begin.instanttime](#hoodiedatasourcereadbegininstanttime) | (N/A) | Required when `hoodie.datasource.query.type` is set to `incremental`. The start point (exclusive) to begin incrementally pulling data from. The semantics depend on the effective table version (overridable via `hoodie.datasource.read.incr.table.version` for incremental reads or `hoodie.datasource.read.streaming.table.version` for streaming reads; otherwise the source table's actual version): version 8 or later treats this as a completion time, earlier versions (e.g., version 6) treat it as a requested time (instant time). The value need not necessarily correspond to an instant on the timeline. New data written strictly after START_COMMIT are fetched out. For e.g. ‘20170901080000’ will get all new data written strictly after Sep 1, 2017 08:00AM.<br />`Config Param: START_COMMIT`<br />`Since Version: 0.9.0` |
+| [hoodie.datasource.read.end.instanttime](#hoodiedatasourcereadendinstanttime) | (N/A) | Used when `hoodie.datasource.query.type` is set to `incremental`. The end point (inclusive) to limit incrementally fetched data to. Same time-semantics rules as START_COMMIT: version 8 or later treats this as a completion time, earlier versions (e.g., version 6) treat it as a requested time (overridable via `hoodie.datasource.read.incr.table.version` or `hoodie.datasource.read.streaming.table.version`). When not specified, the latest committed instant from the timeline is used. Point in time type queries make more sense with both begin and end specified.<br />`Config Param: END_COMMIT`<br />`Since Version: 0.9.0` |
+| [hoodie.datasource.read.incr.table.version](#hoodiedatasourcereadincrtableversion) | (N/A) | Overrides the table version assumed for incremental reads. Version 8+ selects the V2 incremental relation (completion-time based START_COMMIT/END_COMMIT); earlier versions select the V1 relation (requested-time based). If unset, the source table's actual version is used.<br />`Config Param: INCREMENTAL_READ_TABLE_VERSION`<br />`Since Version: 1.0.0` |
+| [hoodie.datasource.read.streaming.table.version](#hoodiedatasourcereadstreamingtableversion) | (N/A) | Overrides the table version assumed for streaming reads. Version 8+ selects HoodieStreamSourceV2 (completion-time based START_COMMIT/END_COMMIT); earlier versions select HoodieStreamSourceV1 (requested-time based). If unset, the source table's actual version is used.<br />`Config Param: STREAMING_READ_TABLE_VERSION`<br />`Since Version: 1.0.0` |
| [hoodie.datasource.write.precombine.field](#hoodiedatasourcewriteprecombinefield) | (N/A) | Comma separated list of fields used in preCombining before actual write. When two records have the same key value, we will pick the one with the largest value for the precombine field, determined by Object.compareTo(..). For multiple fields if first key comparison is same, second key comparison is made and so on. This config is used for combining records within the same batch and also for merging using event time merge mode<br />`Config Param: READ_PRE_COMBINE_FIELD` |
| [hoodie.datasource.query.type](#hoodiedatasourcequerytype) | snapshot | Whether data needs to be read, in `incremental` mode (new data since an instantTime) (or) `read_optimized` mode (obtain latest view, based on base files) (or) `snapshot` mode (obtain latest view, by merging base and (if any) log files)<br />`Config Param: QUERY_TYPE`<br />`Since Version: 0.9.0` |
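
For reviewers, here is a minimal Python sketch of how the updated descriptions read in practice. It is an illustration only, not Hudi code: the helper names `incremental_read_options` and `in_window` are hypothetical, and `in_window` models only the boundary rule (start exclusive, end inclusive). It does not model whether the compared timestamp is a completion time or a requested time, which depends on the effective table version described above.

```python
from typing import Dict, Optional


def incremental_read_options(
    begin: str,
    end: Optional[str] = None,
    table_version: Optional[int] = None,
) -> Dict[str, str]:
    """Build reader options for an incremental query (hypothetical helper)."""
    opts = {
        "hoodie.datasource.query.type": "incremental",
        "hoodie.datasource.read.begin.instanttime": begin,
    }
    if end is not None:
        # When omitted, the latest committed instant on the timeline is used.
        opts["hoodie.datasource.read.end.instanttime"] = end
    if table_version is not None:
        # Overrides the table version assumed for incremental reads.
        opts["hoodie.datasource.read.incr.table.version"] = str(table_version)
    return opts


def in_window(instant: str, begin: str, end: Optional[str] = None) -> bool:
    """Boundary rule only: begin is exclusive, end is inclusive.

    Timestamps of equal length (e.g. yyyyMMddHHmmss) compare correctly as strings.
    """
    return instant > begin and (end is None or instant <= end)


opts = incremental_read_options("20170901080000", table_version=8)
# With a live SparkSession `spark` and a table path `base_path` (both assumed):
# df = spark.read.format("hudi").options(**opts).load(base_path)
```

Under this reading, data stamped exactly `20170901080000` is excluded when that value is the begin time, which is the behavioral difference from the removed `>=` wording.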
