Skip to content

fix: set maturity on parquet splits#6399

Open
mattmkim wants to merge 5 commits into
mainfrom
matthew.kim/byoc-metrics-maturity-timestamp
Open

fix: set maturity on parquet splits#6399
mattmkim wants to merge 5 commits into
mainfrom
matthew.kim/byoc-metrics-maturity-timestamp

Conversation

@mattmkim
Copy link
Copy Markdown
Contributor

@mattmkim mattmkim commented May 6, 2026

Description

We were not setting maturity value on splits in the metastore, so they were all being set to 0. Sets maturity value on the ingest/merge path on the uploader, similar to what logs does.

Need this for compaction, so we can fetch immature splits, and correctly designate when splits are no longer candidates for compaction.

How was this PR tested?

Describe how you tested this PR.

@mattmkim mattmkim requested review from a team as code owners May 6, 2026 18:19
@mattmkim
Copy link
Copy Markdown
Contributor Author

mattmkim commented May 6, 2026

@codex review

@mattmkim mattmkim changed the title set maturity on parquet splits fix: set maturity on parquet splits May 6, 2026
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Keep it up!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4cb24ace41

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +270 to +271
if split.is_mature(now) {
continue;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Recompute maturity for legacy parquet splits

When a cluster restarts after upgrading, already-published parquet splits serialized before this new maturity field are deserialized with the field's default (Mature), so this split.is_mature(now) check discards them even if the merge policy check immediately above classifies their size/merge count as still immature. This hits the exact upgrade path where existing splits were created before the uploader started populating maturity, leaving those small legacy splits out of compaction; use the policy-derived maturation period for missing/defaulted metadata or otherwise distinguish legacy records instead of treating them as mature here.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

pub mature: Bound<OffsetDateTime>,

P2 Badge Default the new query field for old clients

When a newer metastore receives ListMetricsSplitsRequest/ListSketchSplitsRequest from a still-old indexing or query node, the query_json produced before this change will not contain mature; deserialize_query() calls serde_utils::from_json_str directly, so serde will reject those requests as missing this newly required field instead of treating it as unbounded. Add #[serde(default)] to keep parquet split listing compatible during rolling upgrades and with any stored/test JSON that predates this field.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants