Skip to content

fix(catalog): abort doCommit on stale-base divergence#612

Merged
mkuchenbecker merged 1 commit into
linkedin:mainfrom
mkuchenbecker:mkuchenb/bug-repro-snapshot-rollback
May 29, 2026
Merged

fix(catalog): abort doCommit on stale-base divergence#612
mkuchenbecker merged 1 commit into
linkedin:mainfrom
mkuchenbecker:mkuchenb/bug-repro-snapshot-rollback

Conversation

@mkuchenbecker
Copy link
Copy Markdown
Collaborator

@mkuchenbecker mkuchenbecker commented May 28, 2026

Summary

Adds a catalog-level CAS in OpenHouseInternalTableOperations.doCommit that aborts when the writer's declared base (COMMIT_KEY) does not match the catalog's current persisted base — closing the BaseTransaction.applyUpdates silent-rebase variant of the stale-base lost-update bug (incident-12185).

Mechanism

  1. A writer's transaction stages COMMIT_KEY = T_X and a SNAPSHOTS_JSON_KEY payload computed against T_X.
  2. A racing commit advances the catalog T_X → T_Y, adding a snapshot.
  3. BaseTransaction.applyUpdates silently refreshes the in-flight base to T_Y and re-applies the staged update — re-stamping COMMIT_KEY = T_X on top of T_Y while leaving the stale SNAPSHOTS_JSON_KEY.
  4. doCommit runs with base = T_Y but COMMIT_KEY = T_X. Without the check, the subtractive snapshot merge computes toRemove = T_Y.snapshots() − stale payload = {racing snapshot} and silently drops it.

The fix reads COMMIT_KEY before failIfRetryUpdate strips it, URI-normalizes both paths via Hadoop Path, and throws CommitFailedException on mismatch so Iceberg retries against the fresh base.

Scope of the check

  • Aborts when COMMIT_KEY is a concrete location differing from the catalog base, or is INITIAL_VERSION.
  • Does not defend commits that leave COMMIT_KEY unset — wholesale replace/create (replaceTable, stage-create, stage-replace) are authoritative over the snapshot set, so there is no stale base to compare against.

Changes

Bug fix + unit test, internal catalog only (2 files). doCommit may now throw CommitFailedException on a stale-base commit that previously silently dropped a racing snapshot.

Testing

Unit test testDoCommitMustAbortStaleBaseRebaseToPreventSnapshotLoss in OpenHouseInternalTableOperationsTest round-trips the post-refresh TableMetadata through TableMetadataParser so base.metadataFileLocation() is non-null (matching a loaded-from-disk base) and the URI-normalized comparison runs. Asserts CommitFailedException is thrown and houseTableRepository.save is never invoked.

./gradlew :iceberg:openhouse:internalcatalog:test --tests "*OpenHouseInternalTableOperationsTest"

Not covered

A Spark concurrent-insert behavior test (SparkConcurrentInsertFunctionalTest, PR #614) was explored and removed: it only reproduces against the H2 test fixture, not production MySQL+HTS. A prod-realistic black-box repro would need the real HTS app or a deployed instance.

🤖 Generated with Claude Code

mkuchenbecker added a commit to mkuchenbecker/openhouse that referenced this pull request May 28, 2026
…eck, guard URI.create

- Pull COMMIT_KEY capture into abortIfWriterBaseDivergedFromCatalog so the
  read+check stay colocated (addresses #3319525189). Signature now takes
  Map<String, String> properties; caller's only job is to invoke it before
  failIfRetryUpdate strips COMMIT_KEY.
- Guard isSameMetadataPath against malformed URI inputs that would throw
  IllegalArgumentException out of doCommit as an unclassified runtime error
  (addresses #3319521822). Fall back to literal-inequality so divergence is
  reported as CommitFailedException via the caller, which Iceberg's retry
  loop classifies as retriable.
@mkuchenbecker mkuchenbecker changed the title test(internalcatalog): repro silent snapshot drop on stale-base commit [DRAFT] fix(internalcatalog): catalog-level CAS for stale-base commits + black-box repro [PARTIAL; PR #614 case still failing] May 28, 2026
@mkuchenbecker mkuchenbecker changed the title fix(internalcatalog): catalog-level CAS for stale-base commits + black-box repro [PARTIAL; PR #614 case still failing] fix(catalog): @Version on HouseTable + doCommit stale-base CAS — closes PR #614's race May 28, 2026
mkuchenbecker added a commit to mkuchenbecker/openhouse that referenced this pull request May 28, 2026
…elper

Single rewrite of abortIfWriterBaseDivergedFromCatalog addressing all
five inline comments:

- Inlined writerClaimsNoExistingBase (trivial check; #3320098729).
- Inlined isSameMetadataPath (#3320106771).
- Switched URI.create -> org.apache.hadoop.fs.Path(...).toUri().getPath();
  Hadoop Path is lenient and does not require the try/catch wrapper
  guarding malformed inputs (#3320093363).
- Collapsed the two-throw structure (no-existing-base + path-mismatch)
  into a single positive check writerBaseMatchesCatalog, throwing on
  negation (#3320108911).
- Stripped HTML formatting from the javadoc (#3320103880).

Net: one helper, one check, one throw. Same semantics — racing-CREATE
(writer claims null/INITIAL against persisted catalog) and racing-UPDATE
(path mismatch) are both caught by the single negation. The error message
collapses to the path-mismatch flavor; the test
testDoCommitAbortsOnStaleClaimedBase still asserts on "Cannot commit"
which is preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mkuchenbecker mkuchenbecker changed the title fix(catalog): @Version on HouseTable + doCommit stale-base CAS — closes PR #614's race fix(catalog): abort doCommit on stale-base divergence May 28, 2026
mkuchenbecker added a commit to mkuchenbecker/openhouse that referenced this pull request May 28, 2026
…view)

Per inline review on line 611: base non-null with claimed base null/INITIAL
is its own error class — make it explicit instead of folding into the
combined check. Distinct error message names the case.
kamanavishnu added a commit to kamanavishnu/openhouse that referenced this pull request May 29, 2026
…ase lost update

Reconstructs the BaseTransaction.applyUpdates silent-rebase inputs at the doCommit
boundary (base=T_Y containing the racing snapshot, but COMMIT_KEY=T_X) and asserts
doCommit aborts instead of silently dropping the racing snapshot via the subtractive
merge; also asserts save() is never reached.

Fails on main (racing snapshot silently dropped); passes once the doCommit
stale-base CAS (OSS PR linkedin#612) lands.
mkuchenbecker added a commit to mkuchenbecker/openhouse that referenced this pull request May 29, 2026
…base-repro-test

test(internalcatalog): assert linkedin#612 prevents the stale-base lost update (no silent snapshot drop)
@mkuchenbecker mkuchenbecker marked this pull request as ready for review May 29, 2026 05:12
@mkuchenbecker mkuchenbecker changed the base branch from main to hotfix/0.5.417 May 29, 2026 06:14
@mkuchenbecker mkuchenbecker changed the base branch from hotfix/0.5.417 to main May 29, 2026 06:20
mkuchenbecker added a commit to mkuchenbecker/openhouse that referenced this pull request May 29, 2026
Backport of PR linkedin#612 onto hotfix/0.5.417. Aborts doCommit when the writer's
declared base (COMMIT_KEY) diverges from the catalog's actual persisted base,
preventing the BaseTransaction.applyUpdates silent-rebase lost-update
(incident-12185). Propagates IS_REPLACE_COMMIT_KEY so wholesale replace
commits are exempted from the CAS, and strips it before persisting.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mkuchenbecker mkuchenbecker force-pushed the mkuchenb/bug-repro-snapshot-rollback branch from c49eaec to 869cff8 Compare May 29, 2026 06:29
@mkuchenbecker mkuchenbecker changed the base branch from main to hotfix/0.5.417 May 29, 2026 06:29
mkuchenbecker added a commit to mkuchenbecker/openhouse that referenced this pull request May 29, 2026
Backport of PR linkedin#612 onto hotfix/0.5.417. Aborts doCommit when the writer's
declared base (COMMIT_KEY) diverges from the catalog's actual persisted base,
preventing the BaseTransaction.applyUpdates silent-rebase lost-update
(incident-12185), where applyUpdates re-stamps the writer's original non-null
COMMIT_KEY on top of a concurrently-advanced base.

Commits that leave COMMIT_KEY unset (wholesale replace/create) are authoritative
over the snapshot set and are intentionally not defended.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mkuchenbecker mkuchenbecker force-pushed the mkuchenb/bug-repro-snapshot-rollback branch from 869cff8 to 8cca103 Compare May 29, 2026 06:51
@mkuchenbecker mkuchenbecker changed the base branch from hotfix/0.5.417 to main May 29, 2026 07:25
Aborts doCommit when the writer's declared base (COMMIT_KEY) diverges from the
catalog's actual persisted base, preventing the BaseTransaction.applyUpdates
silent-rebase lost-update (incident-12185), where applyUpdates re-stamps the
writer's original non-null COMMIT_KEY on top of a concurrently-advanced base.

Commits that leave COMMIT_KEY unset (wholesale replace/create) are authoritative
over the snapshot set and are intentionally not defended.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mkuchenbecker mkuchenbecker force-pushed the mkuchenb/bug-repro-snapshot-rollback branch from f8e0c8f to ebd7230 Compare May 29, 2026 07:28
Copy link
Copy Markdown
Collaborator

@cbb330 cbb330 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works as a mitigation for testing purposes though it'll need further analysis before we can deploy safely.

Short-term, I think it's worth taking a closer look at this code: https://github.com/linkedin/openhouse/pull/612/changes#diff-526cc78ff45a139113a29951f4db0a28242b9a7f3bc9caef9ceec26b04c5aeb8R331

it may point us toward a more targeted, reliable, and race-condition-free solution.

@mkuchenbecker mkuchenbecker merged commit 9407819 into linkedin:main May 29, 2026
1 check passed
mkuchenbecker added a commit to mkuchenbecker/openhouse that referenced this pull request May 29, 2026
Validated by running pre-fix (d4fc9fe): both tests FAIL (racing snapshot silently
dropped); post-fix (linkedin#612): both PASS.

Three layered guards must all be bypassed for the lost update:
  1. HTS optimistic-version CAS  -> bypassed by applyUpdates' silent rebase (held txn)
  2. failIfRetryUpdate per-JVM cache -> bypassed cross-replica (cleared in the test)
  3. Iceberg snapshot sequence-number validation -> only bypassed by a SUBTRACTIVE
     stale commit (adds no new snapshot). A stale writer adding its own snapshot on a
     multi-snapshot base is rejected by guard #3, so it cannot lose data -- which is
     why incident-12185 was an expire/optimizer drop, not an insert race.

Tests (both subtractive): expire racing a data insert (prior history); two concurrent
first-inserts on a fresh table.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants