Skip to content

fix(datatrak): RN-1868: Fix entity temporarily deleted#6747

Open
chris-bes wants to merge 11 commits into
devfrom
rn-1868-entity-temporarily-deleted
Open

fix(datatrak): RN-1868: Fix entity temporarily deleted#6747
chris-bes wants to merge 11 commits into
devfrom
rn-1868-entity-temporarily-deleted

Conversation

@chris-bes

@chris-bes chris-bes commented Apr 30, 2026

Copy link
Copy Markdown
Contributor

Changes:

  • Fix entity temporarily deleted

🦸 Review Hero

  • Run Review Hero
  • Auto-fix review suggestions
  • Auto-fix CI failures

);
if (!isNullish(visibleUntilTick)) {
return Number.parseInt(visibleUntilTick, 10);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Bugs & Correctness] critical

getSyncLookupVisibleUntilTick() returns Number.MAX_SAFE_INTEGER when LOOKUP_VISIBLE_UNTIL_TICK is not yet set. This value flows into pull_until (line 397) which is returned to the client as pullUntil (fetchPullMetadata) and then stored as the client's LAST_SUCCESSFUL_SYNC_PULL cursor. On the next sync, that client passes since=Number.MAX_SAFE_INTEGER; no real sync tick will ever exceed that, so the client never receives another record — sync is permanently broken for that device. The fallback comment says this 'can only happen before updateLookupTable has written the new watermark', but it ignores the upgrade scenario: an existing deployment already has LOOKUP_UP_TO_TICK set (from old code), so prepareSession passes the guard, but LOOKUP_VISIBLE_UNTIL_TICK is absent until the first post-upgrade updateLookupTable run. Any sync that races that window is corrupted. Fix: fall back to the LOOKUP_UP_TO_TICK value instead of Number.MAX_SAFE_INTEGER — it represents the same semantic ('last coherent lookup state') and is always a valid cursor bound.

@@ -559,6 +588,15 @@ export class CentralSyncManager {
SyncFact.LOOKUP_UP_TO_TICK,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Bugs & Correctness] suggestion

In the incremental update path, LOOKUP_UP_TO_TICK is written at tick T1 (first transaction) but LOOKUP_VISIBLE_UNTIL_TICK is written at tick T2 (second transaction, after a fresh tickTockGlobalClock call, so T2 = T1+1). Any record created in the T1→T2 window lands in sync_lookup only on the next updateLookupTable pass, but a delete trigger can write its tombstone into sync_lookup at tick ≤ T2, making the tombstone visible at the watermark while the corresponding create row is absent — exactly the race this PR fixes. The window is extremely narrow (milliseconds between two transactions) but structurally the same class of bug. Fix: derive LOOKUP_VISIBLE_UNTIL_TICK from T1 (reuse the tick stored into LOOKUP_UP_TO_TICK) rather than issuing a new tickTockGlobalClock in the second transaction, so the watermark matches the exact tick through which sync_lookup is coherent.

database: TupaiaDatabase,
models: DatabaseModel[],
since: number,
until: number,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Design & Architecture] suggestion

until: number uses Number.MAX_SAFE_INTEGER as a sentinel meaning 'no upper bound'. This leaks the sentinel into every call site — three test call sites and two production call sites all have to pass Number.MAX_SAFE_INTEGER explicitly to opt out of bounding, and the SQL always emits AND updated_at_sync_tick <= 9007199254740991 even when there is no real bound. An until?: number optional parameter with a conditional ${until != null ? 'AND updated_at_sync_tick <= ?' : ''} SQL fragment (and only conditionally pushing the value into the params array) would make the 'unbounded' intent explicit at every call site and remove the no-op SQL clause.

SyncFact.LOOKUP_UP_TO_TICK,
currentTick.toString(),
);
if (isInitialBuildOfLookupTable) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Design & Architecture] suggestion

LOOKUP_VISIBLE_UNTIL_TICK is written in two structurally separate places with different reasoning: once inside the initial-build transaction (lines 595-598) and once inside the incremental pending-records transaction (lines 633-636). The split is correct but fragile — a future maintainer adding a third updateLookupTable code path (e.g. a partial rebuild) could easily forget to advance the watermark. The invariant 'advance LOOKUP_VISIBLE_UNTIL_TICK to currentTick at the end of every transaction that makes sync_lookup coherent up to that tick' is not enforced structurally. Consider a helper that wraps both LOOKUP_UP_TO_TICK and LOOKUP_VISIBLE_UNTIL_TICK writes together, so they can only be updated as a pair.

@review-hero

review-hero Bot commented Apr 30, 2026

Copy link
Copy Markdown

🦸 Review Hero Summary
12 agents reviewed this PR | 1 critical | 3 suggestions | 0 nitpicks | Filtering: consensus 3 voters, 4 below threshold, 1 suppressed

Below consensus threshold (4 unique issues not confirmed by majority)
Location Agent Severity Comment
packages/sync-server/src/__tests__/CentralSyncManager.syncLookup.test.ts:456 Bugs & Correctness suggestion The new test uses since = visibleUntilTick - 1 to create a one-tick window, relying on the assumption that visibleRelation's sync_lookup tick equals visibleUntilTick exactly (assigned by the T2 upd...
packages/sync-server/src/sync/CentralSyncManager.ts:105 Performance suggestion getSyncLookupVisibleUntilTick() performs a DB read on every prepareSession() call, but the LOOKUP_VISIBLE_UNTIL_TICK fact only changes at the end of each updateLookupTable() pass. Under concurrent ...
packages/sync-server/src/sync/CentralSyncManager.ts:263 Tupaia Conventions nitpick The comment '// if the sync_lookup table is enabled, don't allow syncs until it has finished its first update run' was removed from the isNullish(syncLookupUpToTick) guard. That comment explained a...
packages/sync-server/src/sync/CentralSyncManager.ts:393 Bugs & Correctness suggestion syncLookupVisibleUntilTick is read outside the repeatable-read snapshot transaction, then passed in as both pull_until and the snapshot upper bound. If updateLookupTable advances the watermark betw...
Local fix prompt (copy to your coding agent)

Fix these issues identified on the pull request. One commit per issue fixed.


packages/sync-server/src/sync/CentralSyncManager.ts:110: getSyncLookupVisibleUntilTick() returns Number.MAX_SAFE_INTEGER when LOOKUP_VISIBLE_UNTIL_TICK is not yet set. This value flows into pull_until (line 397) which is returned to the client as pullUntil (fetchPullMetadata) and then stored as the client's LAST_SUCCESSFUL_SYNC_PULL cursor. On the next sync, that client passes since=Number.MAX_SAFE_INTEGER; no real sync tick will ever exceed that, so the client never receives another record — sync is permanently broken for that device. The fallback comment says this 'can only happen before updateLookupTable has written the new watermark', but it ignores the upgrade scenario: an existing deployment already has LOOKUP_UP_TO_TICK set (from old code), so prepareSession passes the guard, but LOOKUP_VISIBLE_UNTIL_TICK is absent until the first post-upgrade updateLookupTable run. Any sync that races that window is corrupted. Fix: fall back to the LOOKUP_UP_TO_TICK value instead of Number.MAX_SAFE_INTEGER — it represents the same semantic ('last coherent lookup state') and is always a valid cursor bound.


packages/sync-server/src/sync/CentralSyncManager.ts:588: In the incremental update path, LOOKUP_UP_TO_TICK is written at tick T1 (first transaction) but LOOKUP_VISIBLE_UNTIL_TICK is written at tick T2 (second transaction, after a fresh tickTockGlobalClock call, so T2 = T1+1). Any record created in the T1→T2 window lands in sync_lookup only on the next updateLookupTable pass, but a delete trigger can write its tombstone into sync_lookup at tick ≤ T2, making the tombstone visible at the watermark while the corresponding create row is absent — exactly the race this PR fixes. The window is extremely narrow (milliseconds between two transactions) but structurally the same class of bug. Fix: derive LOOKUP_VISIBLE_UNTIL_TICK from T1 (reuse the tick stored into LOOKUP_UP_TO_TICK) rather than issuing a new tickTockGlobalClock in the second transaction, so the watermark matches the exact tick through which sync_lookup is coherent.


packages/sync-server/src/sync/snapshotOutgoingChanges.ts:17: until: number uses Number.MAX_SAFE_INTEGER as a sentinel meaning 'no upper bound'. This leaks the sentinel into every call site — three test call sites and two production call sites all have to pass Number.MAX_SAFE_INTEGER explicitly to opt out of bounding, and the SQL always emits AND updated_at_sync_tick &lt;= 9007199254740991 even when there is no real bound. An until?: number optional parameter with a conditional ${until != null ? 'AND updated_at_sync_tick &lt;= ?' : ''} SQL fragment (and only conditionally pushing the value into the params array) would make the 'unbounded' intent explicit at every call site and remove the no-op SQL clause.


packages/sync-server/src/sync/CentralSyncManager.ts:591: LOOKUP_VISIBLE_UNTIL_TICK is written in two structurally separate places with different reasoning: once inside the initial-build transaction (lines 595-598) and once inside the incremental pending-records transaction (lines 633-636). The split is correct but fragile — a future maintainer adding a third updateLookupTable code path (e.g. a partial rebuild) could easily forget to advance the watermark. The invariant 'advance LOOKUP_VISIBLE_UNTIL_TICK to currentTick at the end of every transaction that makes sync_lookup coherent up to that tick' is not enforced structurally. Consider a helper that wraps both LOOKUP_UP_TO_TICK and LOOKUP_VISIBLE_UNTIL_TICK writes together, so they can only be updated as a pair.

Base automatically changed from rn-1680-add-sync-tests to dev May 1, 2026 02:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant