Range-complete acked tasks on queue Stop #10326
Open
prathyushpv wants to merge 1 commit into
queueBase.Stop now does a best-effort doCheckpoint(true) that range-deletes already-acked task rows. Without this, the next shard owner reloads stale task rows from the gap between the last periodic checkpoint and shutdown and re-dispatches them; this is most visible on the transfer queue. The shard state write is intentionally skipped on the shutdown path: by the time engine.Stop runs from FinishStop, the shard is already Stopped and updateShardInfo would reject it. The next owner relies on the rangeCompleteTasks deletion instead: its pagination over the stale reader-scope range returns empty for deleted rows, and the slice compacts on the first post-startup checkpoint.
What changed?
queueBase.Stop now runs a best-effort doCheckpoint(true) that range-deletes already-acked task rows via RangeCompleteHistoryTasks. The shard-state write (UpdateShard) is intentionally skipped: by the time engine.Stop runs from FinishStop, the shard is already Stopped and updateShardInfo would reject it anyway.
Why?
Without this, the next shard owner reloads the task rows that were acked between the last periodic checkpoint and shutdown, and re-dispatches them. This is most visible on the transfer queue (high throughput, exact-key deletion). Deleting the rows is enough on its own: the next owner's pagination returns empty for the stale reader-scope range, the slice compacts on the first post-startup checkpoint, and no tasks are reprocessed.
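To make the reasoning above concrete, here is a minimal toy model of the shutdown path. It is only a sketch and not the code in this PR: taskStore, shardState, queue, checkpoint, and reloadAfterStop are hypothetical names, the task table is an in-memory map, and the boolean argument is assumed to mean "skip the shard-state write".

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// taskStore is a hypothetical in-memory stand-in for the persisted task table.
type taskStore struct {
	rows map[int64]bool // task key -> row exists
}

// rangeComplete deletes every row whose key is in [min, max).
func (s *taskStore) rangeComplete(min, max int64) {
	for k := range s.rows {
		if k >= min && k < max {
			delete(s.rows, k)
		}
	}
}

// load returns the surviving keys in [min, max), sorted ascending.
func (s *taskStore) load(min, max int64) []int64 {
	var keys []int64
	for k := range s.rows {
		if k >= min && k < max {
			keys = append(keys, k)
		}
	}
	sort.Slice(keys, func(i, j int) bool { return keys[i] < keys[j] })
	return keys
}

// shardState models the persisted ack level that the next owner reads.
type shardState struct {
	ackLevel int64
	stopped  bool
}

// update rejects writes once the shard is stopped (assumed behavior).
func (s *shardState) update(ackLevel int64) error {
	if s.stopped {
		return errors.New("shard already stopped")
	}
	s.ackLevel = ackLevel
	return nil
}

// queue is one owner's view: acked is an exclusive in-memory ack level.
type queue struct {
	store *taskStore
	shard *shardState
	acked int64
}

// checkpoint range-deletes acked rows, then optionally persists the ack level.
func (q *queue) checkpoint(skipShardUpdate bool) error {
	q.store.rangeComplete(q.shard.ackLevel, q.acked)
	if skipShardUpdate {
		return nil
	}
	return q.shard.update(q.acked)
}

// reloadAfterStop returns what the next owner would load after a shutdown.
func reloadAfterStop(checkpointOnStop bool) []int64 {
	store := &taskStore{rows: map[int64]bool{1: true, 2: true, 3: true, 4: true, 5: true, 6: true}}
	shard := &shardState{}
	q := &queue{store: store, shard: shard}

	q.acked = 3 // tasks 1 and 2 are acked
	_ = q.checkpoint(false) // periodic checkpoint: delete rows, persist ack level 3

	q.acked = 5          // tasks 3 and 4 are acked after the last periodic checkpoint
	shard.stopped = true // the shard is stopped before the queue's Stop runs
	if checkpointOnStop {
		_ = q.checkpoint(true) // best effort: delete rows, skip the shard write
	}
	// The next owner starts from the stale persisted ack level.
	return store.load(shard.ackLevel, 7)
}

func main() {
	fmt.Println(reloadAfterStop(false)) // [3 4 5 6]
	fmt.Println(reloadAfterStop(true))  // [5 6]
}
```

In this model the persisted ack level stays at 3 in both runs. Only the row deletion differs, which is why the next owner's read over the stale range comes back without tasks 3 and 4.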
How did you test it?