[fix][broker] Prevent backlog quota cursor updates during topic close #25914
Denovo1998 wants to merge 2 commits into
Thanks for the follow-up. I think adding the second check before

One thing I think we should decide is whether this PR is intended to be a best-effort mitigation or a strict guarantee. With the current code, there is still a small window:

So this change can reduce the race window a lot, but it cannot completely guarantee that cursor mutation never happens during topic close/delete. If best-effort mitigation is enough for this issue, I think the current direction is reasonable. If the goal is to strictly prevent cursor mutation during teardown, then we may need stronger coordination with the topic close/delete path, instead of only checking
            break;
        }
        beforeBacklogQuotaCursorMutation(persistentTopic);
        if (shouldStopEvictionOnTopicClose(persistentTopic)) {
This check is useful because it is close to the actual cursor mutation point.
The remaining subtle case is:
- this check returns false
- topic close/delete starts right after that
- eviction still proceeds to skipEntries/markDelete
So I think we should clarify whether this is intended as a best-effort race reduction, or whether we need stronger coordination to strictly avoid cursor mutation once topic close/delete starts.
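To make the distinction concrete, here is a minimal, self-contained Java sketch of the two options. Everything in it is hypothetical and exists only for illustration: `Topic`, `evictBestEffort`, `evictCoordinated`, `tryStartClose`, and the read/write lock are not Pulsar classes or methods, and this is not necessarily the coordination the PR ends up using. The hook argument plays the role of a test hook that runs in the window between the state check and the cursor mutation.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch; none of these types are Pulsar classes.
final class TeardownCoordinationSketch {

    static final class Topic {
        private final ReentrantReadWriteLock stateLock = new ReentrantReadWriteLock();
        private volatile boolean closing;
        final AtomicInteger cursorMutations = new AtomicInteger();
        final AtomicInteger mutationsWhileClosing = new AtomicInteger();

        boolean isClosingOrDeleting() {
            return closing;
        }

        // Stand-in for skipEntries/markDelete; records mutations that happen after close began.
        private void mutateCursor() {
            if (closing) {
                mutationsWhileClosing.incrementAndGet();
            }
            cursorMutations.incrementAndGet();
        }

        // Best effort: check, then act. Nothing stops close from starting inside the window.
        boolean evictBestEffort(Runnable windowHook) {
            if (isClosingOrDeleting()) {
                return false;
            }
            windowHook.run(); // close/delete may begin here
            mutateCursor();
            return true;
        }

        // Strict: the check and the mutation share a read lock that close has to wait for.
        boolean evictCoordinated(Runnable windowHook) {
            stateLock.readLock().lock();
            try {
                if (closing) {
                    return false;
                }
                windowHook.run(); // close cannot begin while the read lock is held
                mutateCursor();
                return true;
            } finally {
                stateLock.readLock().unlock();
            }
        }

        // Starts close only if no eviction currently holds the read lock.
        // A real close path on another thread would block or retry instead of giving up.
        boolean tryStartClose() {
            if (!stateLock.writeLock().tryLock()) {
                return false;
            }
            try {
                closing = true;
                return true;
            } finally {
                stateLock.writeLock().unlock();
            }
        }
    }

    public static void main(String[] args) {
        // The hook runs on the evicting thread so the interleaving is deterministic.
        Topic bestEffort = new Topic();
        bestEffort.evictBestEffort(() -> bestEffort.tryStartClose());
        System.out.println("best effort, mutations while closing: "
                + bestEffort.mutationsWhileClosing.get());

        Topic coordinated = new Topic();
        coordinated.evictCoordinated(() -> coordinated.tryStartClose());
        System.out.println("coordinated, mutations while closing: "
                + coordinated.mutationsWhileClosing.get());
    }
}
```

In the coordinated variant the close attempt inside the window fails because `ReentrantReadWriteLock` does not grant the write lock while any read lock is held, so the mutation completes before close can begin, and any eviction that starts after close observes the flag under the lock. The cost is that close has to wait for in-flight evictions, which is the trade-off to weigh against a best-effort check.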
Thanks, that is a good point. The previous version was best-effort only. I updated the code to use the existing
Follow-up to #25684
Motivation
BacklogQuotaManager already skips backlog quota handling when a topic is fenced or closing/deleting. However, checking the topic state only before entering the eviction path is not enough: a topic can begin close/delete after backlog quota eviction has started but before eviction mutates the slowest cursor.
This can let backlog quota eviction call skipEntries or markDelete while the topic teardown path is starting, which is the race this PR is intended to harden.
Modifications
- PersistentTopic.isClosingOrDeleting().
- ManagedCursor#skipEntries.
- ManagedCursor#markDelete.
- markDeletePositionMoveForward() inside the guarded mutation.
Verifying this change
This change added tests and can be verified as follows:
- BacklogQuotaManagerTest.testSizeBacklogEvictionRaceWithTopicCloseDoesNotSkipEntries
- BacklogQuotaManagerTest.testTimeBacklogEvictionRaceWithTopicCloseDoesNotMarkDelete
Verified locally with:
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes