Ephemerons: wait on all domains to be done marking before asking for another round#14774
Open
gasche wants to merge 1 commit into
eutro
approved these changes
Apr 29, 2026
Contributor
eutro
left a comment
The CAS loop looks sound to me, and I see why it is necessary instead of just checking caml_atomic_counter_decr(&num_domains_to_mark) == 0 (although we miss a debug assert that domains_still_marking > 0).
Comment on lines
350
to
351
Contributor
It sounds like this comment is no longer true?
Member
Author
Thanks! I fixed the comment.
b41950f to
560029c
Compare
560029c to
c21e04c
Compare
… a new ephemeron round
c21e04c to
cefbed6
Compare
Ephemeron marking must be done after normal marking, and may in turn reveal more normal marking work, which in turn requires a new round of ephemeron marking (marking all "TODO" ephemerons again to check if their keys have been marked), etc., until a fixpoint is reached, that is, until a round reveals no further normal-marking work.
On a single-domain system there is an obvious good strategy: wait until all normal-marking work is done, and then do ephemeron marking, and repeat. On multi-domain systems we only know when we (the current domain) are done marking. The trunk code asks for a new ephemeron round (on all domains) whenever a domain is done marking; this can lead to useless repetition of ephemeron-marking work: if two domains finish marking one after the other, each will ask all domains to do a round of ephemeron-marking.
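The single-domain strategy can be illustrated with a toy model. This is only a sketch, not the runtime's code: the names `toy_ephe`, `toy_ephe_round` and `toy_mark_to_fixpoint` are made up for the example, objects are plain array indices, and normal marking is reduced to setting a flag (a real collector would also trace the newly marked object before the next round).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy ephemeron (hypothetical type): data is kept alive only if key is
   marked. Objects are indices into a marked[] array. */
typedef struct {
  int key;
  int data;
  bool done; /* false means the ephemeron is still "TODO" */
} toy_ephe;

/* One ephemeron round: visit every "TODO" ephemeron and, if its key is
   marked, mark its data. Returns true if the round marked a previously
   unmarked object, i.e. if it revealed new normal-marking work. */
static bool toy_ephe_round(bool *marked, toy_ephe *ephes, size_t n)
{
  bool revealed = false;
  for (size_t i = 0; i < n; i++) {
    if (!ephes[i].done && marked[ephes[i].key]) {
      ephes[i].done = true;
      if (!marked[ephes[i].data]) {
        /* Stand-in for normal marking of the data object. */
        marked[ephes[i].data] = true;
        revealed = true;
      }
    }
  }
  return revealed;
}

/* Repeat ephemeron rounds until one reveals no new work (the fixpoint).
   Returns the number of rounds performed, including the final idle one. */
static int toy_mark_to_fixpoint(bool *marked, toy_ephe *ephes, size_t n)
{
  int rounds = 0;
  bool revealed;
  do {
    revealed = toy_ephe_round(marked, ephes, n);
    rounds++;
  } while (revealed);
  return rounds;
}
```

With a chain of ephemerons visited in an unfavourable order, each round unlocks only one more link, which is why every superfluous round request on a multi-domain system costs a full pass over the "TODO" ephemerons.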
This PR changes the end-of-normal-marking logic to ensure that we ask for a new round of ephemeron marking when all domains are done marking, rather than whenever the current domain is done marking.
@OlivierNicole and @damiendoligez reviewed an earlier version of this branch yesterday and we found a concurrency bug together. The new version uses a CAS loop to avoid concurrency issues, as we discussed.
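As an illustration of the general shape of such a loop, here is a minimal sketch using C11 atomics. It does not reproduce the PR's diff: `toy_domains_still_marking`, `toy_ephe_rounds_requested`, `toy_start_marking` and `toy_domain_done_marking` are hypothetical names, and the OCaml runtime uses its own atomic helpers rather than `<stdatomic.h>` calls written this way. The sketch only shows how a compare-and-swap loop lets exactly one domain, the last to finish marking, observe the transition to zero and request the next ephemeron round.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter: domains that still have normal-marking work. */
static atomic_int toy_domains_still_marking;

/* Hypothetical stand-in for "ask all domains for a new ephemeron round". */
static atomic_int toy_ephe_rounds_requested;

static void toy_start_marking(int num_domains)
{
  atomic_store(&toy_domains_still_marking, num_domains);
}

/* Called by a domain that has run out of normal-marking work.
   Returns true if the caller was the last domain still marking, in which
   case it has requested a new ephemeron round. */
static bool toy_domain_done_marking(void)
{
  int seen = atomic_load(&toy_domains_still_marking);
  for (;;) {
    /* Debug check: a domain cannot finish if none was counted as marking. */
    assert(seen > 0);
    if (atomic_compare_exchange_weak(&toy_domains_still_marking,
                                     &seen, seen - 1)) {
      if (seen == 1) {
        /* We moved the counter from 1 to 0: all domains are done. */
        atomic_fetch_add(&toy_ephe_rounds_requested, 1);
        return true;
      }
      return false;
    }
    /* CAS failed: seen now holds the current value, so retry with it. */
  }
}
```

A natural place for further checks or decisions is inside the loop, between reading `seen` and attempting the swap, since the swap only succeeds if the counter has not changed in the meantime.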