time: auto-advance to slot start, not exact deadline #19
Merged
antsujay merged 1 commit on Jun 18, 2026
The nanosecond-wheel commit (dfdc361) changed the paused-clock auto-advance target from the next timer's wheel-slot start to its exact deadline (wheel.next_when()), so process_at_time would fire it in one park pass instead of one pass per wheel level. That collapses away an interleave the IO path depends on.

park_thread_timeout's auto-advance branch first does a zero-timeout IO poll, then advances if !did_wake(). Tasks woken by IO readiness during that poll go through the local-queue scheduling path (park_internal runs the driver inside Context::enter, so the scheduler core is in the thread-local for the duration), and the local-queue path does not call driver.unpark(), so did_wake stays false. With the slot-start target, an upper-level timer cascades one level without firing and control returns to the run loop, which runs the IO-woken task; with the exact target, the timer fires in the same pass. Under start_paused, a timeout(D, io_op) whose io_op is satisfied by a sibling task on the same runtime therefore always loses to its own timeout.

Restore the slot-start target (the duration the caller already computed). Level-0 slots are 1ns under test-util, so the final cascade still lands on the exact deadline and the precision guarantees are unchanged; the cost is up to NUM_LEVELS park iterations per far timer, each a non-blocking single-thread poll. next_when() remains in use for quiesce resolution and next_timer reporting, where the drain-park hook has already re-checked for runnable work.

New regression test time_paused_io_race pins the sibling-IO case.
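The interleave described in this commit message can be illustrated with a small std-only model. This is a sketch under stated assumptions, not the runtime's code: `Target`, `Outcome`, and `simulate` are hypothetical names, the timer is reduced to the wheel level it currently sits in, and `did_wake` is hard-wired to false to mirror the local-queue wake path described above.

```rust
// Toy model of the paused-clock park loop. All names are hypothetical;
// nothing here is taken from the Tokio source.

#[derive(Clone, Copy)]
enum Target {
    SlotStart,     // advance to the start of the timer's current wheel slot
    ExactDeadline, // advance straight to the timer's deadline
}

#[derive(Debug, PartialEq)]
enum Outcome {
    IoCompleted,
    TimedOut,
}

/// Simulates one `timeout(D, io_op)` whose IO was already made ready by a
/// sibling task. `timer_level` is the wheel level holding the timeout's timer.
fn simulate(target: Target, timer_level: u32) -> Outcome {
    let mut level = timer_level;
    let mut io_event_pending = true;
    let mut io_task_queued = false;

    loop {
        // Run loop: tasks in the local queue run before the next park.
        if io_task_queued {
            return Outcome::IoCompleted;
        }

        // Park pass, step 1: zero-timeout IO poll. The wake goes to the
        // local queue, so the modeled `did_wake` flag is never set.
        let did_wake = false;
        if io_event_pending {
            io_event_pending = false;
            io_task_queued = true;
        }

        // Park pass, step 2: auto-advance the paused clock.
        if !did_wake {
            match target {
                // Exact deadline: the timer fires in this same pass.
                Target::ExactDeadline => return Outcome::TimedOut,
                // Level 0: slot start equals the deadline, so it fires too.
                Target::SlotStart if level == 0 => return Outcome::TimedOut,
                // Upper level: cascade one level down without firing.
                Target::SlotStart => level -= 1,
            }
        }
    }
}

fn main() {
    println!("slot start: {:?}", simulate(Target::SlotStart, 5));
    println!("exact:      {:?}", simulate(Target::ExactDeadline, 5));
}
```

In this model the slot-start target lets the queued task run after the first cascade, while the exact target reaches the timeout before the run loop regains control.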
antsujay approved these changes on Jun 18, 2026
Under `start_paused = true`, a `timeout(dur, io_op)` whose `io_op` is satisfied by a sibling task on the same runtime always lost to its own timeout: paused-clock auto-advance jumped the clock straight to the timeout's exact deadline before the IO-woken sibling ever got a chance to run.

Before (with the nanosecond-wheel change, dfdc361): the new `paused_timeout_yields_to_same_runtime_io` test fails; the clock lands on the 60s deadline even though a peer task on the same runtime is ready to satisfy the IO.

After: the IO completes and the clock stays below the deadline.

Why: the auto-advance veto (`did_wake`) does not observe IO readiness delivered during the zero-timeout poll. Those wakes take the local-queue scheduling path (the scheduler core is in-context for the duration of the park) and never call `driver.unpark()`, so `did_wake` stays false. dfdc361 made auto-advance target the next timer's exact deadline (`wheel.next_when()`) so it would fire in a single park pass, which removed the interleave the IO path depends on.

This change restores the slot-start target (the duration the caller already computed), matching upstream behavior. An upper-level timer now cascades one level per park pass and control returns to the run loop in between, so IO-woken tasks run before the clock can reach the deadline. Precision is unchanged: level-0 slots are 1ns under `test-util`, so the final cascade still lands exactly on the deadline. The cost is up to NUM_LEVELS non-blocking park iterations per timer, confined to the paused current_thread auto-advance path. `next_when()` remains in use for quiesce resolution and `next_timer` reporting.

Also bumps the version to `1.52.10003+anthropic` per the fork's release convention.

Test plan: the new regression test `time_paused_io_race` fails on the parent commit and passes with this change; the existing time, pause, quiesce, sleep, timeout, and interval integration suites pass locally with `--features full,test-util`.
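The precision and cost claims can be checked with a rough, std-only model of a hierarchical wheel. It is a sketch only: it assumes six levels of 64 slots with a 1ns level-0 slot, a clock starting at zero, and a deadline that fits in the wheel's 36-bit range. `slot_start` and `advance_steps` are hypothetical helpers, not functions from the runtime.

```rust
// Toy wheel geometry (assumed): 6 levels, 64 slots per level, 1ns base tick.
const NUM_LEVELS: u32 = 6;
const BITS_PER_LEVEL: u32 = 6;

/// Start of the wheel slot that currently holds `deadline`, for `now < deadline`.
fn slot_start(now: u64, deadline: u64) -> u64 {
    // The highest bit in which `now` and `deadline` differ selects the level.
    let masked = (now ^ deadline) | ((1u64 << BITS_PER_LEVEL) - 1);
    let significant = 63 - masked.leading_zeros();
    let level = (significant / BITS_PER_LEVEL).min(NUM_LEVELS - 1);
    // A slot at level L spans 64^L ticks; round the deadline down to it.
    let slot_range = 1u64 << (level * BITS_PER_LEVEL);
    deadline & !(slot_range - 1)
}

/// Clock values visited when every park pass advances to the slot start.
fn advance_steps(deadline: u64) -> Vec<u64> {
    let mut now = 0u64;
    let mut steps = Vec::new();
    while now < deadline {
        now = slot_start(now, deadline);
        steps.push(now);
    }
    steps
}

fn main() {
    // 60 s expressed in nanoseconds, as in the regression scenario.
    let steps = advance_steps(60_000_000_000);
    println!("{} park passes: {:?}", steps.len(), steps);
}
```

Under these assumptions the last step always equals the deadline, and the number of passes equals the number of non-zero base-64 digits of the deadline, so it never exceeds NUM_LEVELS.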