time: auto-advance to slot start, not exact deadline #19

Merged
antsujay merged 1 commit into anthropics:anthropic-1.52.3 from bobbyp-ant:bobbyp/auto-advance-slot-start
Jun 18, 2026
@bobbyp-ant

Under start_paused = true, a timeout(dur, io_op) whose io_op is satisfied by a sibling task on the same runtime always lost to its own timeout: paused-clock auto-advance jumped the clock straight to the timeout's exact deadline before the IO-woken sibling ever got a chance to run.

Before (with the nanosecond-wheel change, dfdc361): the new paused_timeout_yields_to_same_runtime_io test fails — the clock lands on the 60s deadline even though a peer task on the same runtime is ready to satisfy the IO.
After: the IO completes and the clock stays below the deadline.

Why: the auto-advance veto (did_wake) does not observe IO readiness delivered during the zero-timeout poll. Those wakes take the local-queue scheduling path (the scheduler core is in-context for the duration of the park) and never call driver.unpark(), so did_wake stays false. dfdc361 made auto-advance target the next timer's exact deadline (wheel.next_when()) so it would fire in a single park pass, which removed the interleave the IO path depends on.
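The interaction described above can be illustrated with a toy, std-only model. This is a sketch under stated assumptions, not the runtime's actual code: `simulate`, `Target`, `Outcome`, the `"sibling"` task label, and the tick values 60 and 48 are all hypothetical, and the model treats the timeout as lost as soon as the timer fires before the IO-woken sibling has run.

```rust
use std::collections::VecDeque;

/// Which instant the paused-clock auto-advance jumps to (hypothetical model).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Target {
    /// Jump straight to the timer's exact deadline.
    ExactDeadline,
    /// Jump only to the start of the wheel slot that holds the timer.
    SlotStart,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    IoCompleted { clock: u64 },
    TimedOut { clock: u64 },
}

/// Simplified model of one `timeout(deadline, io_op)` on a paused clock.
/// `slot_start` stands in for the start of the upper-level slot that holds
/// the timer; it is assumed to be below `deadline`.
fn simulate(target: Target, deadline: u64, slot_start: u64) -> Outcome {
    let mut clock = 0u64;
    let mut local_queue: VecDeque<&'static str> = VecDeque::new();
    let mut io_readiness_pending = true;

    loop {
        // Run loop: run every task that is currently queued.
        while let Some(task) = local_queue.pop_front() {
            if task == "sibling" {
                // The sibling satisfies io_op before the timer has fired.
                return Outcome::IoCompleted { clock };
            }
        }

        // Park pass, step 1: zero-timeout IO poll. The wake goes through the
        // local queue and never calls unpark, so did_wake stays false.
        let did_wake = false;
        if io_readiness_pending {
            io_readiness_pending = false;
            local_queue.push_back("sibling");
        }

        // Park pass, step 2: auto-advance unless the veto fired.
        if !did_wake {
            clock = match target {
                Target::ExactDeadline => deadline,
                Target::SlotStart if clock < slot_start => slot_start,
                Target::SlotStart => deadline,
            };
            if clock >= deadline {
                // The timer fires in this pass, before the sibling ran.
                return Outcome::TimedOut { clock };
            }
        }
    }
}

fn main() {
    for target in [Target::ExactDeadline, Target::SlotStart] {
        println!("{:?} -> {:?}", target, simulate(target, 60, 48));
    }
}
```

With the exact-deadline target the model reports a timeout at tick 60; with the slot-start target the first pass stops at tick 48, the run loop gets control, and the sibling completes the IO.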

This change restores the slot-start target — the duration the caller already computed, matching upstream behavior. An upper-level timer now cascades one level per park pass and control returns to the run loop in between, so IO-woken tasks run before the clock can reach the deadline. Precision is unchanged: level-0 slots are 1ns under test-util, so the final cascade still lands exactly on the deadline. The cost is up to NUM_LEVELS non-blocking park iterations per timer, confined to the paused current_thread auto-advance path. next_when() remains in use for quiesce resolution and next_timer reporting.
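The cascade arithmetic can be sketched as follows. The constants are assumptions chosen for illustration (6 levels of 64 slots with 1ns level-0 slots); the fork's real wheel layout may differ, and `level_for`, `slot_start`, and `cascade` are hypothetical helpers rather than functions from the code base. The sketch only shows that repeatedly advancing to the slot start drops at least one level per pass and ends exactly on the deadline.

```rust
/// Assumed toy layout: 6 levels, 64 slots per level, 1 ns level-0 slots.
const NUM_LEVELS: u32 = 6;
const SLOT_BITS: u32 = 6;

/// Wheel level that holds a timer due at `when`, seen from `elapsed`.
/// The level is chosen by the highest bit in which the two instants differ.
fn level_for(elapsed: u64, when: u64) -> u32 {
    let masked = (elapsed ^ when) | ((1u64 << SLOT_BITS) - 1);
    let significant = 63 - masked.leading_zeros();
    (significant / SLOT_BITS).min(NUM_LEVELS - 1)
}

/// Start of the slot that holds the timer: `when` rounded down to the
/// slot width of its current level.
fn slot_start(elapsed: u64, when: u64) -> u64 {
    let slot_range = 1u64 << (SLOT_BITS * level_for(elapsed, when));
    when & !(slot_range - 1)
}

/// Clock values reached by successive park passes, starting from 0, when
/// each pass advances only to the slot start.
fn cascade(when: u64) -> Vec<u64> {
    let mut elapsed = 0;
    let mut targets = Vec::new();
    while elapsed < when {
        elapsed = slot_start(elapsed, when);
        targets.push(elapsed);
    }
    targets
}

fn main() {
    let deadline_ns = 60_000_000_000; // 60 s in nanoseconds
    for (pass, clock) in cascade(deadline_ns).iter().enumerate() {
        println!("pass {}: clock = {} ns", pass + 1, clock);
    }
}
```

Every intermediate target is strictly below the deadline, which is where the run loop gets a chance to run IO-woken tasks; the final target equals the deadline because a level-0 slot is one tick wide in this model.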

Also bumps the version to 1.52.10003+anthropic per the fork's release convention.

Test plan: the new regression test time_paused_io_race fails on the parent commit and passes with this change; the existing time, pause, quiesce, sleep, timeout, and interval integration suites pass locally with --features full,test-util.

The nanosecond-wheel commit (dfdc361) changed the paused-clock
auto-advance target from the next timer's wheel-slot start to its exact
deadline (wheel.next_when()), so process_at_time would fire it in one
park pass instead of one pass per wheel level. That collapses away an
interleave the IO path depends on.

park_thread_timeout's auto-advance branch first does a zero-timeout IO
poll, then advances if !did_wake(). Tasks woken by IO readiness during
that poll go through the local-queue scheduling path -- park_internal
runs the driver inside Context::enter, so the scheduler core is in the
thread-local for the duration -- and the local-queue path does not call
driver.unpark(), so did_wake stays false. With the slot-start target an
upper-level timer cascades one level without firing and control returns
to the run loop, which runs the IO-woken task; with the exact target
the timer fires in the same pass. Under start_paused, a
timeout(D, io_op) whose io_op is satisfied by a sibling task on the
same runtime therefore always loses to its own timeout.

Restore the slot-start target (the duration the caller already
computed). Level-0 slots are 1ns under test-util, so the final cascade
still lands on the exact deadline and the precision guarantees are
unchanged; the cost is up to NUM_LEVELS park iterations per far timer,
each a non-blocking single-thread poll. next_when() remains in use for
quiesce resolution and next_timer reporting, where the drain-park hook
has already re-checked for runnable work. New regression test
time_paused_io_race pins the sibling-IO case.

@antsujay merged commit 068fe5b into anthropics:anthropic-1.52.3 on Jun 18, 2026
4 checks passed