Severity: low / enhancement — opt-in performance experiments, both per-reactor and orthogonal to the loop structure.
Two ring-level knobs worth exposing behind ServerConfig flags for A/B benchmarking:
1. NAPI busy-polling (kernel 6.9+)
IORING_REGISTER_NAPI enables per-ring NAPI busy-polling: the reactor burns CPU polling the NIC's completion path instead of sleeping until an interrupt arrives, which should cut recv-path p99 latency. It is a natural fit for the dedicated-cores deployment model this runtime already assumes, and pointless (and costly) on shared hosts, hence opt-in.
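For a sense of scale, the registration itself might look like the sketch below. It assumes the ring file descriptor is available as a plain int and calls io_uring_register directly with opcode IORING_REGISTER_NAPI. The helper names (registerNapi, napiArgSize) and the 50 µs budget are hypothetical, and the struct mirrors the original 16-byte layout of the kernel's struct io_uring_napi. Linux only. On a kernel without the opcode the call is expected to fail, which the caller can treat as "feature off".

```go
package main

import (
	"fmt"
	"syscall"
	"unsafe"
)

const (
	sysIoUringRegister = 427 // io_uring_register(2) syscall number
	ioringRegisterNapi = 27  // IORING_REGISTER_NAPI
)

// ioUringNapi mirrors the kernel's struct io_uring_napi (16 bytes).
type ioUringNapi struct {
	BusyPollTo     uint32 // busy-poll budget in microseconds
	PreferBusyPoll uint8  // non-zero: prefer busy poll over interrupts
	Pad            [3]uint8
	Resv           uint64
}

// napiArgSize reports the size of the argument struct passed to the kernel.
func napiArgSize() uintptr { return unsafe.Sizeof(ioUringNapi{}) }

// registerNapi is a hypothetical helper: it enables NAPI busy-polling on the
// ring behind ringFd and returns the errno from the kernel on failure.
func registerNapi(ringFd int, busyPollUs uint32, prefer bool) error {
	arg := ioUringNapi{BusyPollTo: busyPollUs}
	if prefer {
		arg.PreferBusyPoll = 1
	}
	// The kernel requires nr_args == 1 for this opcode.
	_, _, errno := syscall.Syscall6(sysIoUringRegister, uintptr(ringFd),
		ioringRegisterNapi, uintptr(unsafe.Pointer(&arg)), 1, 0, 0)
	if errno != 0 {
		return errno
	}
	return nil
}

func main() {
	// No real ring here: fd -1 only demonstrates the error path.
	if err := registerNapi(-1, 50, true); err != nil {
		fmt.Println("NAPI busy-poll not enabled:", err)
	}
}
```

In the runtime this would presumably run once per reactor, right after ring setup, and only when the config flag is set.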
2. Min-wait batched completions (kernel 6.12+)
The loop currently parks with SubmitAndWait(1), i.e. it wakes on the first CQE. The 6.12 min-timeout wait (liburing io_uring_submit_and_wait_min_timeout; IORING_ENTER_EXT_ARG plus the min_wait_usec field of io_uring_getevents_arg) expresses "wake when ≥ N CQEs are ready, or after τ, whichever comes first": a bounded amount of added latency is traded for larger completion batches per tick under moderate load. Under saturation, batches are already large, so it changes nothing; the interesting regime is the middle of the load curve.
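The trade-off can be illustrated with a toy model of the wake rule as described above. This is not the kernel's implementation, only a sketch under two stated assumptions: CQE arrival times are known in advance (in microseconds after the reactor parks, sorted ascending), and a reactor that reaches τ with nothing ready keeps sleeping until the first CQE. The function name wake and the sample numbers are hypothetical.

```go
package main

import "fmt"

// wake models when a parked reactor wakes and how many CQEs it then reaps.
// arrivalsUs: sorted CQE arrival offsets in µs since parking.
// n: batch target N (n <= 1 behaves like SubmitAndWait(1)).
// tauUs: min-wait budget τ in µs.
func wake(arrivalsUs []int64, n int, tauUs int64) (atUs int64, batch int) {
	if len(arrivalsUs) == 0 {
		return 0, 0 // nothing ever completes in this model
	}
	if n < 1 {
		n = 1
	}
	if len(arrivalsUs) >= n && arrivalsUs[n-1] <= tauUs {
		// The N-th completion lands within the budget: wake right then.
		atUs = arrivalsUs[n-1]
	} else {
		// Budget expires first: wake at τ, or at the first CQE if none
		// has arrived by then (modelling assumption).
		atUs = tauUs
		if arrivalsUs[0] > atUs {
			atUs = arrivalsUs[0]
		}
	}
	for _, a := range arrivalsUs {
		if a <= atUs {
			batch++
		}
	}
	return atUs, batch
}

func main() {
	arrivals := []int64{10, 30, 45, 200}
	for _, n := range []int{1, 3, 4} {
		at, batch := wake(arrivals, n, 50)
		fmt.Printf("N=%d tau=50us: wake at %dus with %d CQEs\n", n, at, batch)
	}
}
```

With N=1 the reactor wakes for a single CQE. With a larger N it reaps more completions per tick, and the first CQE waits at most τ longer than it would have otherwise.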
Both are a config flag + a few lines in Ring/the loop, and a wrk sweep (RPS + p50/p99 across connection counts) would show whether either earns default-on. Incremental mode already requires 6.12, so the min-wait floor is not a new constraint there.
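A minimal sketch of the config side, assuming hypothetical field names on ServerConfig and a kernel release string such as the one reported by uname. It only gates min-wait on the 6.12 floor stated above. A real implementation might prefer probing the ring's feature bits (for example IORING_FEAT_MIN_TIMEOUT) over parsing version strings, and could leave NAPI to fail at registration time.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ServerConfig shows only the two experimental knobs; field names are
// hypothetical and both default to off.
type ServerConfig struct {
	NapiBusyPoll bool   // register NAPI busy-polling per reactor
	MinWaitBatch uint32 // N: batch target; 0 or 1 keeps SubmitAndWait(1)
	MinWaitUs    uint32 // τ: min-wait budget in microseconds
}

// leadingInt parses the decimal digits at the start of s ("12-rc1" -> 12).
func leadingInt(s string) (int, error) {
	end := 0
	for end < len(s) && s[end] >= '0' && s[end] <= '9' {
		end++
	}
	return strconv.Atoi(s[:end])
}

// parseKernel extracts major and minor from a release like "6.12.3-generic".
func parseKernel(release string) (major, minor int, err error) {
	parts := strings.SplitN(release, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("unrecognised kernel release %q", release)
	}
	if major, err = leadingInt(parts[0]); err != nil {
		return 0, 0, err
	}
	if minor, err = leadingInt(parts[1]); err != nil {
		return 0, 0, err
	}
	return major, minor, nil
}

// useMinWait reports whether the loop should use the min-timeout wait:
// the knob must be configured and the kernel must be 6.12 or newer.
func (c ServerConfig) useMinWait(release string) bool {
	if c.MinWaitBatch <= 1 || c.MinWaitUs == 0 {
		return false
	}
	major, minor, err := parseKernel(release)
	if err != nil {
		return false // unknown kernel: fall back to SubmitAndWait(1)
	}
	return major > 6 || (major == 6 && minor >= 12)
}

func main() {
	cfg := ServerConfig{MinWaitBatch: 8, MinWaitUs: 50}
	for _, rel := range []string{"6.8.0-45-generic", "6.12.3"} {
		fmt.Printf("%s: min-wait enabled = %v\n", rel, cfg.useMinWait(rel))
	}
}
```

Falling back silently keeps the A/B runs comparable across hosts. Logging the effective setting at startup would make it clear which variant a given benchmark actually exercised.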