Skip to content

[snapshot-regression-fix] lp: remove dio_calls_period linear recovery that defeats dio backoff#9976

Draft
levnach wants to merge 1 commit into
masterfrom
snapshot-fix-iss-8418-dio-period-b4b2899e399f3084
Draft

[snapshot-regression-fix] lp: remove dio_calls_period linear recovery that defeats dio backoff#9976
levnach wants to merge 1 commit into
masterfrom
snapshot-fix-iss-8418-dio-period-b4b2899e399f3084

Conversation

@levnach

@levnach levnach commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes a snapshot-regression divergence reported in Z3Prover/bench discussion #2709 (snapshot-regression: iss-8418/3.smt2 diverged).

  • Benchmark: iss-8418/3.smt2 (a quantifier-free nonlinear mixed Int/Real instance; corpus path inputs/issues/iss-8418/ in Z3Prover/bench)
  • z3 under test: z3-4.17.0-x64-glibc-2.39 (Nightly)
  • Divergence: the recorded oracle expects sat, current z3 HEAD times out at the 20s per-file budget.
--- 3.expected.out (expected)
+++ produced (current z3)
@@ -1 +1 @@
-sat
+timeout

Root cause

Bisected to commit 57fb71900 "lp: gate Gomory-with-dio on genuine dio failures; separate config from runtime state (#9958)" — reverting that commit restores sat.

The proximate cause is a linear "recovery" of dio_calls_period that fights the Diophantine handler's exponential backoff:

  • int_solver::solve_dioph_eq() doubles dio_calls_period on every unproductive (undef) dio call — explicitly "throttle dio scheduling on failure". This geometric backoff is what lets the solver escalate to Gomory cuts / branching once dio is persistently unproductive (and, after dio_gomory_enable_period undefs, start running Gomory alongside dio).
  • int_solver::should_solve_dioph_eq() additionally decreased dio_calls_period by a fixed amount (dio_calls_period_decrease, default 2) on every final_check() where dio did not fire.

Once the period grows, non-firing calls vastly outnumber firing calls, so the linear per-call decrease overwhelms the geometric backoff and pins the period near its initial small value. The result is that a persistently-unproductive dio keeps firing and the intended hand-off to Gomory/branching never durably engages. On iss-8418/3.smt2, where dio is unproductive on this nonlinear instance, the solver spins until the 20s timeout instead of returning sat.

Fix

Remove the linear-recovery block in should_solve_dioph_eq() and the now-unused dio_calls_period_decrease parameter (plus its setting field/accessors/wiring). This restores the intended exponential backoff so dio frequency decays on persistent failure.

The two other heuristics introduced by 57fb71900 are intentionally left unchanged:

  • the dio-before-Gomory ordering in check(), and
  • the dio_gomory_enable_period gating that starts Gomory after persistent dio failure.

Net change: 4 files, 10 deletions, no additions.

Validation (rebuilt z3 + re-ran the benchmark — step 5)

Built the ./z3 checkout at HEAD 6a62a531 (./configure && make -C build) and ran the failing benchmark with the same options the snapshot capture uses (empty z3_opts, -T:20):

  • Before fix (HEAD): timeout at 20.00s (reproduced the divergence).
  • After fix (rebuilt): sat in ~0.03s — matches the recorded 3.expected.out oracle, well within the 20s budget.

Notes / uncertainty (please review)

  • This forgoes the linear dio_calls_period recovery heuristic that 57fb71900 introduced; that commit reported it contributed some additional QF_LIA solves in aggregate. Validation here is on the single regressing benchmark, not the full QF_LIA suite, so a maintainer should confirm there is no material aggregate regression.
  • An alternative, even narrower fix would be to revert only the dio-before-Gomory ordering (also restores sat, in ~0.22s). Removing the throttle-defeating recovery was chosen instead because it preserves the author's intended ordering/gating and addresses the actual mechanism (a heuristic working against the documented backoff). Opened as a draft for human review.

Originating discussion: https://github.com/Z3Prover/bench/discussions/2709

Generated by Fix a Z3 snapshot-regression divergence · 3.1K AIC · ⌖ 89.6 AIC · ⊞ 41.2K ·

The Diophantine-handler scheduler in int_solver applies an exponential
backoff: solve_dioph_eq() doubles dio_calls_period on every unproductive
(undef) dio call to "throttle dio scheduling on failure", letting the
solver escalate to Gomory cuts / branching once dio is persistently
unproductive.

should_solve_dioph_eq() additionally decreased dio_calls_period by a
fixed amount (dio_calls_period_decrease, default 2) on every final_check
where dio did not fire. Because non-firing calls vastly outnumber firing
calls once the period grows, this linear per-call decrease overwhelms the
geometric backoff and pins the period near its initial small value, so an
unproductive dio keeps firing and the handoff to Gomory/branching never
durably engages.

On the nonlinear mixed Int/Real benchmark iss-8418/3.smt2 this causes a
20s timeout where the instance was previously solved as sat. Removing the
linear recovery (and the now-unused dio_calls_period_decrease parameter)
restores the intended backoff; the benchmark is solved sat in ~0.03s. The
dio-before-Gomory ordering and the dio_gomory_enable_period gating are
left unchanged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant