Runaway-source diagnostic for settle()/expect() fixpoint timeouts #25

Merged: mansbernhardt merged 2 commits into main from claude/settle-runaway-diagnostic on Jun 20, 2026

@mansbernhardt

Collaborator

Runaway-source diagnostic for settle()/expect() fixpoint timeouts

When a model never reaches a fixpoint under the executor-drive, settle()/expect() correctly time out with "model never reached a fixpoint." The usual cause is a non-converging reactive cascade: a node.forEach(Observed { … }) or node.onChange whose source emits a non-isSame value on each evaluation, or that sits in a feedback loop. Previously the diagnostic listed every active registration, which hid the offender; in a real consumer it took several debugging rounds to find the culprit (see the swift-model-104-…-settle-hang handoff thread on the parallel-apple side).

Now the drive counts reactive-body deliveries per call site and, on a timeout, prepends a callout naming the registration that fired far more often than a one-shot would and was still firing at the timeout. That registration is the non-isSame/feedback source, and the callout includes fix guidance:

settle() timed out: model never reached a fixpoint (deadlock or runaway).
  ⚠️ likely runaway: EditorModel: "setupContextPropagation() @ …:NNN" fired 984716× and was still firing at the timeout.
     A reactive source (node.forEach / node.onChange) that keeps firing never reaches a
     fixpoint. It almost certainly emits a non-isSame value each evaluation, or sits in a
     feedback loop (a write that re-triggers it). Make the emitted value isSame/Equatable-
     stable, or break the cycle. See Docs/test-determinism-executor-drain.md.
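
To illustrate why a non-isSame source never settles, here is a minimal, self-contained sketch of a toy drive loop. It is not SwiftModel code: `drive`, `Snapshot`, and `Box` are hypothetical names invented for this example, and the real executor-drive is certainly more involved. The sketch only shows that a source that re-emits a value-equal result stops after a couple of deliveries, while one that is compared by identity and produces a fresh object each time keeps firing until the round limit.

```swift
// Hypothetical toy model of a reactive drive; none of these names are SwiftModel API.
struct Snapshot: Equatable {
    var items: [Int]
}

final class Box {
    let items: [Int]
    init(_ items: [Int]) { self.items = items }
}

/// Re-evaluates `step` until two consecutive values are the same according to
/// `isSame`, or until `maxRounds` deliveries have happened.
func drive<Value>(
    initial: Value,
    maxRounds: Int,
    isSame: (Value, Value) -> Bool,
    step: (Value) -> Value
) -> (rounds: Int, converged: Bool) {
    var current = initial
    for round in 1...maxRounds {
        let next = step(current)
        if isSame(current, next) {
            return (round, true)   // fixpoint: nothing changed this round
        }
        current = next
    }
    return (maxRounds, false)      // never settled within the budget
}

// Value-equal output: the second evaluation matches the first, so the loop stops.
let stable = drive(
    initial: Snapshot(items: []),
    maxRounds: 1_000,
    isSame: { $0 == $1 },
    step: { _ in Snapshot(items: [1, 2, 3]) }
)

// Identity comparison on a freshly allocated object: never "the same", so it runs away.
let runaway = drive(
    initial: Box([]),
    maxRounds: 1_000,
    isSame: { $0 === $1 },
    step: { _ in Box([1, 2, 3]) }
)

print(stable, runaway)
```

In this toy, making the emitted value Equatable-stable is what turns the second case into the first, which mirrors the fix guidance in the callout.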

Properties

  • Covers forEach and onChange (onChange routes through forEach and forwards the user's source location).
  • Zero production cost — counting is gated through ModelAccess.reactiveBodyFired (a no-op when ModelAccess.current is nil or a non-test access), so it runs only under .modelTesting.
  • No snapshot churn — non-runaway stalls (genuine deadlock / parked task) produce byte-identical output; the callout is prepended only when a runaway is detected.
  • Diagnostic-only — the 120s watchdog and all settle()/expect() semantics are unchanged.
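
The "zero production cost" property can be pictured with a small sketch of a no-op hook that only a test access overrides. The types below (`Access`, `CountingAccess`, `deliver`) are hypothetical stand-ins, not the library's ModelAccess or TestAccess, and locking is omitted for brevity even though the PR states the real counter uses a dedicated lock.

```swift
// Hypothetical stand-ins; the real ModelAccess/TestAccess are not shown in this PR text.
class Access {
    /// Default hook: does nothing, so a non-test access pays only for the call.
    func reactiveBodyFired(_ site: String) {}
}

final class CountingAccess: Access {
    private(set) var fired: [String: Int] = [:]

    /// Test-only override: count deliveries per call site.
    override func reactiveBodyFired(_ site: String) {
        fired[site, default: 0] += 1
    }
}

/// Stand-in for a reactive-body delivery. `access` models "the current access";
/// passing nil models production, where nothing is installed.
func deliver(site: String, access: Access?) {
    // Optional chaining makes this a no-op when no access is present.
    access?.reactiveBodyFired(site)
}

let counting = CountingAccess()
deliver(site: "Editor.swift:10", access: nil)        // production-like: nothing recorded
deliver(site: "Editor.swift:10", access: Access())   // non-test access: default no-op
deliver(site: "Editor.swift:10", access: counting)   // test access: counted
deliver(site: "Editor.swift:10", access: counting)
print(counting.fired)
```

The design choice illustrated here is that the hot path always makes the same call, and whether anything is recorded depends only on which access is installed.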

Implementation

  • ModelAccess.reactiveBodyFired(_:): no-op by default; TestAccess overrides it to record (count, lastFireNs) per FileAndLine under a dedicated lock.
  • _forEachImpl increments on each delivery in both loop paths (cancelPrevious true/false).
  • settleDiagnostics() prepends the output of _runawayDiagnosticLine() when a call site has fired at least a threshold number of times and was still firing within the last second before the timeout.
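
The counting and detection steps listed above could look roughly like the following sketch. Everything in it is an assumption made for illustration: `CallSite`, `FireCounter`, `runawayLine`, the threshold, and the one-second window are hypothetical, and choosing the busiest qualifying site is a guess rather than the PR's confirmed behavior.

```swift
import Foundation

// Hypothetical sketch; names, threshold, and window are assumptions, not the PR's code.
struct CallSite: Hashable {
    let file: String
    let line: Int
}

final class FireCounter {
    private struct Entry {
        var count: Int
        var lastFireNs: UInt64
    }

    private let lock = NSLock()
    private var entries: [CallSite: Entry] = [:]

    /// Records one delivery for `site` at time `nowNs` (nanoseconds).
    func record(_ site: CallSite, nowNs: UInt64) {
        lock.lock(); defer { lock.unlock() }
        var entry = entries[site] ?? Entry(count: 0, lastFireNs: nowNs)
        entry.count += 1
        entry.lastFireNs = nowNs
        entries[site] = entry
    }

    /// Returns a callout for the busiest site that fired at least `threshold`
    /// times and last fired within `windowNs` of `nowNs`; nil if none qualifies.
    func runawayLine(nowNs: UInt64, threshold: Int, windowNs: UInt64) -> String? {
        lock.lock(); defer { lock.unlock() }
        let candidate = entries
            .filter { $0.value.count >= threshold && $0.value.lastFireNs + windowNs >= nowNs }
            .max { $0.value.count < $1.value.count }
        guard let hit = candidate else { return nil }
        return "likely runaway: \(hit.key.file):\(hit.key.line) fired \(hit.value.count)× and was still firing at the timeout."
    }
}

let counter = FireCounter()
let hot = CallSite(file: "Editor.swift", line: 42)
for tick in 0..<500 {
    counter.record(hot, nowNs: UInt64(tick))
}
counter.record(CallSite(file: "Once.swift", line: 7), nowNs: 0)   // a one-shot

let second: UInt64 = 1_000_000_000
let atTimeout = counter.runawayLine(nowNs: 499, threshold: 100, windowNs: second)
let longAfter = counter.runawayLine(nowNs: 499 + 2 * second, threshold: 100, windowNs: second)
print(atTimeout ?? "no runaway", longAfter ?? "no runaway")
```

Requiring both a high count and a recent last fire is what keeps a source that fired many times but then went quiet from being reported as a runaway, so a genuine deadlock or parked task produces no callout.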

Validation

  • A non-isSame→non-Equatable-env cascade now reports ⚠️ likely runaway: … fired 984716× and was still firing at the timeout (manually verified — a true runaway hangs to the 120s watchdog by design, so this isn't a CI test).
  • Full suite green locally (scripts/test), no regression from the hot-path hook.

Targets 1.0.5. CHANGELOG + design-note Update 26 included.

🤖 Generated with Claude Code

mansbernhardt and others added 2 commits June 19, 2026 14:58
When a model never reaches a fixpoint (a non-converging reactive cascade — a
node.forEach(Observed{}) / node.onChange whose source emits a non-isSame value
each evaluation, or sits in a feedback loop), settle()/expect() correctly time
out, but the diagnostic listed EVERY active registration, hiding the offender
(see the parallel-apple swift-model-104-…-settle-hang handoff thread — it took
several debugging rounds to find the culprit).

Now the drive counts per-call-site reactive-body deliveries and, on a timeout,
prepends a callout naming the registration that fired far more than a one-shot
AND was still firing at the timeout — the non-isSame/feedback source — with fix
guidance. Covers both forEach and onChange (onChange routes through forEach and
forwards the user's source location).

- ModelAccess.reactiveBodyFired(_:) — no-op default; gated so counting is zero
  cost outside .modelTesting (production ModelAccess.current is nil/non-test).
- TestAccess records per-FileAndLine (count, lastFireNs) under a dedicated lock.
- _forEachImpl increments on each delivery in both loop paths.
- settleDiagnostics() prepends the runaway callout; non-runaway stalls (genuine
  deadlock / parked task) keep identical output (no snapshot churn).

Diagnostic-only — the 120s watchdog and all settle semantics are unchanged.
Verified: a non-isSame→non-Equatable-env cascade now reports
"⚠️ likely runaway: … fired 984716× and was still firing at the timeout."
Full suite green (no regression from the hot-path hook). CHANGELOG + design-note
Update 26. Targets 1.0.5.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mansbernhardt mansbernhardt merged commit d6f711b into main Jun 20, 2026
6 checks passed