fix garyx schedule_followup boundary fallback for deleted threads and dispatch retry by Binlogo · Pull Request #13 · Pyiner/garyx

Binlogo · 2026-06-01T08:25:51Z

Summary

schedule_followup schedules a cron InternalDispatch job that, on fire,
injects a synthetic user turn into the originating thread. The happy path
worked, but boundary cases failed silently, and a failed one-shot followup
re-fired on every tick forever: advance() only runs on Success, so a failed
Once job kept enabled=true with next_run in the past, which is_due()
treats as due on every tick.
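The re-fire loop described above can be reproduced in a small model. This is only a sketch: the struct layout, the timestamp type, and the helper names `post_run_before_fix` and `count_fires` are assumptions made for illustration, not the actual garyx code.

```rust
// Hypothetical, simplified model of a one-shot (Once) cron job. Names follow
// the PR text where possible; the real garyx types are assumed, not quoted.
#[derive(Debug, Clone, Copy, PartialEq)]
enum JobRunStatus {
    Success,
    Failed,
}

struct CronJob {
    enabled: bool,
    next_run: u64, // timestamp in seconds
}

impl CronJob {
    // Due whenever the job is enabled and its next_run is not in the future.
    fn is_due(&self, now: u64) -> bool {
        self.enabled && self.next_run <= now
    }

    // For a Once schedule there is no further run, so advancing disables it.
    fn advance(&mut self) {
        self.enabled = false;
    }

    // Pre-fix post-run handling as described: only Success advances the job.
    fn post_run_before_fix(&mut self, status: JobRunStatus) {
        if status == JobRunStatus::Success {
            self.advance();
        }
    }
}

// Runs `ticks` scheduler ticks and counts how often the job fires when every
// run ends with the given status.
fn count_fires(job: &mut CronJob, status: JobRunStatus, ticks: u64) -> u32 {
    let mut fires = 0;
    for now in 100..100 + ticks {
        if job.is_due(now) {
            fires += 1;
            job.post_run_before_fix(status);
        }
    }
    fires
}

fn main() {
    let mut ok = CronJob { enabled: true, next_run: 100 };
    let mut failing = CronJob { enabled: true, next_run: 100 };
    // A successful one-shot fires once; a failing one fires on every tick.
    println!("success fires: {}", count_fires(&mut ok, JobRunStatus::Success, 5));
    println!("failed fires: {}", count_fires(&mut failing, JobRunStatus::Failed, 5));
}
```

In this model the failing job never leaves the due state, which is the behaviour the PR sets out to remove.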

This adds explicit boundary fallback on the InternalDispatch trigger path.

Scope

- JobRunStatus::FailedDropped (serde → "failed_dropped"): a terminal drop,
  distinct from Failed.
- Drop classification (FollowupAttemptError): thread deleted / missing
  thread_id / missing app_state are non-retryable drops; other dispatch
  errors are transient.
- Bounded retry with exponential backoff (FOLLOWUP_MAX_RETRIES=3, base
  200ms → 200/400/800ms) for transient failures; exhausting the budget drops
  with the concrete error recorded in RunRecord.error.
- CronJob::settle_after_run() unifies the run_now and tick post-run
  blocks (single source of truth) and makes FailedDropped terminal: one-shot
  jobs are disabled so a dropped followup never re-fires, and delete_after_run
  is honored like Success.
- Every drop path emits tracing::warn; the existing cron_job_completed
  broadcast already carries status + reason for telemetry.
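The classification and bounded retry listed above might look roughly like the following. It is a sketch under stated assumptions: the variant names `NonRetryable` and `Transient`, the `Outcome` struct, the `FOLLOWUP_BACKOFF_BASE_MS` constant name, and the injectable `sleep` parameter are all hypothetical, and the sketch assumes the initial attempt does not count against the retry budget (one attempt plus up to three retries). The real orchestrator is presumably async; this version is synchronous to stay self-contained.

```rust
use std::time::Duration;

// Values quoted in the PR description; the base-delay constant name is hypothetical.
const FOLLOWUP_MAX_RETRIES: u32 = 3;
const FOLLOWUP_BACKOFF_BASE_MS: u64 = 200;

// Hypothetical shape of the classifier output; the real enum may differ.
#[derive(Debug, Clone, PartialEq)]
enum FollowupAttemptError {
    NonRetryable(String), // thread deleted, missing thread_id, missing app_state
    Transient(String),    // any other dispatch error
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum JobRunStatus {
    Success,
    FailedDropped,
}

#[derive(Debug, PartialEq)]
struct Outcome {
    status: JobRunStatus,
    error: Option<String>,
    attempts: u32,
}

// Delay before retry number `retry_index` (0-based): 200ms, 400ms, 800ms.
fn backoff_delay(retry_index: u32) -> Duration {
    Duration::from_millis(FOLLOWUP_BACKOFF_BASE_MS << retry_index)
}

// Runs `attempt` until it succeeds, hits a non-retryable error, or exhausts
// the retry budget. `sleep` is injected so tests can record delays instead
// of actually waiting.
fn run_with_retry<A, S>(mut attempt: A, mut sleep: S) -> Outcome
where
    A: FnMut() -> Result<(), FollowupAttemptError>,
    S: FnMut(Duration),
{
    let mut attempts = 0u32;
    loop {
        attempts += 1;
        let reason = match attempt() {
            Ok(()) => {
                return Outcome { status: JobRunStatus::Success, error: None, attempts };
            }
            Err(FollowupAttemptError::NonRetryable(reason)) => {
                return Outcome { status: JobRunStatus::FailedDropped, error: Some(reason), attempts };
            }
            Err(FollowupAttemptError::Transient(reason)) => reason,
        };
        let retries_used = attempts - 1;
        if retries_used >= FOLLOWUP_MAX_RETRIES {
            // Budget exhausted: drop, keeping the last concrete error.
            return Outcome { status: JobRunStatus::FailedDropped, error: Some(reason), attempts };
        }
        sleep(backoff_delay(retries_used));
    }
}

fn main() {
    let mut calls = 0;
    // Two transient failures followed by a success.
    let outcome = run_with_retry(
        || {
            calls += 1;
            if calls < 3 {
                Err(FollowupAttemptError::Transient("dispatch busy".to_string()))
            } else {
                Ok(())
            }
        },
        std::thread::sleep,
    );
    println!("{:?}", outcome);
}
```

Passing the sleeper as a parameter is a design choice of this sketch only; it lets the three cases named in the test plan (drop without retry, retry then success, retry exhausted) be checked without real delays.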

dispatch_internal_message_to_thread is unchanged (it already returns
Result and already errors on thread-not-found), so the restart_wake /
task_notifications / tasks callers are untouched.
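Since the dispatch function itself is untouched, the terminal behaviour lives in the post-run step. A minimal sketch of what settle_after_run() is described as doing follows; the `one_shot` flag, the `Settled` return type, and the handling of plain Failed are assumptions for illustration, and rescheduling of recurring jobs on Success is deliberately left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum JobRunStatus {
    Success,
    Failed,
    FailedDropped,
}

// Hypothetical return value telling the caller whether to remove the job.
#[derive(Debug, PartialEq)]
enum Settled {
    Keep,
    Delete,
}

// Simplified job model; the real CronJob carries a schedule and next_run.
struct CronJob {
    enabled: bool,
    one_shot: bool,
    delete_after_run: bool,
}

impl CronJob {
    // Single post-run step shared by the run_now and tick paths.
    fn settle_after_run(&mut self, status: JobRunStatus) -> Settled {
        let terminal = matches!(
            status,
            JobRunStatus::Success | JobRunStatus::FailedDropped
        );
        if !terminal {
            // Plain Failed is left untouched in this sketch (an assumption).
            return Settled::Keep;
        }
        if self.one_shot {
            // A finished or dropped one-shot job must never fire again.
            self.enabled = false;
        }
        if self.delete_after_run {
            Settled::Delete
        } else {
            Settled::Keep
        }
    }
}

fn main() {
    let mut job = CronJob { enabled: true, one_shot: true, delete_after_run: true };
    let settled = job.settle_after_run(JobRunStatus::FailedDropped);
    println!("enabled={} settled={:?}", job.enabled, settled);
}
```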

Notes / limitations

- A "thread stopped / user cancelled" state has no dedicated signal at dispatch
  time, so a stopped-but-still-present thread still receives the injected
  turn, which starts a fresh turn. The FollowupAttemptError classifier can be
  extended if such a signal is added later.
- The retry backoff runs inline in the serial cron tick, so a retrying followup
  can delay other due jobs by up to roughly 1.4s (200 + 400 + 800ms of
  backoff). This is bounded and consistent with the pre-existing serial
  dispatch model.
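The 1.4s figure is the sum of the three backoff delays. As a quick check, assuming the delay doubles from the 200ms base on each retry (the function name below is hypothetical):

```rust
// Total time spent sleeping if every retry is used: base * (1 + 2 + 4 + ...).
fn total_backoff_ms(base_ms: u64, max_retries: u32) -> u64 {
    (0..max_retries).map(|retry| base_ms << retry).sum()
}

fn main() {
    // 200 + 400 + 800
    println!("{} ms", total_backoff_ms(200, 3));
}
```

Time spent inside the dispatch attempts themselves comes on top of this sum.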

Test plan

- 3 unit tests on the retry orchestrator: drop-without-retry, retry-then-success,
  retry-exhausted (carries concrete error + correct attempt count).
- 1 integration test: a deleted thread yields status=failed_dropped with a
  "thread not found" reason (asserts the serialized wire form too).
- cargo test -p garyx-gateway --lib → 510 passed, 0 failed (existing
  schedule_followup happy-path regression intact).


Binlogo force-pushed the feat/followup-fallback_0b0f50 branch from 7a842fc to 41064cf Compare June 1, 2026 08:34

Binlogo added 2 commits June 1, 2026 16:44

docs garyx document schedule_followup boundary handling and retry fal…

a28a264

…lback

Binlogo force-pushed the feat/followup-fallback_0b0f50 branch from c892e63 to a28a264 Compare June 1, 2026 08:47

Binlogo merged commit 0e2a0c5 into Pyiner:main Jun 1, 2026
1 check passed

Binlogo deleted the feat/followup-fallback_0b0f50 branch June 1, 2026 08:57
