Release mutex per attachment, persist incrementally, allow stopSync to interrupt mid-batch#415
Merged
Merged
Conversation
…, allow stopSync to interrupt mid-batch
simolus3
reviewed
May 11, 2026
Contributor
simolus3
left a comment
There was a problem hiding this comment.
I'm happy with these changes, thanks 👍 I was concerned about this potentially causing race conditions if we implicitly relied on the mutex being used everywhere, but since SyncingService is essentially an actor thanks to the asyncMap setup, this looks good to me.
My comments are mostly nits or related to the tests: IMO, we should avoid relying on magic Future.delayed calls as much as possible since that makes tests less reliable (I see that one test became flaky in the initial CI run for this PR for instance).
…improve test assertions
Contributor
Author
|
Thank you for the review @simolus3 I have updated the PR. Please let me know what you think |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Reported on Discord (thread).
startSync'sasyncMapwraps the entire per-attachment loop inattachmentsService.withContext(...), so the attachment-service mutex is held for the whole batch.handleSynciterates every active attachment serially withawait, doing network I/O for each one inside that held mutex.State changes are accumulated in a local list and committed once via a single
context.saveAttachments(updatedAttachments)call at the end of the iteration.Cancellation (
stopSync) only re-evaluates thetakeWhilepredicate between batches so it cannot interrupt a runningasyncMapstep.With ~20k queued downloads at ~200 ms each (the case in the report), it produces three observable failures:
stopSync()can't complete until the in-flightasyncMapstep finishes, i.e. until the full batch (~hours) drains.saveFile(),deleteFile(),clearQueue(),expireCache(), and_processWatchedAttachments()all callwithContext(...)and are blocked behind the same mutex for the duration of the batch.attachments_queue(e.g. a diagnostics page runningSELECT state, COUNT(*) ... GROUP BY state) sees zero progress until the batch finishes and then a single atomic jump.Expected behavior
stopSync()returns within one attachment's processing time, not one batch's.saveFile(),deleteFile(), and other queue mutations are not blocked behind an in-flight sync batch.attachments_queuereflects real-time progress, so consumer queries anddb.watch(...)streams see incremental state changes instead of an end-of-batch commit.AI disclosure
This PR was created with the help of Opus 4.7 and Claude Code after reproducing the issue locally, and I’ve reviewed the resulting code changes.