
Release mutex per attachment, persist incrementally, allow stopSync to interrupt mid-batch #415

Merged: khawarizmus merged 3 commits into main from attachments-interrupts on May 13, 2026

@khawarizmus
Contributor

Problem

Reported on Discord (thread).

  1. startSync's asyncMap wraps the entire per-attachment loop in attachmentsService.withContext(...), so the attachment-service mutex is held for the whole batch.

  2. handleSync iterates every active attachment serially with await, doing network I/O for each one inside that held mutex.

  3. State changes are accumulated in a local list and committed once via a single context.saveAttachments(updatedAttachments) call at the end of the iteration.

  4. Cancellation (stopSync) only re-evaluates the takeWhile predicate between batches, so it cannot interrupt a running asyncMap step.

With ~20k queued downloads at ~200 ms each (the case in the report), a single batch takes roughly 4,000 s (over an hour), which produces three observable failures:

  • stopSync() can't complete until the in-flight asyncMap step finishes, i.e. until the full batch (an hour or more) drains.
  • saveFile(), deleteFile(), clearQueue(), expireCache(), and _processWatchedAttachments() all call withContext(...) and are blocked behind the same mutex for the duration of the batch.
  • Any consumer query against attachments_queue (e.g. a diagnostics page running SELECT state, COUNT(*) ... GROUP BY state) sees zero progress until the batch finishes and then a single atomic jump.
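
The blocking described above can be reproduced in miniature. The following is a minimal sketch, not the package's actual code: `SimpleMutex`, `runBatchWide`, and `runPerAttachment` are hypothetical names, the zero-length delay stands in for network I/O, and the string log stands in for queue mutations such as saveFile(). It only illustrates how the scope of the lock decides where a competing caller gets to run.

```dart
import 'dart:async';

/// Hypothetical stand-in for the attachment-service mutex:
/// bodies run one at a time, in the order withLock was called.
class SimpleMutex {
  Future<void> _tail = Future<void>.value();

  Future<T> withLock<T>(Future<T> Function() body) {
    final previous = _tail;
    final done = Completer<void>();
    _tail = done.future;
    // Run the body once the previous holder is finished, then release.
    return previous.then((_) => body()).whenComplete(() {
      done.complete();
    });
  }
}

/// Old shape: one lock around the whole batch.
/// A competing caller (here "saveFile") waits for every attachment.
Future<List<String>> runBatchWide(SimpleMutex mutex, List<String> ids) async {
  final log = <String>[];
  final syncing = mutex.withLock(() async {
    for (final id in ids) {
      await Future<void>.delayed(Duration.zero); // stands in for network I/O
      log.add('synced $id');
    }
  });
  final save = mutex.withLock(() async {
    log.add('saveFile');
  });
  await Future.wait([syncing, save]);
  return log;
}

/// New shape (assumed): the lock is taken and released once per attachment,
/// so a competing caller waits for at most one attachment.
Future<List<String>> runPerAttachment(
    SimpleMutex mutex, List<String> ids) async {
  final log = <String>[];
  Future<void> syncAll() async {
    for (final id in ids) {
      await mutex.withLock(() async {
        await Future<void>.delayed(Duration.zero); // stands in for network I/O
        log.add('synced $id');
      });
    }
  }

  final syncing = syncAll();
  final save = mutex.withLock(() async {
    log.add('saveFile');
  });
  await Future.wait([syncing, save]);
  return log;
}
```

With three ids, the batch-wide variant logs saveFile only after all three attachments, while the per-attachment variant lets it run right after the first one.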

Expected behavior

  • stopSync() returns within one attachment's processing time, not one batch's.
  • saveFile(), deleteFile(), and other queue mutations are not blocked behind an in-flight sync batch.
  • attachments_queue reflects real-time progress, so consumer queries and db.watch(...) streams see incremental state changes instead of an end-of-batch commit.
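
A rough sketch of how these expectations fit together, under stated assumptions: `FakeQueue` and `InterruptibleSync` are hypothetical stand-ins (not classes from the package), and the 'QUEUED' and 'SYNCED' strings are placeholder states. The loop checks a stop flag before each attachment and persists each result immediately instead of collecting them for one final save.

```dart
import 'dart:async';

/// Hypothetical in-memory stand-in for the attachments_queue table.
class FakeQueue {
  final Map<String, String> state;

  FakeQueue(Iterable<String> ids)
      : state = {for (final id in ids) id: 'QUEUED'};

  Future<void> save(String id, String newState) async {
    state[id] = newState;
  }
}

/// Hypothetical sync loop that can be interrupted between attachments.
class InterruptibleSync {
  final FakeQueue queue;
  bool _stopRequested = false;

  InterruptibleSync(this.queue);

  void stopSync() {
    _stopRequested = true;
  }

  /// Processes attachments one by one and returns how many were handled.
  Future<int> syncBatch(Future<void> Function(String id) download) async {
    var processed = 0;
    for (final id in queue.state.keys.toList()) {
      // Checked before every attachment, so a stop request waits for at
      // most the one attachment that is currently in flight.
      if (_stopRequested) break;
      await download(id);
      // Persist right away so observers see progress during the batch.
      await queue.save(id, 'SYNCED');
      processed++;
    }
    return processed;
  }
}
```

If stopSync() is requested while the second of three attachments is downloading, that attachment still finishes and is saved, and the third is left untouched in its queued state.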

AI disclosure

This PR was created with the help of Opus 4.7 and Claude Code after reproducing the issue locally, and I’ve reviewed the resulting code changes.

Contributor

@simolus3 simolus3 left a comment


I'm happy with these changes, thanks 👍 I was concerned about this potentially causing race conditions if we implicitly relied on the mutex being used everywhere, but since SyncingService is essentially an actor thanks to the asyncMap setup, this looks good to me.

My comments are mostly nits or related to the tests: IMO, we should avoid relying on magic Future.delayed calls as much as possible since that makes tests less reliable (I see that one test became flaky in the initial CI run for this PR for instance).
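
One common way to avoid timing-based waits, sketched here with hypothetical names (`GatedDownload` and `exampleTest` are not from this PR's test suite), is to have the fake download signal when it has started and block until the test explicitly releases it. The test then waits on those signals rather than on a fixed delay.

```dart
import 'dart:async';

/// Hypothetical fake download that reports when it has started and
/// blocks until the test releases it. Intended for a single call.
class GatedDownload {
  final started = Completer<void>();
  final release = Completer<void>();

  Future<String> call(String id) async {
    started.complete();
    await release.future;
    return 'bytes for $id';
  }
}

/// Shape of a test body that observes the "mid-download" state
/// deterministically, with no Future.delayed involved.
Future<List<String>> exampleTest() async {
  final events = <String>[];
  final gate = GatedDownload();

  final download = gate('a').then((data) {
    events.add('done');
  });

  // Wait until the download has really started, however long that takes.
  await gate.started.future;
  events.add('mid-download');

  // Let the download finish and wait for it to complete.
  gate.release.complete();
  await download;
  return events;
}
```

Because every wait is tied to an explicit signal, the order of the recorded events does not depend on scheduler timing or machine load.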

Comment threads (all marked Outdated):
  • packages/powersync/lib/src/attachments/sync/syncing_service.dart (2 threads)
  • packages/powersync/test/attachments/attachment_test.dart (4 threads)
@khawarizmus
Contributor Author

Thank you for the review @simolus3

I have updated the PR. Please let me know what you think

Contributor

@simolus3 simolus3 left a comment


LGTM!

@khawarizmus khawarizmus merged commit b8bdd1a into main May 13, 2026
10 checks passed