feat(mongodb-storage)!: chunked multi-op bucket documents with range-merging compaction and invariant tests #617

Open
Sleepful wants to merge 63 commits into main from compressed-bucket-storage
Collaborator

@Sleepful Sleepful commented Apr 29, 2026

Summary

Replaces MongoDB bucket storage's single-operation-per-document model with chunked multi-operation documents. Operations are now grouped into BSON documents by a ~1MB data-size threshold, reducing document count and index overhead for workloads with many small rows. The change includes range-merging compaction (rebuild from survivors instead of in-place mutation), document-level checksum aggregation, and a comprehensive edge-case test suite verifying data integrity invariants.

This is a breaking change for existing MongoDB storage deployments — databases using the previous single-op document format are not compatible. No migration path is provided.

What Changed

1. Collapse Dual-Version Abstraction

During development, two document formats coexisted behind an abstraction layer. This PR removes the abstraction and all code for the discarded format, leaving a single direct implementation.

Deleted:

  • v5/ directory and all adapter files (was the alternate/new format during development)
  • document-formats/v3-format.ts — single-op format code
  • document-formats/format-interface.ts — dual-format abstraction interface
  • common/MongoSyncBucketStorageCallbacks.ts — callback indirection layer
  • v3/models.ts and v5/models.ts re-export layers
  • VersionedPowerSyncMongo wrappers — storage now uses PowerSyncMongo directly

Renamed:

  • document-formats/v5-format.ts → document-formats/bucket-document-format.ts
  • BucketDataDocumentV5 → BucketDataDocument
  • BucketOperationV5 → BucketOperation

Architecture before:

AbstractMongoSyncBucketStorage
  └── MongoSyncBucketStorage (concrete, delegates via callbacks)
        └── MongoSyncBucketStorageV3 / V5 (thin adapters)

Architecture after:

AbstractMongoSyncBucketStorage
  └── MongoSyncBucketStorageV3 (concrete, direct implementation)

2. Chunked Multi-Op Document Format

The previous model stored exactly one operation per MongoDB document. For workloads with many small rows, this created excessive document and index overhead.

New document shape: BucketDataDocument stores an ops[] array plus aggregated metadata:

  • _id.o = maximum op_id in the document (used for range queries)
  • min_op = minimum op_id
  • count = number of operations
  • checksum = sum of operation checksums
  • size = total byte size of operation data
  • target_op = maximum target_op across operations
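
A minimal sketch of how the aggregated fields listed above could be derived from an ops[] array. The `Op` and `BucketDoc` shapes, `buildDocument()`, and the 32-bit wrapping `addChecksums()` are simplified assumptions for illustration, not the PR's actual definitions; op ids are plain numbers here to keep the example short, and ops are assumed to arrive sorted by op_id.

```typescript
// Hypothetical, simplified shapes; the real types live in bucket-document-format.ts.
interface Op {
  op_id: number;
  checksum: number;
  data: string | null;
  target_op: number | null;
}

interface BucketDoc {
  _id: { b: string; o: number };
  min_op: number;
  count: number;
  checksum: number;
  size: number;
  target_op: number | null;
  ops: Op[];
}

// Assumption: checksums combine by 32-bit wrapping addition.
function addChecksums(a: number, b: number): number {
  return (a + b) & 0xffffffff;
}

// Builds one document from ops that are already sorted by op_id ascending.
function buildDocument(bucket: string, ops: Op[]): BucketDoc {
  if (ops.length === 0) throw new Error('a document needs at least one op');
  let checksum = 0;
  let size = 0;
  let target: number | null = null;
  for (const op of ops) {
    checksum = addChecksums(checksum, op.checksum);
    size += op.data?.length ?? 0;
    // Track the maximum of the non-null target_op values.
    if (op.target_op != null && (target == null || op.target_op > target)) {
      target = op.target_op;
    }
  }
  return {
    _id: { b: bucket, o: ops[ops.length - 1].op_id },
    min_op: ops[0].op_id,
    count: ops.length,
    checksum,
    size,
    target_op: target,
    ops
  };
}
```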

Chunking: The write path groups pending operations by bucket, then chunks them into documents by a 1MB data-size threshold. Each chunk becomes one BucketDataDocument. Single-operation chunks remain valid.
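
The chunking rule can be sketched as follows. This is an illustration under assumptions, not the body of chunkBucketData(): the name `chunkOps()`, the `SizedOp` shape, and the choice to close a chunk before it would exceed the threshold are all hypothetical.

```typescript
interface SizedOp {
  op_id: number;
  data: string | null;
}

// The ~1MB data-size threshold mentioned in the description.
const CHUNK_THRESHOLD_BYTES = 1024 * 1024;

// Groups ops (already in op_id order) into chunks by accumulated data size.
function chunkOps<T extends SizedOp>(ops: T[], threshold: number = CHUNK_THRESHOLD_BYTES): T[][] {
  const chunks: T[][] = [];
  let current: T[] = [];
  let currentSize = 0;
  for (const op of ops) {
    const size = op.data?.length ?? 0;
    // Close the current chunk if adding this op would push it past the threshold.
    if (current.length > 0 && currentSize + size > threshold) {
      chunks.push(current);
      current = [];
      currentSize = 0;
    }
    // An op larger than the threshold still lands here and ends up alone in its chunk.
    current.push(op);
    currentSize += size;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

With a threshold of 10 and data sizes 4, 4, 4, 20, 1, this produces four chunks: the first two ops together, then each remaining op alone, with the oversized op isolated.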

Read path: getBucketDataBatch() queries by _id.o range, then post-filters individual operations within partially overlapping documents. Operations outside (start, checkpoint] are skipped.
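
The post-filtering step can be illustrated in memory. The sketch below assumes simplified `StoredDoc` and `StoredOp` shapes and a hypothetical `opsInRange()` helper; it is not the PR's extractRowsFromDocument().

```typescript
interface StoredOp {
  op_id: number;
}

interface StoredDoc {
  min_op: number;
  max_op: number; // plays the role of _id.o
  ops: StoredOp[];
}

// Returns the ops that fall inside (start, checkpoint], in document order.
function opsInRange(docs: StoredDoc[], start: number, checkpoint: number): StoredOp[] {
  const result: StoredOp[] = [];
  for (const doc of docs) {
    // Skip documents whose range cannot overlap (start, checkpoint].
    if (doc.max_op <= start || doc.min_op > checkpoint) continue;
    for (const op of doc.ops) {
      // Post-filter individual ops inside partially overlapping documents.
      if (op.op_id > start && op.op_id <= checkpoint) result.push(op);
    }
  }
  return result;
}
```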

Compaction: Instead of modifying documents in-place (previously PUT→MOVE, collapse to CLEAR), the compactor now takes a "rebuild from survivors" approach:

  1. Read all documents in a bucket
  2. Load and expand all operations
  3. Filter superseded PUT/REMOVE operations (newest-to-oldest deduplication by table/row_id/source)
  4. Preserve MOVE and CLEAR operations unconditionally
  5. Re-chunk surviving operations by the same 1MB threshold
  6. Replace old documents with new chunked documents in a transaction
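
Steps 3 and 4 can be sketched as a single newest-to-oldest pass. The `RowOp` shape and `survivors()` name are assumptions for illustration. Note that the review discussion later in this thread changes the handling of superseded ops from dropping them to converting them into MOVE tombstones; this sketch follows the steps as listed here.

```typescript
type OpType = 'PUT' | 'REMOVE' | 'MOVE' | 'CLEAR';

interface RowOp {
  op_id: number;
  op: OpType;
  table?: string;
  row_id?: string;
  source?: string;
}

// Keeps the newest PUT/REMOVE per (table, row_id, source) and every MOVE/CLEAR.
function survivors(ops: RowOp[]): RowOp[] {
  const seen = new Set<string>();
  const kept: RowOp[] = [];
  // Walk newest to oldest so the first op seen for a row is the one to keep.
  for (let i = ops.length - 1; i >= 0; i--) {
    const op = ops[i];
    if (op.op === 'MOVE' || op.op === 'CLEAR') {
      kept.push(op);
      continue;
    }
    const key = JSON.stringify([op.table, op.row_id, op.source]);
    if (seen.has(key)) continue; // superseded by a newer PUT/REMOVE for the same row
    seen.add(key);
    kept.push(op);
  }
  return kept.reverse(); // restore ascending op_id order
}
```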

Checksums: computePartialChecksumsForCollection() uses the pre-computed document-level checksum aggregate for fully-included documents. Only partially-included documents fall back to iterating individual operations.

Glossary

Fully included document: min_op > start. Example: document covers [40, 60], client asks for (30, 55]. Since min_op=40 > 30, no op in this document falls at or below start, so nothing has to be filtered at the lower bound. The pipeline uses the pre-computed checksum field on the document and does not need to iterate individual ops.

Partially included document: min_op <= start. Example: document covers [40, 60], client asks for (45, 55]. Since min_op=40 <= 45, the ops at the beginning of the document (40 through 45) are at or below start and fall outside the range. The pipeline can't use the pre-computed checksum; it must filter individual ops in the ops[] array and sum only those with o > start.
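
The two cases can be sketched as one loop. This is a hedged illustration, not the aggregation pipeline itself: `ChecksumDoc`, `partialChecksum()`, and the 32-bit wrapping `addChecksums()` are assumptions, and the sketch handles only the lower bound, assuming the documents passed in were already selected by the upper bound of the range.

```typescript
interface ChecksumOp {
  op_id: number;
  checksum: number;
}

interface ChecksumDoc {
  min_op: number;
  checksum: number; // pre-computed sum over ops[]
  ops: ChecksumOp[];
}

// Assumption: checksums combine by 32-bit wrapping addition.
function addChecksums(a: number, b: number): number {
  return (a + b) & 0xffffffff;
}

function partialChecksum(docs: ChecksumDoc[], start: number): number {
  let total = 0;
  for (const doc of docs) {
    if (doc.min_op > start) {
      // Fully included: use the document-level aggregate.
      total = addChecksums(total, doc.checksum);
    } else {
      // Partially included: sum only the ops above start.
      for (const op of doc.ops) {
        if (op.op_id > start) total = addChecksums(total, op.checksum);
      }
    }
  }
  return total;
}
```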

3. Edge Case Hardening & Invariant Tests

Comprehensive test suite verifying data integrity invariants under boundary conditions:

Read Filtering Boundaries (storage_sync.test.ts) — 13 test cases covering all combinations of start and checkpoint positions relative to document boundaries:

  • Full range, exact boundaries, mid-document filters, gap-only ranges, zero-width ranges, beyond-all-docs ranges

Compaction Boundaries (storage_compacting.test.ts) — 8 test cases:

  • Superseded ops removed from middle/first/last documents
  • All ops superseded → empty bucket
  • Single surviving op per document
  • Multiple small survivors merged by rechunking
  • Same row_id spanning document boundaries

Glossary

Rechunking is the process of grouping the surviving ops into new documents using chunkBucketData() — the same function used during normal writes. It groups ops by data size (1MB threshold), creating as many new documents as needed.

Invariant Verification Tests (storage_compacting.test.ts) — 19 unit + integration tests:

  1. ops[] ordering preserved after serialization and compaction
  2. Range metadata consistency (_id.o = max(op_id), min_op = min(op_id), count = ops.length, checksum = sum(op.checksum), size = sum(data.length))
  3. target_op correctness (max of non-null target_op values)
  4. No overlapping ranges between documents
  5. Post-query filtering correctness (covered by read filtering matrix)
  6. Compaction survivor integrity (PUT/REMOVE deduplication, MOVE/CLEAR preservation)
  7. Empty document cleanup (documents with no surviving ops deleted)
  8. BSON limit safety (large ops split, oversized single op gets own chunk)
  9. Serialization fidelity (null data, empty strings, unicode preserved)
  10. Document _id.o invariant (equals max op in document)
  11. Checksum consistency (aggregation pipeline matches JavaScript addChecksums)
  12. Compaction with maxOpId filtering (ops above limit excluded)
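
Several of these invariants (ordering, range metadata, non-overlap) lend themselves to a small checker. The sketch below is an assumption about how such checks could look, with hypothetical names (`documentViolations()`, `rangesOverlap()`) and simplified shapes; it is not taken from the test suite.

```typescript
interface InvOp {
  op_id: number;
  checksum: number;
  data: string | null;
}

interface InvDoc {
  _id: { b: string; o: number };
  min_op: number;
  count: number;
  checksum: number;
  size: number;
  ops: InvOp[];
}

// Assumption: checksums combine by 32-bit wrapping addition.
function addChecksums(a: number, b: number): number {
  return (a + b) & 0xffffffff;
}

// Returns a list of violated invariants for one document (empty when consistent).
function documentViolations(doc: InvDoc): string[] {
  const problems: string[] = [];
  const ids = doc.ops.map((op) => op.op_id);
  if (ids.some((id, i) => i > 0 && id <= ids[i - 1])) problems.push('ops[] is not strictly ascending');
  if (doc._id.o !== Math.max(...ids)) problems.push('_id.o is not the max op_id');
  if (doc.min_op !== Math.min(...ids)) problems.push('min_op is not the min op_id');
  if (doc.count !== doc.ops.length) problems.push('count does not match ops.length');
  const checksum = doc.ops.reduce((sum, op) => addChecksums(sum, op.checksum), 0);
  if (doc.checksum !== checksum) problems.push('checksum does not match sum of op checksums');
  const size = doc.ops.reduce((sum, op) => sum + (op.data?.length ?? 0), 0);
  if (doc.size !== size) problems.push('size does not match sum of data lengths');
  return problems;
}

// True when any two documents of one bucket cover overlapping op_id ranges.
function rangesOverlap(docs: InvDoc[]): boolean {
  const sorted = [...docs].sort((a, b) => a._id.o - b._id.o);
  return sorted.some((doc, i) => i > 0 && doc.min_op <= sorted[i - 1]._id.o);
}
```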

Breaking Changes

MongoDB storage: Existing deployments using the previous single-operation-per-document format are not compatible with this change. This requires a fresh deployment or manual migration (not provided).

V1 storage is unaffected.

Test Results

All existing parameterized tests continue to pass. New edge-case tests pass with no regressions.

# module-mongodb-storage
pnpm --filter='./modules/module-mongodb-storage' test -- --run
→ all pass

Key Files Changed

Detailed description per file

Files Changed

.changeset/

  • wild-pears-sing.md — Breaking changeset for service-core and module-mongodb-storage for the chunked multi-op document format.

modules/module-mongodb-storage/src/storage/

  • MongoBucketStorage.ts — Factory and lifecycle methods for V3 storage, providing direct instantiation without version dispatch.
  • storage-index.ts — Re-exports updated for new shared modules (common/models.ts, bucket-operations/*) and consolidated type names.

modules/module-mongodb-storage/src/storage/implementation/

Core storage layer. The abstract base class and shared infrastructure live here; V1 and V3 specifics are in their respective subdirectories.

  • AbstractMongoSyncBucketStorage.ts — Abstract base class with shared storage logic. V3 implements this directly without callback indirection.
  • createMongoSyncBucketStorage.ts — Factory that instantiates MongoSyncBucketStorageV3 for V3 storage.
  • db.ts — versioned() factory returning the appropriate VersionedPowerSyncMongo per storage version.
  • MongoBucketBatch.ts — Thin base class with the common batch interface and fields. Write-path logic lives in V1 and V3 subclasses.
  • MongoChecksums.ts — Shared checksum infrastructure; imports from common/models.ts.
  • MongoCompactor.ts — Shared compaction base with range-merging scaffolding. Sets target_op during MOVE and CLEAR phases.
  • MongoParameterCompactor.ts — Concrete parameter compactor with default collectionFilter() and deleteFilter() implementations. Used directly by both V1 and V3.
  • MongoPersistedSyncRulesContent.ts — Sync rules persistence using shared VersionedPowerSyncMongo collection accessors.
  • models.ts — Top-level implementation models. Shared types moved to common/models.ts and document-formats/bucket-document-format.ts.

modules/module-mongodb-storage/src/storage/implementation/bucket-operations/

Shared helpers extracted from the write path, compaction pipeline, and read path. All new files.

  • batch-write.ts — Write-path helper for flushing bucket data batches, shared by V1 and V3.
  • checksum-aggregation.ts — Document-level checksum aggregation for the compaction pipeline. Uses the pre-computed checksum field on BucketDataDocument for fully-included documents; falls back to iterating ops[] for partially-included ones.
  • chunking.ts — chunkBucketData() groups ops into documents by a 1MB data-size threshold. Single oversized ops get their own chunk. Used by both the write path and compaction rechunking.
  • compaction-scaffolding.ts — Compaction utilities: loading all ops in a bucket, deduplicating by table/row_id (newest-first), and rebuilding survivor documents.
  • query-builders.ts — Query construction helpers for bucket data reads. Builds the (start, checkpoint] range query using min_op for the upper bound to catch documents that straddle the range boundary.
  • source-record-store-impl.ts — Concrete SourceRecordStore implementation using shared collection accessors.

modules/module-mongodb-storage/src/storage/implementation/collection-access/

  • versioned-collections.ts — Shared collection accessor interface and factory for VersionedPowerSyncMongo. Provides typed access to bucket data, source records, parameter indexes, and source tables.

modules/module-mongodb-storage/src/storage/implementation/common/

Shared types and base classes used across V1 and V3.

  • models.ts — Shared model types: CurrentBucket, RecordedLookup, CurrentDataDocument, BucketParameterDocument, SourceTableDocument, BucketStateDocument.
  • PersistedBatchShared.ts — Shared batch persistence logic for flushing bucket data documents via serializeBucketData().
  • BucketDataDoc.ts, PersistedBatch.ts, SingleBucketStore.ts, VersionedPowerSyncMongoBase.ts — Minor import and type updates.

modules/module-mongodb-storage/src/storage/implementation/document-formats/

The chunked multi-op document format.

  • bucket-document-format.ts — Core format definition. BucketDataDocument stores an ops[] array with aggregated metadata (_id.o, min_op, count, checksum, size, target_op). serializeBucketData() groups ops and computes aggregates. buildBucketDataQuery() constructs range queries with min_op upper bound. extractRowsFromDocument() post-filters individual ops within partially overlapping documents.
  • parameter-lookup.ts — Serialization/deserialization for parameter lookup values stored in bucket documents.

modules/module-mongodb-storage/src/storage/implementation/v1/

V1 (single-op document format) is structurally updated to inline shared logic but has no functional changes.

  • MongoBucketBatchV1.ts — Inlined shared write-path logic. Single-op document format preserved.
  • MongoSyncBucketStorageV1.ts — Inlined shared storage operations.
  • MongoCompactorV1.ts, MongoChecksumsV1.ts, MongoParameterCompactorV1.ts, PersistedBatchV1.ts, SingleBucketStoreV1.ts, VersionedPowerSyncMongoV1.ts, models.ts — Import updates and minor refactoring for shared types.

modules/module-mongodb-storage/src/storage/implementation/v3/

Primary V3 implementation using chunked multi-op documents.

  • MongoSyncBucketStorageV3.ts — Main V3 storage class. Implements all operations directly: read path uses buildBucketDataQuery() with min_op upper bound and extractRowsFromDocument() for post-filtering; write path delegates to MongoBucketBatchV3.
  • MongoBucketBatchV3.ts — V3 write-path batch. Handles multi-op document serialization and chunked writes via shared bucket-operations/ helpers.
  • MongoCompactorV3.ts — Range-merging compaction: reads all ops, deduplicates by table/row_id (newest-first), rechunks survivors by 1MB threshold, replaces old documents in a transaction.
  • MongoChecksumsV3.ts — Document-level checksum aggregation. Fully-included documents use the pre-computed checksum field; partially-included documents iterate ops[].
  • PersistedBatchV3.ts — Thin wrapper delegating to shared PersistedBatchShared.
  • SingleBucketStoreV3.ts — Uses shared document format and generator-based load function for iterating ops within multi-op documents.
  • SourceRecordStoreV3.ts — Uses shared SourceRecordStoreImpl.
  • VersionedPowerSyncMongoV3.ts — Extends shared VersionedPowerSyncMongo directly.
  • models.ts — Re-exports from common/models.ts with V3-specific types kept locally.

Deleted:

  • MongoParameterCompactorV3.ts — Consolidated into shared MongoParameterCompactor.
  • MongoParameterLookupV3.ts — Consolidated into shared document-formats/parameter-lookup.ts.

modules/module-mongodb-storage/src/utils/

  • util.ts — Added utility export for shared storage code.

modules/module-mongodb-storage/test/src/

  • storage_compacting.test.ts — 8 compaction boundary tests (deduplication across document boundaries, empty buckets, single survivors, cross-doc row_ids, rechunking) and 22 invariant/edge-case tests (ops[] ordering, range metadata consistency, target_op correctness, non-overlapping ranges, BSON limit safety, serialization fidelity, checksum consistency, maxOpId filtering).
  • storage_sync.test.ts — 13 read filtering boundary tests exercising (start, checkpoint] semantics with pre-inserted documents. Existing V3 tests updated to use shared types.
  • storage.test.ts — Added compressedBucketStorage flag to V3 test config.
  • __snapshots__/storage.test.ts.snap — New snapshots for V3 storage initialization.
  • __snapshots__/storage_sync.test.ts.snap — Expanded snapshots reflecting multi-op document format.

modules/module-postgres-storage/test/src/

  • storage.test.ts — Added compressedBucketStorage: false to test config.
  • storage_sync.test.ts — Added compressedBucketStorage flag to shared test registration.

packages/service-core-tests/src/tests/

  • register-data-storage-data-tests.ts — Shared data storage tests updated with compressedBucketStorage flag for conditional assertions on multi-op vs single-op document shapes.
  • register-sync-tests.ts — Shared sync test registration with compressedBucketStorage flag for document format assertions.

packages/service-core/src/storage/

  • BucketStorageFactory.ts — Added compressedBucketStorage boolean to TestStorageConfig. Controls whether shared tests assert multi-op document shapes.
| Area | Files |
|---|---|
| Factory & routing | createMongoSyncBucketStorage.ts, db.ts |
| MongoDB storage implementation | v3/MongoSyncBucketStorageV3.ts, v3/MongoCompactorV3.ts, v3/MongoChecksumsV3.ts, v3/PersistedBatchV3.ts, v3/MongoBucketBatchV3.ts |
| Shared helpers | bucket-operations/chunking.ts, bucket-operations/batch-write.ts, bucket-operations/checksum-aggregation.ts, bucket-operations/compaction-scaffolding.ts, bucket-operations/query-builders.ts |
| Document format | document-formats/bucket-document-format.ts, document-formats/parameter-lookup.ts |
| Models & types | common/models.ts, common/BucketDataDoc.ts |
| Base classes | AbstractMongoSyncBucketStorage.ts, MongoSyncBucketStorage.ts |
| Tests | test/src/storage_sync.test.ts, test/src/storage_compacting.test.ts |
| Changeset | .changeset/wild-pears-sing.md |

Follow-up Work

  • Benchmark and tune 1MB chunk threshold under production workloads
  • Extract 1MB magic number to shared constant if tuning proves necessary
  • Monitor for edge cases not covered by the test matrix

Renames all class, function, type, and collection accessor names in
the duplicated v5 storage implementation from V3→V5:
- MongoBucketBatchV3 → MongoBucketBatchV5
- MongoChecksumsV3 → MongoChecksumsV5
- MongoCompactorV3 → MongoCompactorV5
- MongoParameterCompactorV3 → MongoParameterCompactorV5
- MongoParameterLookupV3 → MongoParameterLookupV5
- MongoSyncBucketStorageV3 → MongoSyncBucketStorageV5
- PersistedBatchV3 → PersistedBatchV5
- SingleBucketStoreV3 → SingleBucketStoreV5
- SourceRecordStoreV3 → SourceRecordStoreV5
- VersionedPowerSyncMongoV3 → VersionedPowerSyncMongoV5

Also adds compressedBucketStorage to StorageConfig and wires up
MongoSyncBucketStorageV5 selection in createMongoSyncBucketStorage.

This is a pure mechanical rename with no behavior changes.

Change BucketDataDocumentV5 to store arrays of operations per document:
- Add BucketOperationV5 interface with per-op fields including op_id
- Add aggregated fields: min_op, checksum, count, size
- Implement serializeBucketDataV5() to group ops and compute aggregates
- Implement loadBucketDataDocumentV5() as generator yielding from ops array

Add chunking logic in PersistedBatchV5.flushBucketData():
- Group operations by bucket then chunk by 1MB size threshold
- Single-op chunks remain valid for backward compatibility

Update read path in MongoSyncBucketStorageV5 to iterate merged docs.
Update SingleBucketStoreV5 for new generator-based load function.

Overrides compactSingleBucket in MongoCompactorV5 to handle the
compressed bucket storage model:

1. Reads all documents in a bucket sorted by _id.o ascending
2. Loads all ops via loadBucketDataDocumentV5()
3. Filters superseded operations using the same row_id tracking
   logic as v3 (newest-to-oldest pass, keeps only latest PUT/REMOVE
   per row)
4. Re-chunks surviving ops by 1MB data-size threshold
5. Replaces old documents with new chunked docs in a transaction
6. Updates bucket_state with recomputed checksums, counts, and bytes

Unlike v3, v5 does not create MOVE/CLEAR ops during compaction.
Instead, superseded ops are dropped and surviving ops are fully
restructured into new documents.

…egation and activate v5 in test matrix

- Override MongoChecksumsV5.computePartialChecksumsForCollection to use
document-level checksum field instead of expanding ops arrays
- Handle partial ranges correctly by filtering ops when start > min_op
- Fix getBucketDataBatchV5 to respect op-level limits instead of document limits
- Update PowerSyncMongo.versioned to create VersionedPowerSyncMongoV5 for v5
- Add STORAGE_VERSION_5 to SUPPORTED_STORAGE_VERSIONS and STORAGE_VERSION_CONFIG
- Update getMongoStorageConfig to enable compressedBucketStorage for v5
- Fix v3-specific tests to only run on storageVersion == 3
changeset-bot Bot commented Apr 29, 2026

🦋 Changeset detected

Latest commit: 03d8bd6

The changes in this PR will be included in the next version bump.


@Sleepful Sleepful force-pushed the compressed-bucket-storage branch from f4f82ee to b4d71e3 Compare April 29, 2026 06:02
@Sleepful Sleepful force-pushed the compressed-bucket-storage branch from b4d71e3 to 755fad1 Compare April 29, 2026 06:08
Sleepful added 18 commits May 6, 2026 02:56
…tractMongoSyncBucketStorage and MongoSyncBucketStorageBase → MongoSyncBucketStorage

…ter to MongoParameterCompactor base class

Make collectionFilter() and deleteFilter() concrete in the base class
with the V3/V5 implementation (returns {} and {lookup, _id, key}
respectively). Remove the abstract keyword from the base class.

Delete the now-redundant V3 and V5 parameter compactor subclasses:
- v3/MongoParameterCompactorV3.ts
- v5/MongoParameterCompactorV5.ts

Update MongoSyncBucketStorageV3 and V5 to instantiate MongoParameterCompactor
directly, passing the collection lister callback inline.

…acks interface to separate file

- Create common/MongoSyncBucketStorageCallbacks.ts with the full interface
- Replace inline MongoSyncBucketStorageBaseCallbacks in MongoSyncBucketStorageBase.ts
- Type _versionCallbacks as MongoSyncBucketStorageCallbacks in AbstractMongoSyncBucketStorage
- Update v3 and v5 implementations to import from the new file
- Use 'any' for createCompactor's storage parameter to avoid circular imports

Move getParameterSetsShared, getBucketDataBatchSharedWrapper,
getDataBucketChangesShared, and getParameterBucketChangesShared from
bucket-operations/storage-operations.ts into MongoSyncBucketStorageBase as
private method implementations. Eliminate the context object pattern by
accessing this.callbacks and this.group_id directly. Flatten the
getBucketDataBatchShared -> getBucketDataBatchSharedWrapper chain into a
single getBucketDataBatchImpl method. Delete the now-unused
bucket-operations/storage-operations.ts.

Extract identical types from v3/models.ts and v5/models.ts into a shared
common/models.ts without version suffixes:
- CurrentBucket
- RecordedLookup
- CurrentDataDocument
- BucketParameterDocument
- SourceTableDocument
- BucketStateDocument
- taggedBucketParameterDocumentToTagged

Update v3/models.ts and v5/models.ts to re-export from common/models.ts,
keeping only version-specific exports (BucketDataDocumentV3/V5, etc.).

Update all imports across the codebase to use non-suffixed names from
common/models.ts or version-specific names where appropriate.

Update storage-index.ts to use explicit exports to avoid naming conflicts
with v1/models.ts and models.ts.
@Sleepful Sleepful force-pushed the compressed-bucket-storage branch from 2fa58c2 to 037a81e Compare May 16, 2026 05:09
Comment thread .changeset/wild-pears-sing.md Outdated
@Sleepful Sleepful changed the title feat(mongodb-storage): implement v5 compressed bucket storage (Phase 1) feat(mongodb-storage)!: merge V5 compressed bucket storage into V3 with Phase 1.5 invariant tests May 19, 2026
@Sleepful Sleepful changed the title feat(mongodb-storage)!: merge V5 compressed bucket storage into V3 with Phase 1.5 invariant tests feat(mongodb-storage)!: chunked multi-op bucket documents with range-merging compaction and invariant tests May 19, 2026
Sleepful added 4 commits May 18, 2026 21:15
- Add missing exports (SyncRuleConfigStateV3) to storage-index.ts
- Fix db.ts versioned() to return VersionedPowerSyncMongoV3 for V3
- Fix MongoBucketBatch._db visibility (private -> protected) for subclass access
- Fix SourceRecordStoreV3.ts to use shared serializeParameterLookup
- Fix test file: use VersionedPowerSyncMongoV3, update method names (listSourceRecordCollectionsV3, parameterIndexV3)
- Fix implicit any parameters in MongoPersistedSyncRulesContent and MongoBucketStorage
- Make VersionedPowerSyncMongoV3 extend VersionedPowerSyncMongo for shared generic methods
- Remove unused VersionedPowerSyncMongoClass import from db.ts

All 343 module-mongodb-storage tests passing.
Contributor

@rkistner rkistner left a comment

Leaving some initial comments, mostly around compacting for now.

I'm still reviewing the other query changes.

Comment on lines +125 to +129
const docs = await this.db
.bucketData(this.group_id, resolvedDefinitionId)
.find({ '_id.b': bucket })
.sort({ '_id.o': 1 })
.toArray();
Contributor

A single bucket can be way too big to fit into memory: 1-10GB buckets are normal, potentially many millions of operations. Can you keep the "chunking" behavior used previously in queries? I.e. read one batch at a time, and write a chunk as soon as it reaches the threshold. I'd also recommend not re-arranging existing documents too much unless the gains are significant enough. Specifically:

  1. Merging multiple small documents into one bigger one always makes sense, as long as it stays below the size thresholds.
  2. Generally avoid splitting up existing documents. For example, say you have documents of (100kb, 1mb, 100kb): In theory you can turn this into (1mb, 200kb), but that reshuffling may not be worth it.

On the other hand, if there are many individual operations being compacted (turned into MOVE operations), you're re-writing the document anyway, so it might make sense to split it.
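
For illustration, the batch-at-a-time approach suggested above could look roughly like the sketch below. Everything here is hypothetical: `readBatch` and `writeChunk` stand in for the database reads and writes, and a real implementation would be asynchronous. The point is only that memory stays bounded by one input batch plus one pending chunk.

```typescript
interface StreamOp {
  op_id: number;
  data: string | null;
}

function streamRechunk(
  readBatch: () => StreamOp[] | null, // hypothetical: returns null once the bucket is exhausted
  writeChunk: (chunk: StreamOp[]) => void, // hypothetical: persists one output document
  threshold: number
): void {
  let pending: StreamOp[] = [];
  let pendingSize = 0;
  for (let batch = readBatch(); batch !== null; batch = readBatch()) {
    for (const op of batch) {
      const size = op.data?.length ?? 0;
      // Flush as soon as the pending chunk would exceed the threshold.
      if (pending.length > 0 && pendingSize + size > threshold) {
        writeChunk(pending);
        pending = [];
        pendingSize = 0;
      }
      pending.push(op);
      pendingSize += size;
    }
  }
  // Flush whatever is left after the last batch.
  if (pending.length > 0) writeChunk(pending);
}
```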

Collaborator Author

True! Thanks for checking this, it's the kind of topic that the tests can't cover. I'm working on the streaming rearchitecture: batched reads with byte-based caps and scoped deletes. Will push for review soon.

try {
await session.withTransaction(
async () => {
await bucketContext.collection.deleteMany({ '_id.b': bucket }, { session });
Contributor

There could have been new documents added while compacting, which this would remove. At minimum, this should filter by _id.o. But this section will most likely have to be rewritten to handle writing smaller parts at a time anyway (see comment earlier in this file).

Also note that transactions are great to ensure all these writes happen atomically, but try to limit the amount of work performed in a single transaction. This also relates to the earlier comment.
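
A minimal sketch of the scoped delete suggested above, assuming the compactor knows the highest op_id it has read. `scopedDeleteFilter()` and `wouldDelete()` are hypothetical helpers; the second is an in-memory stand-in for how the filter would be evaluated, so the example runs without a database.

```typescript
interface DocId {
  b: string;
  o: number;
}

// Filter shape for a delete; the dotted keys follow the '_id.b' style used in the diff.
function scopedDeleteFilter(bucket: string, lastCompactedOp: number) {
  return { '_id.b': bucket, '_id.o': { $lte: lastCompactedOp } };
}

// In-memory evaluation of the filter against a document _id.
function wouldDelete(id: DocId, filter: ReturnType<typeof scopedDeleteFilter>): boolean {
  return id.b === filter['_id.b'] && id.o <= filter['_id.o'].$lte;
}
```

Documents written after the compactor's read (with _id.o above the bound) are left untouched by this filter.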

Comment on lines +156 to +159
// 3. Filter superseded operations using the same row_id logic as v3.
// We iterate newest-to-oldest and keep only the latest PUT/REMOVE per row.
const seen = new Map<string, bigint>();
const surviving = new Array<BucketDataDoc | null>(allOps.length);
Contributor

When we "keep" the latest PUT/REMOVE per row, we do still need to keep tombstone MOVE operations around (same op_id, same checksum, but remove the data), to keep the checksums intact.

See /docs/compacting-operations.md for details. It may be slightly outdated, but the concepts should still be relevant.

Collaborator Author

Good catch, implemented in 4810a78

The V3 compactor now converts superseded PUT/REMOVE ops to MOVE tombstones instead of dropping them - same op_id, same checksum, op: 'MOVE', target_op pointing to the newer op, all data/identity fields stripped (data, table, row_id, source_table, source_key).

This preserves checksum integrity. The bucket-level checksum (sum(doc.checksum) across all documents) is invariant across compaction - every op keeps its original checksum, just with data: null on tombstones.
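
The tombstone conversion described here can be sketched as below. The shapes and the `compactWithTombstones()` name are assumptions for illustration, not the code in 4810a78; in particular, pointing target_op at the newest op for the row is one possible reading of "the newer op".

```typescript
type Kind = 'PUT' | 'REMOVE' | 'MOVE' | 'CLEAR';

interface TOp {
  op_id: number;
  op: Kind;
  checksum: number;
  table: string | null;
  row_id: string | null;
  data: string | null;
  target_op: number | null;
}

// Converts superseded PUT/REMOVE ops into MOVE tombstones instead of dropping them.
function compactWithTombstones(ops: TOp[]): TOp[] {
  const newest = new Map<string, number>(); // row key -> op_id of the newest PUT/REMOVE
  const out: TOp[] = new Array(ops.length);
  for (let i = ops.length - 1; i >= 0; i--) {
    const op = ops[i];
    if (op.op !== 'PUT' && op.op !== 'REMOVE') {
      out[i] = op; // MOVE and CLEAR pass through unchanged
      continue;
    }
    const key = JSON.stringify([op.table, op.row_id]);
    const supersededBy = newest.get(key);
    if (supersededBy === undefined) {
      newest.set(key, op.op_id);
      out[i] = op; // latest op for this row survives intact
    } else {
      // Same op_id and checksum; data and identity fields are stripped.
      out[i] = {
        op_id: op.op_id,
        op: 'MOVE',
        checksum: op.checksum,
        table: null,
        row_id: null,
        data: null,
        target_op: supersededBy
      };
    }
  }
  return out;
}
```

Because every op keeps its checksum and position, the sum of checksums over the output equals the sum over the input, and the number of ops does not change.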

A few V3-specific details worth noting:

  • Per-op target_op is not stored. serializeBucketData aggregates target_op to the document level (max across all ops) and strips per-op values. The compactor creates per-op target_op on tombstones during dedup, but serialization collapses them. Only the document-level target_op survives. (same detail mentioned in the other comment)

  • Document boundaries change on every compaction pass. chunkBucketData sizes by data bytes. Tombstones contribute 0 bytes, so they pack densely - multiple tombstones and surviving PUTs may end up in the same rechunked document. Individual document checksums change, but the bucket's total checksum is preserved.

  • Op count never decreases. Ops become tombstones but are never deleted. Storage shrinks because tombstones strip the large JSON payloads (typically ~50 bytes vs kilobytes per PUT), but the ops array stays the same length.

  • No CLEAR pass yet. V1 compactor has a CLEAR optimization that collapses leading MOVE/REMOVE sequences. V3 doesn't implement this yet - follow-up work.

Test coverage added in bf33d43b:

  • Checksum preservation across compaction (single doc + multi-doc)
  • Tombstones have data: null and pack densely after rechunking
  • Tombstones and surviving PUTs co-located in same output document

Collaborator Author

@Sleepful Sleepful May 22, 2026

oh yeah and updated the comment f960baf

row_id: op.row_id,
checksum: op.checksum,
data: op.data,
target_op: null
Contributor

This should probably use target_op from the parent doc?

Collaborator Author

It is not necessary to inherit target_op from the chunked document into individual decoded ops; this value is not used by any code path. The only path where individual ops need a target_op is compaction's dedup pass, where the compactor sets target_op on MOVE tombstones it creates in-flight (pointing each tombstone to the newer op that superseded it). These values are then aggregated up to the document level by serializeBucketData(). Since dedup always recomputes from scratch, it doesn't need accurate per-op target_op from storage, it only needs the document-level aggregate, which is already available on BucketDataDocument.

Contributor

You're right that we don't need it on individual ops, only the aggregate (largest) values per chunk/document-level. But right now I don't see the document-level target_op being used anywhere, or am I missing it?

All I could find is this reference to the row-level target_op, which will always be null:

if (row.target_op != null && (targetOp == null || row.target_op > targetOp)) {
targetOp = row.target_op;
}

Collaborator Author

@rkistner

But right now I don't see the document-level target_op being used anywhere, or am I missing it?

You are correct.

Even when you reviewed earlier, it was always null because I hadn't written the MOVE tombstones you pointed out in comment #3. Now the compactor sets target_op on MOVE tombstones at op-level:

// MongoCompactorV3.ts
surviving[i] = {
  ...op,
  op: 'MOVE',
  target_op: targetOp,
  // ...fields stripped...
};

serializeBucketData aggregates the max across ops in the chunk to document-level, and strips the per-op values from the stored ops[] array. So we end up with a single target_op per stored document, but no code path currently reads it.

Why store it: for the future CLEAR pass. V3 doesn't have CLEAR yet, but when it does, it'll need target_op at the document level -> same as V1's clearBucket() which tracks the max target_op across MOVE/REMOVE ops to set on the resulting CLEAR op.

Comment on lines +125 to +126
// $sort by _id
{ $sort: { _id: 1 } },
Contributor

This $sort doesn't have any effect and can be removed.

In the previous version we used $sort for { $limit: batchLimit }, but that's not being used here anymore.

Comment on lines +13 to +24
return {
  _id: {
    $gt: {
      b: request.bucket,
      o: request.start ?? new bson.MinKey()
    },
    $lte: {
      b: request.bucket,
      o: request.end
    }
  }
};
Contributor

With documents now representing ranges of operations, this filter is risky.

The $gt part is fine - if any operation in the document is > start, then o > start will also be true.

The $lte can filter too aggressively: If request.end falls in the middle of a chunk, it will exclude the chunk. As discussed in the design document, this query can work if we can guarantee that a checkpoint will never fall in the middle of a chunk, and that this filter is only used with checkpoints for request.end. And while that should be true in most cases, it is tricky to provide a hard guarantee when compacting, which can merge chunks.

Some options here:

  1. Expand the query to include the next document. A query in this form could work:
aggregate([
  { $match: { _id: { $gt: { ... }, $lte: { b: request.bucket, o: request.end } } } },
  {
    $unionWith: {
      coll: ...,
      pipeline: [
        // this matches the next document, which may or may not be part of the requested range
        // the downstream filters on individual operations should filter these out if not part of the requested range
        { $match: { _id: { $gt: { b: request.bucket, o: request.end } } } },
        { $sort: { _id: 1 } },
        { $limit: 1 }
      ]
    }
  }
]);
  2. Completely remove the $lte filter here, and fully rely on the downstream filters to filter out the later operations. This is the simplest approach, and may even be the most performant. This relies on the fact that we're always querying for a recent checkpoint, so there should not be massive amounts of additional data queried here.

Collaborator Author

@Sleepful Sleepful May 21, 2026

Edit: I misunderstood your comment initially and replied with regard to MongoSyncBucketStorageV3, but here you are asking about MongoChecksumsV3. I am leaving this comment up because it might be useful, but refer to my next comment for the proper reply.


Keen observation, indeed one of the new edge cases!

Phase 1.5 test case 8 in storage_sync.test.ts is exactly this scenario (d80f55bf):

start=30, checkpoint=40 → expected [40]

Document layout:

Doc A: ops [10, 20, 30]  → _id.o=30, min_op=10
Doc B: ops [40, 50, 60]  → _id.o=60, min_op=40
Doc C: ops [70, 80, 90]  → _id.o=90, min_op=70

Checkpoint 40 falls in the middle of Doc B. The query must catch Doc B even though _id.o=60 > 40. With min_op <= 40, Doc B matches (min_op=40 <= 40), then extractRowsFromDocument filters to just [40].

This test case originally failed, and it was fixed in a follow-up commit (037a81e7).

This commit changed _id.o <= checkpoint to min_op <= checkpoint in bucket-document-format.ts.

The invariant min_op <= _id.o holds after compaction too (compaction rechunks via serializeBucketData() in bucket-document-format.ts, which always sets min_op to the smallest op), so this fix works regardless of chunk merging.
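
To make the difference between the two upper bounds concrete, here is a small in-memory sketch using the document layout above. It is not the real query code: `ChunkDoc`, `matchByLastOp`, `matchByMinOp`, and `readOps` are hypothetical names, op IDs are plain numbers, and the per-op filter only stands in for what extractRowsFromDocument does.

```typescript
interface ChunkDoc {
  _id: { b: string; o: number }; // o = largest op_id in the chunk
  min_op: number; // smallest op_id in the chunk
  ops: number[]; // op_ids only; payloads omitted in this sketch
}

const sketchDocs: ChunkDoc[] = [
  { _id: { b: 'bucket', o: 30 }, min_op: 10, ops: [10, 20, 30] }, // Doc A
  { _id: { b: 'bucket', o: 60 }, min_op: 40, ops: [40, 50, 60] }, // Doc B
  { _id: { b: 'bucket', o: 90 }, min_op: 70, ops: [70, 80, 90] } // Doc C
];

type DocMatcher = (doc: ChunkDoc, start: number, end: number) => boolean;

// Old upper bound: drops any chunk whose last op is past the checkpoint.
const matchByLastOp: DocMatcher = (doc, start, end) => doc._id.o > start && doc._id.o <= end;

// New upper bound: keeps any chunk whose first op is at or before the checkpoint.
const matchByMinOp: DocMatcher = (doc, start, end) => doc._id.o > start && doc.min_op <= end;

function readOps(docs: ChunkDoc[], start: number, end: number, match: DocMatcher): number[] {
  const result: number[] = [];
  for (const doc of docs) {
    if (!match(doc, start, end)) continue;
    // Per-op filtering, standing in for extractRowsFromDocument.
    for (const o of doc.ops) {
      if (o > start && o <= end) result.push(o);
    }
  }
  return result;
}
```

With start=30 and checkpoint=40, `matchByLastOp` selects no document at all, while `matchByMinOp` selects Doc B and the per-op filter narrows it to [40].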

Does this solve the issue for you?

Collaborator Author

@Sleepful Sleepful May 23, 2026

You are right, fixed:

Red test, proves the bug is real: 6d3f47a

Fix: 03d8bd6

Comment on lines +427 to +453
const filters = Array.from(bucketMap.entries()).map(([bucket, start]) => ({
  '_id.b': bucket,
  '_id.o': { $gt: start }
  // MongoDB Filter<T> doesn't accept dotted field paths like '_id.o' in its type.
})) as unknown as lib_mongo.mongo.Filter<BucketDataDocument>[];

const minStart = Array.from(bucketMap.values()).reduce((min, val) => (val < min ? val : min));

const collection = this.db.bucketData<BucketDataDocument>(this.group_id, definitionId);
const formatAdapter = new BucketDocumentFormatAdapter();
// MongoDB Filter<T> doesn't accept the $or operator in its type.
const filter = { $or: filters } as unknown as lib_mongo.mongo.Filter<BucketDataDocument>;
const context = { replicationStreamId: this.group_id, definitionId };
const startOpId = minStart;
const endOpId = end;
const limit = remainingLimit;

const { filter: rangeFilter, cursorOptions } = formatAdapter.buildBucketDataQuery({
  startOpId,
  endOpId,
  remainingLimit: limit
});

const combinedFilter = {
  // MongoDB Filter<T> doesn't accept the $and operator in its type.
  $and: [filter, rangeFilter]
} as unknown as lib_mongo.mongo.Filter<BucketDataDocument>;
Contributor

This builds two related filters and $ands them together, but neither of them can use the _id index efficiently in the current form. I'd recommend using a single filter on _id instead, for example:

const filter = {
  $or: Array.from(bucketMap.entries()).map(([bucket, start]) => ({
    _id: {
      $gt: {
        b: bucket,
        o: start
      },
      $lte: {
        b: bucket,
        o: new MaxKey()
      }
    },
    min_op: { $lte: end } // Not sure whether it is better to have this here, or just filter out app-side
  }))
}

Note: There is a lot of overlap in how we're dealing with the query here versus in checksum calculations. One difference here is that we may query for a large range, but don't return the entire range to a client at a time. But I don't think that affects the query filters significantly.

Collaborator Author

Oh! Keen! This is a new lesson for me on MongoDB query structure. Implemented here, semantically the query grabs the same data:

b231209
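
For reference, the suggested filter shape can be sketched as a standalone builder. This is an illustration under assumptions, not the code in b231209: `buildBucketRangeFilter`, `BucketKey`, and `BucketRangeClause` are hypothetical names, op IDs are plain numbers, and the max-key sentinel is passed in by the caller (in real code it would be something like `new MaxKey()` from the bson package) so the sketch has no driver dependency.

```typescript
interface BucketKey {
  b: string; // bucket name
  o: unknown; // op_id, or a max-key sentinel for the upper bound
}

interface BucketRangeClause {
  _id: { $gt: BucketKey; $lte: BucketKey };
  min_op: { $lte: number };
}

function buildBucketRangeFilter(
  bucketMap: Map<string, number>,
  end: number,
  maxKey: unknown
): { $or: BucketRangeClause[] } {
  const clauses: BucketRangeClause[] = [];
  bucketMap.forEach((start, bucket) => {
    clauses.push({
      // One compound _id range per bucket: everything after `start`,
      // bounded above by the end of the same bucket.
      _id: {
        $gt: { b: bucket, o: start },
        $lte: { b: bucket, o: maxKey }
      },
      // Skip chunks that begin after the requested end.
      min_op: { $lte: end }
    });
  });
  return { $or: clauses };
}
```

Each clause stays within a single bucket's key range, which is what lets the _id index bound the scan per bucket instead of filtering on _id.o across all buckets.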

Sleepful added 2 commits May 21, 2026 12:27
Resolves 2 content conflicts:
- MongoBucketBatch.ts: accept both hooks and listSourceRecordCollections
- AbstractMongoSyncBucketStorage.ts: combine _db/_checksums naming
  with upstream's storageConfig field

Fixes auto-merge oversight:
- MongoBucketBatchV3.ts: sourceTablesV3 -> sourceTables in
  markSnapshotDone()

# Conflicts:
#   modules/module-mongodb-storage/src/storage/implementation/AbstractMongoSyncBucketStorage.ts
#   modules/module-mongodb-storage/src/storage/implementation/MongoBucketBatch.ts
The $sort between $match and $addFields had no effect on the
pipeline result. Subsequent $addFields, $project, and $group stages
are order-independent, and $group destroys ordering anyway. The final
$sort after $group is kept for deterministic output ordering.

Review feedback: rkistner on PR #617, comment #5.
Sleepful added a commit to Sleepful/powersync-service that referenced this pull request May 21, 2026
Two serialization fidelity tests verifying loadBucketDataDocument()
propagates doc.target_op to individual yielded ops. These should
fail until the implementation fix is applied.

Review feedback: rkistner on PR #617, comment #4.
Sleepful added a commit to Sleepful/powersync-service that referenced this pull request May 21, 2026
Change from hardcoded null to doc.target_op ?? null so downstream
consumers receive the document-level target_op value.

Makes the target_op propagation tests pass.

Review feedback: rkistner on PR #617, comment #4.
@Sleepful Sleepful force-pushed the compressed-bucket-storage branch from 9f2c91e to 6c117a7 Compare May 21, 2026 22:12
Sleepful added 3 commits May 21, 2026 18:40
Merge per-bucket filter and range filter into a single $or with
compound _id range per bucket: { _id: { $gt: {b, o: start}, $lte:
{b, o: MaxKey()} }, min_op: { $lte: end } }. This uses the compound
{b, o} index efficiently — scoped to one bucket from the start instead
of a cross-bucket _id.o scan with separate bucket filtering.

Logically equivalent — all 13 Phase 1.5 read filtering tests pass.

Review feedback: rkistner on PR #617, comment #7.
Previously the V3 compactor dropped superseded ops entirely. This broke
checksum integrity — any client synced before compaction would have a
checksum that includes superseded ops, but the server's checksum no
longer included them.

Now superseded PUT/REMOVE ops are converted to MOVE tombstones: same
op_id, same checksum, op type set to MOVE, target_op pointing to the
newer op, data/identity fields stripped. This preserves the checksum
total for all clients.

Updated tests to assert MOVE tombstones instead of dropped ops.
Updated the sync test to expect MOVE ops in the stream for V3
compacted data. Per-op target_op is not stored in V3 (aggregated to
document level by serializeBucketData).

Review feedback: rkistner on PR #617, comment #3.
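
The tombstone conversion in the commit message above can be sketched roughly as follows. This is an illustration, not the V3 compactor: `TombstoneOp`, `toMoveTombstones`, and `sumChecksums` are hypothetical names, `row_id` stands in for whatever identity fields the real ops carry, op IDs are plain numbers, ops are assumed to be sorted by ascending op_id, and checksums are added with plain arithmetic.

```typescript
interface TombstoneOp {
  o: number; // op_id
  op: 'PUT' | 'REMOVE' | 'MOVE';
  checksum: number;
  row_id: string | null; // hypothetical identity field
  data: string | null;
  target_op: number | null;
}

function toMoveTombstones(ops: TombstoneOp[]): TombstoneOp[] {
  // Newest op_id per row; relies on ops being in ascending op_id order.
  const newestByRow = new Map<string, number>();
  for (const op of ops) {
    if (op.row_id != null) newestByRow.set(op.row_id, op.o);
  }
  return ops.map((op): TombstoneOp => {
    const newest = op.row_id != null ? newestByRow.get(op.row_id) : undefined;
    if (op.op === 'MOVE' || newest === undefined || newest === op.o) {
      return op; // not superseded, keep as-is
    }
    // Superseded: same op_id and checksum, data and identity stripped.
    return { o: op.o, op: 'MOVE', checksum: op.checksum, row_id: null, data: null, target_op: newest };
  });
}

function sumChecksums(ops: TombstoneOp[]): number {
  let total = 0;
  for (const op of ops) total += op.checksum;
  return total;
}
```

Because every op keeps its op_id and checksum, the checksum total over the op list is the same before and after the conversion in this sketch.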
Four new tests in 'V3 MOVE tombstone properties' describe block:

1. Checksum preserved across compaction with superseded ops in single doc
   — verifies sum(doc.checksum) before == after when ops become tombstones

2. Checksum preserved across compaction with multiple input documents
   — same invariant across doc boundaries

3. Tombstones have null data and pack densely after rechunking
   — verifies MOVE ops have data:null, surviving PUTs keep data,
     and all ops collapse into a single dense doc since tombstones
     contribute 0 bytes to chunking size

4. Tombstones and survivors end up in same document after rechunking
   — verifies checksum + co-location of MOVE and PUT ops in the same
     output document after rechunking
@Sleepful Sleepful force-pushed the compressed-bucket-storage branch 2 times, most recently from 9f68912 to 5803fdb Compare May 22, 2026 06:28
@Sleepful Sleepful force-pushed the compressed-bucket-storage branch 2 times, most recently from 6a6bcd6 to a5b1580 Compare May 23, 2026 08:00
Sleepful added 2 commits May 23, 2026 02:10
Lower-bound (GREEN):

compacted_state.op_id=30 falls mid-document (min_op=10,
_id.o=60). Pipeline's is_fully_included + $filter correctly sums only
ops above the start boundary.

Upper-bound (RED):

Checkpoint=45 falls between ops in a document with _id.o=60, min_op=40.
createBucketFilter uses _id.o <= 45 which excludes the document entirely,
but the document contains op 40 which should be included in the checksum.

Expected checksum=280 (op 40 only), got 0 (document excluded).
…ries

createBucketFilter used _id.$lte {b, o: end} which excluded multi-op
documents whose _id.o > end but whose min_op <= end. Change upper bound
from _id.o <= end to min_op <= end, and add endpoint filtering to the
checksum aggregation pipeline via bucket_end + $and on $filter
conditions.

buildPartialChecksumPipeline now filters ops by both o > bucket_start
AND o <= bucket_end, matching the lower-bound handling already in place.
is_fully_included updated to check both bounds.
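
A rough in-memory sketch of the two-sided bound check described in this commit message, not the actual aggregation pipeline. `ChecksumDoc` and `partialChecksum` are hypothetical names, op IDs are plain numbers, checksums are added with plain arithmetic, and the exact definition of "fully included" used here (every op in the chunk lies inside the requested range) is an assumption.

```typescript
interface ChecksumOp {
  o: number; // op_id
  checksum: number;
}

interface ChecksumDoc {
  _id: { b: string; o: number }; // o = largest op_id in the chunk
  min_op: number; // smallest op_id in the chunk
  checksum: number; // document-level sum of op checksums
  ops: ChecksumOp[];
}

function partialChecksum(doc: ChecksumDoc, start: number, end: number): number {
  // Fully included: the whole chunk lies in (start, end], so the
  // precomputed document-level checksum can be used directly.
  const isFullyIncluded = doc.min_op > start && doc._id.o <= end;
  if (isFullyIncluded) return doc.checksum;

  // Otherwise sum only the ops inside both bounds.
  let sum = 0;
  for (const op of doc.ops) {
    if (op.o > start && op.o <= end) sum += op.checksum;
  }
  return sum;
}
```

For a document with min_op=40 and _id.o=60, a checkpoint of 45 takes the per-op path and counts only op 40, which matches the behaviour the red test expects.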
@Sleepful Sleepful force-pushed the compressed-bucket-storage branch from a5b1580 to 03d8bd6 Compare May 23, 2026 08:11