CBG-5225: sg-bucket KV range scan implementations #8084
Pull request overview
This pull request adds KV (Key-Value) range scan functionality to Sync Gateway through the sg-bucket abstraction layer, enabling both Couchbase Server (via gocb) and Rosmar to perform efficient range scans over document keys.
Changes:
- Adds `Scan()` method implementation for `Collection` that bridges sg-bucket's `RangeScanStore` interface to gocb's range scan API
- Adds `AsRangeScanStore()` helper function following the existing pattern for feature-checking datastore capabilities
- Implements `RangeScanStore` interface in `LeakyDataStore` to support testing with leaky bucket
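The feature-check pattern described in the list above can be sketched as a plain Go type assertion. This is a minimal illustration, not the code from this PR: the `DataStore` and `RangeScanStore` interfaces here are simplified stand-ins, the `Scan(from, to string)` signature is an assumption (the real sg-bucket method takes scan-type and option arguments), and `memStore` and `plainStore` are hypothetical types that exist only for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// DataStore is a simplified stand-in for the base datastore interface.
type DataStore interface {
	GetName() string
}

// RangeScanStore is a hypothetical, reduced version of the sg-bucket
// interface. The real Scan signature is not shown in this PR page.
type RangeScanStore interface {
	Scan(from, to string) ([]string, error)
}

// AsRangeScanStore reports whether ds supports range scans, returning
// the narrowed interface when it does.
func AsRangeScanStore(ds DataStore) (RangeScanStore, bool) {
	rss, ok := ds.(RangeScanStore)
	return rss, ok
}

// memStore is a hypothetical in-memory store that supports range scans.
type memStore struct {
	name string
	docs map[string]string
}

func (m *memStore) GetName() string { return m.name }

// Scan returns keys in the inclusive range [from, to]. Keys are sorted
// only to make the example output deterministic.
func (m *memStore) Scan(from, to string) ([]string, error) {
	keys := make([]string, 0, len(m.docs))
	for k := range m.docs {
		if k >= from && k <= to {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys, nil
}

// plainStore is a hypothetical store without range scan support.
type plainStore struct{ name string }

func (p plainStore) GetName() string { return p.name }

func main() {
	stores := []DataStore{
		&memStore{name: "scannable", docs: map[string]string{"doc_a": "{}", "doc_b": "{}", "doc_c": "{}"}},
		plainStore{name: "plain"},
	}
	for _, ds := range stores {
		rss, ok := AsRangeScanStore(ds)
		if !ok {
			fmt.Printf("%s: range scan not supported\n", ds.GetName())
			continue
		}
		keys, err := rss.Scan("doc_a", "doc_b")
		if err != nil {
			fmt.Printf("%s: scan failed: %v\n", ds.GetName(), err)
			continue
		}
		fmt.Printf("%s: %v\n", ds.GetName(), keys)
	}
}
```

Callers branch on the boolean rather than on a concrete type, so a wrapper such as a leaky datastore can take part simply by implementing `Scan` and delegating to the store it wraps.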
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| base/collection_rangescan.go | New file implementing the Scan() method for Collection, converting sg-bucket scan types to gocb scan types, and providing an iterator wrapper for scan results |
| base/collection_rangescan_test.go | Comprehensive test coverage for range scan functionality including full range, partial range, IDs-only, prefix scans, empty ranges, and tombstone exclusion |
| base/leaky_datastore.go | Implements RangeScanStore.Scan() by delegating to underlying datastore, adds interface assertion for RangeScanStore |
| base/collection.go | Adds feature flag BucketStoreFeatureRangeScan with version check (7.6+) |
| base/bucket.go | Adds AsRangeScanStore() helper function to check if a datastore supports range scan operations |
ea06bac to 07ed2f8
allDocIDs := []string{"doc_a", "doc_b", "doc_c", "doc_d", "doc_e"}
// CBS range scan may not immediately reflect recent writes (requires persistence).
Is this true? This seems perilous if so. Does this depend on the type of storage used by the backing bucket?
There's no simple scan consistency setting available to guard against this. We could pass in vBucket snapshots to avoid it, but I don't think it's worth it, given that we're not expecting to one-shot the range scan to migrate all data in one go.
It will be an iterative process that repeats until no eligible data remains to move (i.e. it guards against any writes that may be coming in underneath a running range scan).
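The iterative approach described in this reply can be sketched as a loop that repeats scan passes until one pass finds nothing eligible. This is only an illustration under stated assumptions: `kvStore`, `scanPrefix`, `migrateUntilDrained`, the prefix-based eligibility rule, and the `betweenPasses` hook (used here to simulate a write landing during a pass) are all hypothetical and are not part of Sync Gateway or sg-bucket.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// kvStore is a hypothetical in-memory stand-in for a bucket.
type kvStore struct{ docs map[string]string }

// scanPrefix returns the keys that start with prefix, sorted for
// deterministic output. It plays the role of one range scan pass.
func (s *kvStore) scanPrefix(prefix string) []string {
	var keys []string
	for k := range s.docs {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys
}

// migrateUntilDrained moves every key with the given prefix from src to
// dst, repeating scan passes until a pass finds nothing to move or
// maxPasses is reached. betweenPasses is an optional hook that runs
// after each pass which moved data.
func migrateUntilDrained(src, dst *kvStore, prefix string, maxPasses int, betweenPasses func(pass int)) (passes, moved int) {
	for passes < maxPasses {
		keys := src.scanPrefix(prefix)
		passes++
		if len(keys) == 0 {
			break // nothing eligible remains
		}
		for _, k := range keys {
			dst.docs[k] = src.docs[k]
			delete(src.docs, k)
			moved++
		}
		if betweenPasses != nil {
			betweenPasses(passes)
		}
	}
	return passes, moved
}

func main() {
	src := &kvStore{docs: map[string]string{"old_a": "1", "old_b": "2", "keep_c": "3"}}
	dst := &kvStore{docs: map[string]string{}}

	// Simulate a write that the first scan pass did not observe.
	lateWrite := func(pass int) {
		if pass == 1 {
			src.docs["old_late"] = "4"
		}
	}

	passes, moved := migrateUntilDrained(src, dst, "old_", 10, lateWrite)
	fmt.Printf("passes=%d moved=%d remaining=%d\n", passes, moved, len(src.docs))
}
```

The `maxPasses` bound is a design choice for the sketch: it keeps the loop from running indefinitely if writers keep producing eligible documents faster than passes can drain them.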
69ae861 to 639b7a6
Wrap gocb.Collection.Scan() in the SG collection adapter, converting sgbucket types to gocb types with RawJSONTranscoder. Add AsRangeScanStore helper, LeakyDataStore passthrough, and IsSupported for CBS 7.5+. Includes dual-backend test covering full range, partial range, IDsOnly, sampling, empty range, prefix, and tombstone exclusion.
18021a8 to 089bdba
…d around with an explicit maximum term in sgbucket.NewRangeScanForPrefix
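The commit message above is truncated, so the details of the change are not recoverable from this page. As general background only, a prefix scan can be expressed as a range scan whose upper bound is the prefix followed by an explicit maximum term. The sketch below assumes the maximum term is the UTF-8 encoding of U+10FFFF; `maxTerm`, `prefixRange`, and `inRange` are hypothetical names, and nothing here is taken from `sgbucket.NewRangeScanForPrefix` itself.

```go
package main

import "fmt"

// maxTerm is an assumed upper-bound suffix: the UTF-8 encoding of
// U+10FFFF, the highest Unicode code point.
const maxTerm = "\xf4\x8f\xbf\xbf"

// prefixRange converts a key prefix into an inclusive [from, to] range.
func prefixRange(prefix string) (from, to string) {
	return prefix, prefix + maxTerm
}

// inRange reports whether key falls inside the inclusive range, using
// Go's bytewise string comparison.
func inRange(key, from, to string) bool {
	return key >= from && key <= to
}

func main() {
	from, to := prefixRange("doc_")
	for _, key := range []string{"doc", "doc_a", "doc_zzz", "dod"} {
		fmt.Printf("%q in range: %v\n", key, inRange(key, from, to))
	}
}
```

With this bound, keys containing bytes above `0xf4` directly after the prefix would fall outside the range, so the sketch only covers keys that are valid UTF-8.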
gregns1 left a comment
LGTM. I'll let you update dependencies and will re-approve.
6694dab to 54032d9
Note: This was coded primarily using Opus 4.6, with manual guidance, verification, and review.
Adds a test that can do a KV range scan on both Couchbase Server (via gocb) and Rosmar through an sg-bucket interface.
Dependencies
Integration Tests