Add remote compaction format compatibility test to check_format_compatible.sh (#14798)#14798
Add remote compaction format compatibility test to check_format_compatible.sh (#14798)#14798hx235 wants to merge 1 commit into
Conversation
|
@hx235 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D106321150. |
|
| Check | Count |
|---|---|
cert-err58-cpp |
4 |
| Total | 4 |
Details
tools/ldb_cmd.cc (4 warning(s))
tools/ldb_cmd.cc:5469:19: warning: initialization of 'kInputFile' with static storage duration may throw an exception that cannot be caught [cert-err58-cpp]
tools/ldb_cmd.cc:5470:19: warning: initialization of 'kResultFile' with static storage duration may throw an exception that cannot be caught [cert-err58-cpp]
tools/ldb_cmd.cc:5537:51: warning: initialization of 'ARG_JOB_DIR' with static storage duration may throw an exception that cannot be caught [cert-err58-cpp]
tools/ldb_cmd.cc:5617:50: warning: initialization of 'ARG_JOB_DIR' with static storage duration may throw an exception that cannot be caught [cert-err58-cpp]
…tible.sh (facebook#14798) Summary: Add cross-version format compatibility testing for remote compaction to `check_format_compatible.sh`. Two new ldb commands (`remote_compaction_primary`, `remote_compaction_worker`) coordinate via local files: primary's `CompactRange()` writes `input.bin` through `Schedule()` and polls for `result.bin` through `Wait()`; worker polls for `input.bin`, calls `OpenAndCompact()`, writes `result.bin`. For each old ref in `db_forward_with_options_refs` (10.4.fb+), the script tests both directions — current primary + old worker and old primary + current worker — to catch serialization incompatibilities. Old refs lacking the commands are skipped gracefully. Differential Revision: D106321150
Codex Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit c74dd23 ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
Claude Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit c74dd23 SummaryWell-structured PR that adds cross-version remote compaction format compatibility testing via two new ldb commands and shell script integration. The design follows existing High-severity findings (1):
Full review (click to expand)Findings🔴 HIGHH1. Worker silently ignores
|
| Context | Applies? | Safe? | Notes |
|---|---|---|---|
| LD_LIBRARY_PATH for old binary | No | N/A | OLD_LDB="./ldb" runs without env wrapper; only CURRENT_LDB uses env |
| DB lock conflict (primary + worker) | Yes | Yes | OpenAndCompact opens as secondary instance; no lock conflict |
| Stale input.bin from previous run | No | N/A | rm -rf "$test_dir" && mkdir -p "$test_dir" cleans before each test |
| Multi-subcompaction race | No | N/A | Default max_subcompactions=1 with num_levels=2 produces exactly one Schedule/Wait cycle |
| SANITY_CHECK mode | Yes | Yes | Guarded by if [ ! "$SANITY_CHECK" ] around the ldb save and test execution |
| SHORT_TEST mode | Yes | Yes | Runs only first ref in db_forward_with_options_refs; old branch may lack command, gets skipped |
| Cross-filesystem rename in AtomicWriteStringToFile | No | N/A | tmp file and target are in the same directory (job_dir) |
| Word splitting of CURRENT_LDB | Yes | Yes | Path /tmp/rocksdb_cs_compat_${USER:-$$}/current_ldb never contains spaces |
| Both directions tested | Yes | Yes | First loop calls run_cs_compat_test twice: current-primary+old-worker AND old-primary+current-worker |
Positive Observations
- The design elegantly reuses the existing
check_format_compatible.shinfrastructure without requiring changes to the build system or test framework. - Testing both serialization directions (current-to-old and old-to-current) is thorough.
- The
grep -q remote_compaction_primarycheck gracefully skips old branches that lack the commands. - The
env LD_LIBRARY_PATH=...pattern for running the saved current binary is a clean solution for dynamic linking. - The test data generation (4 flushes x 20 keys) is sufficient to produce a real L0->L1 compaction that exercises the core CompactionServiceInput/Result serialization paths.
- The PR includes comprehensive manual testing across multiple scenarios (same-version, cross-version, SANITY_CHECK, SHORT_TEST, CI break-compat rehearsal).
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
…tible.sh (facebook#14798) Summary: Add cross-version format compatibility testing for remote compaction to `check_format_compatible.sh`. Two new ldb commands (`remote_compaction_primary`, `remote_compaction_worker`) coordinate via local files: primary's `CompactRange()` writes `input.bin` through `Schedule()` and polls for `result.bin` through `Wait()`; worker polls for `input.bin`, calls `OpenAndCompact()`, writes `result.bin`. For each old ref in `db_forward_with_options_refs` (10.4.fb+), the script tests both directions — current primary + old worker and old primary + current worker — to catch serialization incompatibilities. Old refs lacking the commands are skipped gracefully. Differential Revision: D106321150
|
…tible.sh (facebook#14798) Summary: Add cross-version format compatibility testing for remote compaction to `check_format_compatible.sh`. Two new ldb commands (`remote_compaction_primary`, `remote_compaction_worker`) coordinate via local files: primary's `CompactRange()` writes `input.bin` through `Schedule()` and polls for `result.bin` through `Wait()`; worker polls for `input.bin`, calls `OpenAndCompact()`, writes `result.bin`. For each old ref in `db_forward_with_options_refs` (10.4.fb+), the script tests both directions — current primary + old worker and old primary + current worker — to catch serialization incompatibilities. Old refs lacking the commands are skipped gracefully. Differential Revision: D106321150
🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit 7b5018d ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit 7b5018d SummaryWell-structured PR that adds cross-version format compatibility testing for remote compaction serialization (CompactionServiceInput/Result). The design is sound -- using actual CompactRange + OpenAndCompact exercises the real serialization path rather than synthetic round-trips, which is the right choice for a format compatibility test. The file-based coordination protocol is simple and effective. High-severity findings (1):
Full review (click to expand)Findings🔴 HIGHH1. Orphaned worker processes on script interruption --
|
| Context | Executes? | Safe? | Notes |
|---|---|---|---|
| SANITY_CHECK mode | No | Yes | Guarded |
| SHORT_TEST mode | Yes (first ref) | Yes | Works correctly |
| Old refs without commands | No | Yes | grep -q skips gracefully |
| Default max_subcompactions=1 | Yes | Yes | Single Schedule() call |
Positive Observations
- Atomic write via
tmp+renameprevents partial reads - Bidirectional testing covers both serialization directions
- Graceful skip for old refs lacking the commands
- Consistent error handling following LDBCommand patterns
num_levels=2cleanly constrains to single compaction job
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
Summary:
Add cross-version format compatibility testing for remote compaction to
check_format_compatible.sh.Two new ldb commands (
remote_compaction_primary,remote_compaction_worker) coordinate via local files: primary'sCompactRange()writesinput.binthroughSchedule()and polls forresult.binthroughWait(); worker polls forinput.bin, callsOpenAndCompact(), writesresult.bin. For each old ref indb_forward_with_options_refs(10.4.fb+), the script tests both directions — current primary + old worker and old primary + current worker — to catch serialization incompatibilities. Old refs lacking the commands are skipped gracefully.Test plan:
1. ldb smoke test (same-version E2E) — PASSED:
2. ldb Local cross-version format compatibility — PASSED (expected failure observed):
3. SANITY_CHECK mode — PASSED:
4. SHORT_TEST mode — PASSED:
5. Format compatibility CI check to catch formability breaking change — PASSED (expected failure observed):
Validates the test catches a real format break end-to-end in github CI. Since no released
branch has the new remote compaction ldb commands yet, we simulate the scenario
with two temporary branches:
cs_old_11_1:upstream/11.1.fb+ cherry-picked remote compaction ldb/testinfrastructure (this diff) (NARROW counts, no
kReadTriggered)rc_format_sh_break_compat: this diff + the temporary break-compat changeremoving the
-1counts hack (WIDE counts, haskReadTriggered) + limitingcheck_format_compatible.shto test onlycs_old_11_1for faster CIThe script builds both versions and runs cross-version remote compaction tests.
The current primary serializes WIDE counts; the
cs_old_11_1worker produces aNARROW
CompactionServiceResult. When the current primary tries to deserializethe NARROW result, it fails.
Job: https://github.com/facebook/rocksdb/actions/runs/26569039487/job/78271140887
Observed expected failure:
6. Format compatibility CI check smoke test — PASSED:
Job: https://github.com/facebook/rocksdb/actions/runs/26618984530
The new ldb commands in this PR needed for remote compaction compatibility check are not in any of old branches so this compatibility check will be skipped.