
Make file-based SMC storage MPI-safe by guarding save_step on rank 0#50

Draft
Copilot wants to merge 2 commits into develop from copilot/fix-smc-storage-mpi-compatibility


Copilot AI commented Jun 16, 2026


When running with mpirun and a non-parallel kernel (e.g., VectorMCMC), @rank_zero_run_only on SamplerBase._save_step catches an AttributeError (no _rank on the kernel) and falls back to calling storage.save_step() on all ranks simultaneously, causing concurrent HDF5/Pickle writes and crashes. The workaround was an awkward rank-conditional context manager split.
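The failure mode described above can be reproduced in miniature without MPI. The sketch below is a hypothetical reconstruction, not the actual smcpy decorator: the attribute name `_mcmc_kernel`, the stand-in kernel classes, and the `count_writes` helper are all assumptions made for illustration. It only shows why a missing `_rank` makes every process write.

```python
import functools


def rank_zero_run_only(func):
    """Hypothetical reconstruction of the decorator's fallback behavior."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        try:
            rank = self._mcmc_kernel._rank  # assumed attribute names
        except AttributeError:
            # Kernel has no _rank: the fallback runs func on every process.
            return func(self, *args, **kwargs)
        return func(self, *args, **kwargs) if rank == 0 else None
    return wrapper


class SerialKernel:
    """Stand-in for a non-parallel kernel (no _rank attribute)."""


class ParallelKernel:
    """Stand-in for an MPI-aware kernel that knows its rank."""
    def __init__(self, rank):
        self._rank = rank


class Sampler:
    """Stand-in sampler; appends to a shared list instead of a file."""
    def __init__(self, kernel, writes):
        self._mcmc_kernel = kernel
        self._writes = writes

    @rank_zero_run_only
    def _save_step(self, step):
        self._writes.append(step)


def count_writes(make_kernel, n_ranks=4):
    """Simulate n_ranks processes each saving one step; return write count."""
    writes = []
    for rank in range(n_ranks):
        Sampler(make_kernel(rank), writes)._save_step("step-0")
    return len(writes)


print(count_writes(lambda rank: SerialKernel()))  # every simulated rank writes
print(count_writes(ParallelKernel))               # only rank 0 writes
```

With a real file backend, the first case corresponds to several processes writing the same HDF5 or Pickle file at once.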

Changes

  • smcpy/utils/mpi_utils.py: adds is_mpi_rank_zero(), which checks mpi4py.MPI.COMM_WORLD.Get_rank() == 0 and returns True if mpi4py is unavailable. This decouples the rank check from the MCMC kernel's _rank attribute.
  • smcpy/utils/storage.py: HDF5Storage.save_step and PickleStorage.save_step return early when is_mpi_rank_zero() is False. The storage is now independently MPI-safe regardless of kernel type.
  • Tests: coverage for is_mpi_rank_zero() across no-MPI, rank-0, and rank-N cases; parametrized tests confirming save_step is a no-op on non-zero ranks for both file storage backends.
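The changes listed above could look roughly like the following. This is a minimal sketch based only on the description in this PR, not the merged code: `GuardedStorage`, `fake_mpi4py`, and `saved_steps_on` are hypothetical names, and the storage class appends to a list rather than writing a file. The rank is simulated by temporarily swapping the `mpi4py` entry in `sys.modules`, which is one possible way to cover the no-MPI, rank-0, and rank-N cases without launching `mpirun`.

```python
import sys
import types
from unittest import mock


def is_mpi_rank_zero():
    """Return True on MPI rank 0, or when mpi4py cannot be imported."""
    try:
        from mpi4py import MPI
    except ImportError:
        return True
    return MPI.COMM_WORLD.Get_rank() == 0


class GuardedStorage:
    """Hypothetical stand-in for a file-based storage backend."""
    def __init__(self):
        self.saved = []

    def save_step(self, step):
        # Early return on non-zero ranks, independent of the kernel type.
        if not is_mpi_rank_zero():
            return
        self.saved.append(step)


def fake_mpi4py(rank):
    """Build a fake mpi4py module for a rank; None simulates no mpi4py."""
    if rank is None:
        return None  # a None entry in sys.modules makes the import fail
    comm = types.SimpleNamespace(Get_rank=lambda: rank)
    return types.SimpleNamespace(MPI=types.SimpleNamespace(COMM_WORLD=comm))


def saved_steps_on(rank):
    """Count steps stored by a process that believes it is `rank`."""
    storage = GuardedStorage()
    with mock.patch.dict(sys.modules, {"mpi4py": fake_mpi4py(rank)}):
        storage.save_step("step-0")
    return len(storage.saved)


for rank in (None, 0, 3):
    print(rank, saved_steps_on(rank))
```

Placing the guard inside `save_step` rather than in the sampler means the storage backend protects itself even when the caller has no notion of rank.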

With this change, the desired single-context API works with MPI:

# Before: required rank-conditional context split
with HDF5Storage(H5_FILE, "a"):          # load state on all cores
    smc = AdaptiveSampler(kernel)
with HDF5Storage(OUT_FILE, "a") if comm.Get_rank() == 0 else nullcontext():
    steps, mll = smc.sample(N_PARTICLES, N_MCMC_SAMPLES, target_ess=0.9)

# After: single context works across all ranks
with HDF5Storage(H5_FILE, "a"):
    smc = AdaptiveSampler(kernel)
    steps, mll = smc.sample(N_PARTICLES, N_MCMC_SAMPLES, target_ess=0.9)

Copilot AI linked an issue Jun 16, 2026 that may be closed by this pull request
Copilot AI changed the title from "[WIP] Fix SMC storage to be MPI-compatible" to "Make file-based SMC storage MPI-safe by guarding save_step on rank 0" Jun 16, 2026
Copilot AI requested a review from peleser-nasa June 16, 2026 22:05

Development

Successfully merging this pull request may close these issues.

SMC Storage is not MPI-compatible
