This repository and its associated Docker images and binaries are provided for evaluation and testing purposes only.
🚫 Commercial use, redistribution, production deployment, or derivative works are strictly prohibited.
If you are interested in commercial licensing, please contact: diwakarjravi@gmail.com
A Dockerized implementation of Celesta, which builds on DABA by Facebook and provides better load balancing across GPUs.
- NVIDIA GPU with compute capability 6.0 or higher
- NVIDIA Driver version 450.80.02 or higher
- Docker version 19.03 or higher
- NVIDIA Container Toolkit (nvidia-docker2)
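The version minimums above can be checked before installing anything else. The following is a minimal sketch, not part of this repository: `version_ge`, `report`, and `check_prerequisites` are hypothetical helper names, and the comparison assumes a `sort` that supports `-V` (GNU coreutils). Compute capability is not checked here because the relevant `nvidia-smi` query field is only available on newer drivers.

```shell
# Sketch: compare installed versions against the minimums listed above.
# All function names here are hypothetical helpers, not shipped with Celesta.

# Succeeds when version $1 is greater than or equal to version $2.
# Relies on version sort (sort -V) from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Prints a one-line verdict for a single component.
report() {
  local name="$1" found="$2" required="$3"
  if version_ge "$found" "$required"; then
    echo "$name $found: OK (>= $required)"
  else
    echo "$name $found: TOO OLD (need >= $required)"
  fi
}

# Queries Docker and the NVIDIA driver, then reports on each.
check_prerequisites() {
  local docker_version driver_version
  docker_version="$(docker version --format '{{.Server.Version}}')"
  driver_version="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
  report "Docker" "$docker_version" "19.03"
  report "NVIDIA driver" "$driver_version" "450.80.02"
}
```

After sourcing the snippet, run `check_prerequisites` on the host. Nothing is executed until that function is called.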
# Ubuntu/Debian
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker

docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu20.04 nvidia-smi

Load the prebuilt Docker image with the following command:

docker load -i celesta-docker.tar.gz

Place your BAL dataset files in a directory (e.g., ./data):
mkdir -p data
cd data
wget https://grail.cs.washington.edu/projects/bal/data/ladybug/problem-257-65132-pre.txt.bz2
bzip2 -dk problem-257-65132-pre.txt.bz2
cd ..

./run-docker.sh --data-dir ./data \
'mpiexec -n 1 mpi_daba_bal_dataset --dataset /data/problem-257-65132-pre.txt --iters 1000 --loss trivial --accelerated true'

The Docker image includes the following pre-built binaries:
- mpi_daba_bal_dataset: Main DABA algorithm (recommended)
- mpi_admm_bal_dataset: ADMM-based solver
- mpi_dr_bal_dataset: Douglas-Rachford solver
- mem_comm_bal_dataset: Memory/communication profiling version
- ceres_bal_dataset: Ceres baseline (optional)
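Before pointing any of these binaries at a freshly decompressed dataset, it can help to confirm that the file is intact. The sketch below assumes the usual BAL conventions: a file name of the form problem-<cameras>-<points>-pre.txt and a first line containing the camera, point, and observation counts. `check_bal_header` is a hypothetical helper, not something shipped in the image.

```shell
# Sketch: verify that a BAL file's header agrees with the counts in its name.
# Assumes names like problem-<cameras>-<points>-pre.txt and a first line of
# the form "<cameras> <points> <observations>".
check_bal_header() {
  local file="$1" base pattern name_cams name_pts hdr_cams hdr_pts hdr_obs
  base="$(basename "$file")"
  pattern='^problem-\([0-9][0-9]*\)-\([0-9][0-9]*\)-pre\.txt$'

  # Pull the two counts out of the file name.
  name_cams="$(printf '%s\n' "$base" | sed -n "s/$pattern/\1/p")"
  name_pts="$(printf '%s\n' "$base" | sed -n "s/$pattern/\2/p")"
  if [ -z "$name_cams" ]; then
    echo "unrecognized file name: $base" >&2
    return 2
  fi

  # Read the three counts from the first line of the file.
  read -r hdr_cams hdr_pts hdr_obs < "$file" || true

  if [ "$hdr_cams" = "$name_cams" ] && [ "$hdr_pts" = "$name_pts" ]; then
    echo "ok: $hdr_cams cameras, $hdr_pts points, $hdr_obs observations"
  else
    echo "mismatch: name says $name_cams/$name_pts, header says $hdr_cams/$hdr_pts" >&2
    return 1
  fi
}
```

For example, `check_bal_header ./data/problem-257-65132-pre.txt` run on the host would flag a truncated download or an interrupted decompression before any GPU time is spent.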
./run-docker.sh \
'mpiexec -n 1 mpi_daba_bal_dataset --dataset /data/problem-257-65132-pre.txt --iters 1000 --loss trivial'

./run-docker.sh --num-gpus 4 \
'mpiexec -n 4 mpi_daba_bal_dataset --dataset /data/problem-1723-156502-pre.txt --iters 1000 --loss trivial --accelerated true'

./run-docker.sh 'mpi_daba_bal_dataset --help'

docker run -it --rm --gpus all \
-v $(pwd)/data:/data \
celesta:latest \
/bin/bash

docker save celesta:latest | gzip > celesta-docker.tar.gz

On the target system:
# Load the image
docker load < celesta-docker.tar.gz
# Verify it loaded
docker images | grep celesta
# Run it
docker run --rm --gpus all celesta:latest

./run-docker.sh \
'mpiexec -n 2 --bind-to core --map-by socket mpi_daba_bal_dataset --dataset /data/problem.txt --iters 1000'

docker run --rm --gpus all \
-v /path/to/datasets:/data \
-v /path/to/results:/results \
celesta:latest \
'mpiexec -n 1 mpi_daba_bal_dataset --dataset /data/problem.txt --iters 1000 --save true'

# Use only GPU 0 and 1
docker run --rm --gpus '"device=0,1"' \
-v $(pwd)/data:/data \
celesta:latest \
'mpiexec -n 2 mpi_daba_bal_dataset --dataset /data/problem.txt --iters 1000'

Make sure the NVIDIA Container Toolkit is installed and the Docker daemon has been restarted:
sudo systemctl restart docker
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu20.04 nvidia-smi

Ensure the dataset path is relative to the mounted /data directory inside the container:
# If your dataset is at ./my-data/problem.txt on host
./run-docker.sh --data-dir ./my-data \
'mpiexec -n 1 mpi_daba_bal_dataset --dataset /data/problem.txt --iters 1000'

MPI warnings about vader or shared memory are usually harmless in single-node configurations. These are suppressed by default via environment variables.
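For the dataset path issue described above, the translation from a host path to the path seen inside the container can be sketched as follows. This assumes the host directory is mounted at /data, as in the commands in this document. `container_path` is a hypothetical helper name.

```shell
# Sketch: map a host dataset path to its location under the /data mount.
# Usage: container_path <host data dir> <host dataset file>
container_path() {
  local data_dir file_dir host_file rel
  # Resolve both locations to absolute physical paths so prefixes compare reliably.
  data_dir="$(cd "$1" && pwd -P)" || return 2
  file_dir="$(cd "$(dirname "$2")" && pwd -P)" || return 2
  host_file="$file_dir/$(basename "$2")"

  case "$host_file" in
    "$data_dir"/*)
      # Strip the host prefix and re-root the remainder under /data.
      rel=${host_file#"$data_dir"/}
      echo "/data/$rel"
      ;;
    *)
      echo "error: $2 is not inside $1" >&2
      return 1
      ;;
  esac
}
```

With the layout from the example, `container_path ./my-data ./my-data/problem.txt` prints the value to pass to `--dataset`. A file outside the mounted directory is rejected, which is the usual cause of a "file not found" error inside the container.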
The container runs as a non-root user (UID 1000). Ensure your data directory has appropriate permissions:
chmod -R 755 ./data

- Use local SSD for dataset storage to minimize I/O bottleneck
- Pin to cores: Use MPI binding options for better performance
- Monitor GPU usage: Run nvidia-smi dmon in another terminal
- Batch multiple runs: Process multiple datasets in sequence to amortize container startup
- Docker image size: ~2.5 GB
- Runtime memory: Depends on dataset (typically 4-32 GB per GPU)
- Disk space: Ensure sufficient space for datasets and results
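The "batch multiple runs" tip above can be sketched as a loop that processes every dataset in one container session, for example from the interactive /bin/bash shell. `run_all` and `DRY_RUN` are hypothetical names introduced for illustration; the solver flags are copied from the single-GPU example in this document.

```shell
# Sketch: run the solver on every .txt dataset in a directory, one after another.
# Set DRY_RUN=1 to print the commands instead of executing them.
run_all() {
  local data_dir="${1:-/data}" f
  for f in "$data_dir"/*.txt; do
    [ -e "$f" ] || continue   # nothing matched: skip the literal pattern
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "mpiexec -n 1 mpi_daba_bal_dataset --dataset $f --iters 1000 --loss trivial"
    else
      mpiexec -n 1 mpi_daba_bal_dataset --dataset "$f" --iters 1000 --loss trivial
    fi
  done
}
```

Inside the container, `run_all /data` would work through the mounted datasets in glob order. Running with `DRY_RUN=1` first shows which files would be picked up.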
For issues with Celesta algorithms or parameters, reach me at: https://linkedin.com/in/diwakar-ravichandran
For Docker-specific issues, check Docker and NVIDIA Container Toolkit documentation.