SRVL: Sample-Efficient and Robust Visual Reinforcement Learning for Autonomous Spacecraft Proximity Operations
This repository contains the code for SRVL, developed for autonomous spacecraft proximity operations from visual observations. The project focuses on sample-efficient and robust visual reinforcement learning for a free-floating space robot that must approach a target in a simulated on-orbit scenario.
The repository includes:
- the SRVL agent implementation in agents/srvl.py
- a custom visual space robot environment in dmc_spacerobot.py and SpaceRobotEnv
- training entrypoints for the space robot task and other visual RL baselines
- vendored dependencies for MetaWorld and Adroit experiments inherited from the broader codebase
The main space robot setup uses:
- image observations with 84 x 84 rendering
- a free-floating chaser spacecraft with a 6-DoF robotic arm
- a target spacecraft capture task defined in spacerobot_move
- Hydra-based experiment configuration under cfgs
The default SRVL configuration is defined in:
Key files and directories:
- train_dmc_spacerobot.py: training entrypoint for the space robot task
- agents: SRVL, DrM, DrQ-v2, and related agent implementations
- SpaceRobotEnv: MuJoCo XML assets and environment code for the spacecraft proximity operation task
- cfgs: Hydra configs for tasks and agents
- metaworld: vendored MetaWorld dependency
- rrl-dependencies: vendored Adroit / RRL dependencies
Create the conda environment and install the Python dependencies:
conda env create -f conda_env.yml
conda activate drm
pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
For MuJoCo-based tasks, the repository assumes:
- mujoco==2.3.0
- mujoco-py==2.1.2.14
- EGL rendering via MUJOCO_GL=egl
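The rendering backend is typically read from the environment when the rendering module is first imported, so MUJOCO_GL needs to be set before that import. A minimal sketch, assuming you prefer to set it from Python rather than the shell; the helper name select_mujoco_backend is hypothetical and not part of this repository:

```python
import os


def select_mujoco_backend(backend: str = "egl") -> str:
    """Set MUJOCO_GL unless a backend was already chosen in the shell.

    Hypothetical helper for illustration; call it before importing
    any MuJoCo rendering code.
    """
    allowed = {"egl", "osmesa", "glfw"}
    if backend not in allowed:
        raise ValueError(f"unsupported backend {backend!r}; expected one of {sorted(allowed)}")
    # setdefault keeps an explicit `export MUJOCO_GL=...` from the shell intact.
    return os.environ.setdefault("MUJOCO_GL", backend)


active_backend = select_mujoco_backend()
```

Exporting MUJOCO_GL=egl in the shell before launching training has the same effect and needs no code changes.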
On Ubuntu, you will typically also need:
sudo apt update
sudo apt install libosmesa6-dev libegl1-mesa libgl1-mesa-glx libglfw3
If you plan to run the additional MetaWorld or Adroit experiments, install their local dependencies:
cd metaworld
pip install -e .
cd ../rrl-dependencies
pip install -e .
cd mj_envs
pip install -e .
cd ../mjrl
pip install -e .
The main entrypoint for the paper-related task in this repository is:
python train_dmc_spacerobot.py
This uses the Hydra config in cfgs/config_spacerobot.yaml, whose defaults already select:
- task: spacerobot_move
- agent: srvl
- training horizon: 1000000 frames
- frame stack: 3
- action repeat: 2
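To make these defaults concrete, the sketch below works out the stacked observation shape and the number of agent decisions they imply. It assumes RGB frames stored channel-first and stacked along the channel axis, and that the training horizon counts environment frames so each agent action covers action-repeat frames. Both are common conventions in DrQ-v2-style code but are not confirmed here; the function names are hypothetical.

```python
FRAME_STACK = 3               # from the defaults listed above
ACTION_REPEAT = 2             # from the defaults listed above
NUM_TRAIN_FRAMES = 1_000_000  # from the defaults listed above
IMAGE_SIZE = 84               # 84 x 84 rendering


def stacked_obs_shape(frame_stack: int, image_size: int, channels: int = 3) -> tuple:
    """Shape of one observation when frames are stacked along channels.

    Assumption: channel-first RGB frames (channels=3).
    """
    return (frame_stack * channels, image_size, image_size)


def agent_steps(num_frames: int, action_repeat: int) -> int:
    """Number of agent decisions if each action is repeated action_repeat times.

    Assumption: num_frames counts environment frames, not agent steps.
    """
    return num_frames // action_repeat


obs_shape = stacked_obs_shape(FRAME_STACK, IMAGE_SIZE)
decisions = agent_steps(NUM_TRAIN_FRAMES, ACTION_REPEAT)
print(obs_shape, decisions)
```

Under these assumptions the encoder sees a 9-channel 84 x 84 input and the agent makes 500000 decisions over a full run.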
You can override Hydra parameters from the command line. For example:
python train_dmc_spacerobot.py seed=123 use_wandb=false save_video=false
Training outputs are written under:
exp_local/<date>/<time>_<hydra-overrides>/
Depending on configuration, that directory may contain:
- TensorBoard logs
- W&B logs
- saved videos
- replay buffer data
- snapshot.pt
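The output path pattern above can be sketched as follows, which is handy when locating a run from a script. The date and time formats and the comma-joined, sorted override string are assumptions based on common Hydra run-directory settings, not values read from this repository's config; run_dir is a hypothetical helper.

```python
from datetime import datetime
from pathlib import Path


def run_dir(overrides, now, root="exp_local"):
    """Build exp_local/<date>/<time>_<hydra-overrides> for a given launch.

    Assumptions (check against the Hydra config before relying on this):
    - date is formatted as YYYY.MM.DD and time as HHMMSS
    - overrides are sorted and joined with commas
    """
    date_part = now.strftime("%Y.%m.%d")
    time_part = now.strftime("%H%M%S")
    override_part = ",".join(sorted(overrides))
    return Path(root) / date_part / f"{time_part}_{override_part}"


example = run_dir(
    ["seed=123", "use_wandb=false", "save_video=false"],
    datetime(2026, 1, 2, 3, 4, 5),
)
print(example.as_posix())
```

If the actual directory names differ, adjust the two format strings and the separator to match the run-directory pattern in the Hydra config.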
This repository also keeps the broader visual RL codebase used for comparison and ablation:
python train_dmc.py task=dog_walk agent=drm
python train_mw.py task=coffee-push agent=drm_mw
python train_adroit.py task=pen agent=drm_adroit
These scripts are part of the retained upstream training framework and are not the main entrypoints for the spacecraft proximity operation task.
- The current repository tracks code and assets only. Large experiment outputs, checkpoints, W&B logs, and TensorBoard event files are excluded by .gitignore.
- The W&B project name used by the space robot training script is currently dmc_spacerobot.
- The environment XML for the main task is loaded from SpaceRobotEnv/assets/spacerobot/spacerobot.xml.
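Because the task depends on that XML asset, it can help to verify the file before starting a long run. The sketch below uses only the standard library and checks that the file exists and has the `mujoco` root element that MJCF files use. It assumes the script is run from the repository root; check_mjcf is a hypothetical helper, not a function in this codebase.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Path taken from the note above; assumed to be relative to the repository root.
XML_PATH = Path("SpaceRobotEnv/assets/spacerobot/spacerobot.xml")


def check_mjcf(path):
    """Return the root element of an MJCF file, failing early on obvious problems."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"MuJoCo XML not found: {path}")
    root = ET.parse(path).getroot()
    if root.tag != "mujoco":
        raise ValueError(f"{path} does not look like MJCF (root tag is {root.tag!r})")
    return root


if XML_PATH.is_file():
    print("model name:", check_mjcf(XML_PATH).get("model"))
```

This only validates the top-level file. It does not resolve included files or meshes, which MuJoCo itself checks when the model is loaded.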
If you use this repository in your research, please cite:
@article{zhang2026srvl,
title={Sample-Efficient and Robust Visual Reinforcement Learning for Autonomous Spacecraft Proximity Operations},
author={Zhang, Jin and Yao, Yufan and Liu, Xing and Liu, Zhengxiong and Huang, Panfeng},
journal={IEEE Transactions on Aerospace and Electronic Systems},
volume={62},
pages={7520--7532},
year={2026},
doi={10.1109/TAES.2026.3671771}
}
This codebase builds on a broader visual RL framework and includes components derived from prior open-source projects, including:
The current repository also retains vendored task and dependency code for MetaWorld and Adroit/RRL experiments used by the broader training framework.
