ADMPO: Any-step Dynamics Model for Policy Optimization

This repository contains the code for the paper "Any-step Dynamics Model Improves Future Predictions for Online and Offline Reinforcement Learning", published at ICLR 2025.

Requirements

To install all the required dependencies:

Install the MuJoCo engine, which can be downloaded from here.
Install the Python packages listed in requirements.txt with pip install -r requirements.txt. Set the version of mujoco-py in requirements.txt to match the version of the MuJoCo engine you have installed.
Manually download and install the d4rl package from here.
Manually download and install the neorl package from here.
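
After completing the steps above, a quick check that the packages can be found may save a failed run later. The following is a minimal sketch, not part of the repository. It assumes that mujoco-py is imported as mujoco_py and that d4rl and neorl are importable under their package names; the helper name find_missing is hypothetical.

```python
import importlib.util

# Assumed import names: mujoco-py is imported as `mujoco_py`; d4rl and neorl
# are assumed to be importable under their package names.
REQUIRED_MODULES = ("mujoco_py", "d4rl", "neorl")


def find_missing(modules=REQUIRED_MODULES):
    """Return the names in `modules` that cannot be located on the import path."""
    # find_spec only looks a top-level module up; it does not import it.
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

This only checks that each package can be located. It does not verify that mujoco-py matches the installed MuJoCo engine version.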

Run an experiment

Online Setting

python main4online.py --env-name [Env name]

Each config file provides the default settings for one task, and all of them are located in config/. --env-name selects the config file to use; the available online tasks include Hopper-v3, Walker2d-v3, AntTruncatedObs-v3, and HumanoidTruncatedObs-v3. All results are stored in the result folder.

For example, to run ADMPO-ON on Hopper:

python main4online.py --env-name Hopper-v3
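
To launch every online task listed above in sequence, the command can be wrapped in a small driver script. This is a sketch under stated assumptions, not part of the repository: it assumes it is run from the repository root, and the names ONLINE_ENVS, online_command, and run_all are hypothetical. By default it only prints the commands.

```python
import subprocess
import sys

# Environment names listed in this README for the online setting.
ONLINE_ENVS = [
    "Hopper-v3",
    "Walker2d-v3",
    "AntTruncatedObs-v3",
    "HumanoidTruncatedObs-v3",
]


def online_command(env_name):
    """Build the argument list for one online run."""
    if env_name not in ONLINE_ENVS:
        raise ValueError(f"unknown online environment: {env_name}")
    # sys.executable reuses the interpreter that is running this script.
    return [sys.executable, "main4online.py", "--env-name", env_name]


def run_all(dry_run=True):
    """Print one command per environment and, unless dry_run, execute it."""
    for env_name in ONLINE_ENVS:
        cmd = online_command(env_name)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sweep at the first failing run.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Call run_all(dry_run=False) to actually start the runs; they execute one after another, not in parallel.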

Offline Setting

python main4offline.py --env [Env] --env-name [Env name]

Each config file provides the default settings for one task, and all of them are located in config/. --env selects the benchmark, D4RL or NeoRL, and --env-name selects the config file in config/ to use. All results are stored in the result folder.

For example, to run ADMPO-OFF on the hopper-medium-v2 dataset of the D4RL benchmark:

python main4offline.py --env d4rl --env-name hopper-medium-v2
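
Because the offline script takes both a benchmark and a task name, a typo in --env is easy to make. The sketch below builds the command and rejects unknown benchmark values before anything is launched. It is not part of the repository. The value d4rl comes from the example above, while the lowercase spelling neorl is an assumption, and the helper name offline_command is hypothetical.

```python
import shlex
import sys

# "d4rl" appears in the example above; "neorl" is an assumed spelling
# of the value that selects the NeoRL benchmark.
BENCHMARKS = ("d4rl", "neorl")


def offline_command(env, env_name):
    """Build the argument list for one offline run, validating the benchmark."""
    if env not in BENCHMARKS:
        raise ValueError(f"--env must be one of {BENCHMARKS}, got {env!r}")
    return [sys.executable, "main4offline.py", "--env", env, "--env-name", env_name]


if __name__ == "__main__":
    cmd = offline_command("d4rl", "hopper-medium-v2")
    # shlex.join produces a shell-safe string that can be copied into a terminal.
    print(shlex.join(cmd))
```

The task name is passed through unchecked, since the set of config files in config/ is not listed here.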

Citation

If you find this repository useful for your research, please cite:

@inproceedings{admpo,
    author       = {Haoxin Lin and
                  Yu{-}Yan Xu and
                  Yihao Sun and
                  Zhilong Zhang and
                  Yi{-}Chen Li and
                  Chengxing Jia and
                  Junyin Ye and
                  Jiaji Zhang and
                  Yang Yu},
    title        = {Any-step Dynamics Model Improves Future Predictions for Online and Offline Reinforcement Learning},
    booktitle    = {The 13th International Conference on Learning Representations (ICLR'25)},
    year         = {2025},
    address      = {Singapore}
}
