RIDE is an open dataset and benchmark for train delay prediction over the Belgian railway network. It provides a reusable relational data release, model-ready benchmark datasets with shared train/test splits, and a common evaluation protocol for comparing train delay prediction models.
RIDE is organized around four components:
- Silver: a reusable relational dataset over train events, journeys, railway infrastructure, and weather observations.
- Gold: a shared benchmark core with fixed train/test snapshots, prediction instances, target values, and a test evaluation table. Built on this shared core, Gold provides four model-ready datasets for downstream models: tabular, sequential, GNN, and graph-event. The Gold release is available in Lite and Standard tiers.
- Evaluation protocol: using the Gold core, all models are evaluated on the same prediction instances with unified metrics, including MAE, RMSE, and breakdowns by prediction horizon and delay change.
- Benchmark: a comparison, under our evaluation protocol, of non-learning methods (a Translation baseline and a Graph-event model), a statistical learning method (XGBoost), and deep learning models (MLP, LSTM, Transformer, and GNN).
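
To make the evaluation protocol concrete, here is a minimal sketch of computing MAE, RMSE, and a breakdown by prediction horizon over a shared set of prediction instances. It is not the repository's evaluation code: the row keys (`horizon_min`, `target_s`, `pred_s`), the bucket edges, and the function names are all hypothetical, chosen only for illustration.

```python
import bisect
import math
from collections import defaultdict


def mae(errors):
    """Mean absolute error of a non-empty list of signed errors."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Root mean squared error of a non-empty list of signed errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def horizon_bucket(horizon_min, edges):
    """Label the half-open bucket [lo, hi) that contains horizon_min."""
    i = bisect.bisect_right(edges, horizon_min)
    lo = edges[i - 1] if i > 0 else 0
    if i == len(edges):
        return f">={lo} min"
    return f"{lo}-{edges[i]} min"


def summarize(errors):
    return {"mae": mae(errors), "rmse": rmse(errors), "n": len(errors)}


def evaluate(rows, edges=(5, 15, 30)):
    """Overall metrics plus a per-horizon breakdown.

    Each row is a dict with hypothetical keys: horizon_min (minutes ahead),
    target_s (observed delay, seconds), pred_s (predicted delay, seconds).
    The bucket edges are an assumption, not the benchmark's own bins.
    """
    errors_by_bucket = defaultdict(list)
    for row in rows:
        error = row["pred_s"] - row["target_s"]
        bucket = horizon_bucket(row["horizon_min"], edges)
        errors_by_bucket[bucket].append(error)
    all_errors = [e for errs in errors_by_bucket.values() for e in errs]
    return {
        "overall": summarize(all_errors),
        "by_horizon": {b: summarize(errs) for b, errs in errors_by_bucket.items()},
    }


# Toy rows, made up for this example (not RIDE data).
rows = [
    {"horizon_min": 2, "target_s": 60, "pred_s": 30},
    {"horizon_min": 4, "target_s": 0, "pred_s": 30},
    {"horizon_min": 10, "target_s": 120, "pred_s": 120},
    {"horizon_min": 45, "target_s": 300, "pred_s": 240},
]
report = evaluate(rows)
print(report["overall"])
print(report["by_horizon"])
```

Because every model is scored on the same instances, reports produced this way are directly comparable across models; a breakdown by delay change could follow the same grouping pattern with a different bucketing key.
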
| Asset | Description |
|---|---|
| Silver | Reusable relational dataset for downstream dataset construction. |
| Gold Lite | Smaller benchmark tier for fast experimentation. |
| Gold Standard | Full benchmark tier used for the main paper results. |

| Path | Description |
|---|---|
| src/ | Reusable Python code for source downloads, dataset construction, benchmark models, and evaluation utilities. |
| configs/ | Dataset pipeline settings, selected benchmark model configurations, and Optuna search spaces. |
| manifests/ | Executable Bronze and Silver table specifications: sources, outputs, transforms, checks, and field metadata. |
| scripts/ | Command-line entry points for data download/build steps, benchmark training/evaluation, hyperparameter search, and figure generation. |
| docs/ | Task-oriented guides for setup, repository structure, dataset download, extension, and paper reproducibility. |
| notebooks/ | Interactive walkthroughs for inspecting Silver, understanding Gold, and running a benchmark training/evaluation flow. |
Get Started
- Install the project
- Understand the repository structure
- Download the released datasets
- Understand the Silver relational dataset
- Understand the Gold benchmark datasets
- Train a model and evaluate it
Extend RIDE
- Modify the download, Bronze, and Silver pipeline
- Create a new Gold-core benchmark tier
- Create variants of existing model-specific Gold datasets
- Create your own model-specific Gold dataset
- Evaluate your model on the benchmark
Reproduce the Paper
Main test-set results on the Gold Standard tier. MAE and RMSE are in seconds; ± denotes the standard deviation over 10 seeds.
| Model | MAE | RMSE |
|---|---|---|
| Translation | 96.65 | 233.42 |
| Graph-event | 88.41 | 232.48 |
| MLP | 77.20 ± 0.04 | 203.21 ± 0.40 |
| XGBoost | 76.58 ± 0.01 | 203.46 ± 0.02 |
| LSTM | 74.62 ± 0.27 | 202.63 ± 0.77 |
| Transformer | 74.54 ± 0.25 | 195.39 ± 0.59 |
| GNN | 73.62 ± 0.19 | 194.56 ± 0.88 |
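
A "mean ± std" entry of this kind can be produced from per-seed metrics as sketched below. This is an illustration only: the helper name is hypothetical, the input values are made up rather than taken from the paper, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the source does not say which estimator was used.

```python
import statistics


def format_mean_std(per_seed_values, ndigits=2):
    """Format per-seed metric values as 'mean ± std'.

    Uses the sample standard deviation (n - 1 denominator); this is an
    assumption, not a documented choice of the benchmark.
    """
    mean = statistics.fmean(per_seed_values)
    std = statistics.stdev(per_seed_values)
    return f"{mean:.{ndigits}f} ± {std:.{ndigits}f}"


# Toy values, made up for this example (not RIDE results).
toy_mae_per_seed = [10.0, 12.0, 14.0]
cell = format_mean_std(toy_mae_per_seed)
print(cell)
```
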
Preprint: RIDE: An Open Dataset and Benchmark for Train Delay Prediction
@misc{elliker2026rideopendatasetbenchmark,
title={RIDE: An Open Dataset and Benchmark for Train Delay Prediction},
author={Clément Elliker and Mathis Le Bail and Clément Mantoux and Jesse Read and Sonia Vanier},
year={2026},
eprint={2606.05070},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2606.05070},
}

The source code in this repository is released under the MIT license; see LICENSE.
The released RIDE datasets are distributed under CC BY 4.0; see DATA_LICENSE.md. RIDE is derived from Infrabel Open Data (CC0) and Open-Meteo API data (CC BY 4.0).
