DDI-VO: Deep Direct-Indirect Visual Odometry

This is the official repository for the paper "Geometry-aware attention for robust visual odometry". DDI-VO is a hybrid 6-DoF visual odometry architecture that uses a pure Vision Transformer (ViT) backbone to integrate direct (photometric/global) and indirect (feature-based) tracking. By combining globally consistent representations with robust sparse tracking (SuperPoint + LightGlue), the model generalizes well across diverse motion profiles, including autonomous driving, aerial flight, and handheld scenarios.

Installation

You can run DDI-VO either natively in a Python virtual environment or via Docker for a reproducible environment.

Option A: Docker

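We provide a Dockerfile that packages the required system dependencies, CUDA libraries, and Python libraries. The NVIDIA driver itself comes from the host, so the --gpus all flag used when running the container requires the NVIDIA Container Toolkit on the host.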

Clone the repository with submodules:

git clone --recursive https://github.com/larocs/DDI-VO.git
cd DDI-VO

Build the Docker image:

docker build -t ddi-vo .

Run the container (mounting your local dataset folder):

docker run --gpus all -it -v /path/to/your/local/datasets:/workspace/DDI-VO/datasets ddi-vo /bin/bash

Option B: Local Setup (Conda / Venv)

Clone the repository with submodules:

git clone --recursive https://github.com/larocs/DDI-VO.git
cd DDI-VO

(If you already cloned without submodules, run: git submodule update --init --recursive)
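A quick way to confirm that the submodules were actually fetched is to check that their directories are not empty. The sketch below assumes the submodule directories are named TimeSformer, glue-factory, and modvo, as they appear in the repository's file listing, and that it is run from the repository root; the function name is hypothetical.

```python
from pathlib import Path

# Submodule directories as they appear in the repository's file listing
# (an assumption; check .gitmodules if the names have changed).
SUBMODULES = ("TimeSformer", "glue-factory", "modvo")


def missing_submodules(repo_root="."):
    """Return the submodule directories that are absent or empty."""
    root = Path(repo_root)
    missing = []
    for name in SUBMODULES:
        path = root / name
        # An uninitialized submodule shows up as an empty (or absent) directory.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    empty = missing_submodules()
    if empty:
        print("Uninitialized submodules:", ", ".join(empty))
        print("Run: git submodule update --init --recursive")
    else:
        print("All submodules are populated.")
```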

Install dependencies: first install PyTorch and Torchvision builds that match your CUDA version, then run:

pip install -r requirements.txt
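Before training, it can help to confirm which PyTorch build is installed and whether it sees a GPU. The following is a minimal sketch using standard PyTorch calls; the function name is hypothetical and it is not part of the repository.

```python
import importlib


def describe_torch_install():
    """Summarize the installed PyTorch build, or report that it is missing."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return "PyTorch is not installed in this environment."
    # torch.version.cuda is None for CPU-only builds.
    lines = [f"torch {torch.__version__} (built for CUDA: {torch.version.cuda})"]
    if torch.cuda.is_available():
        lines.append(f"GPU visible: {torch.cuda.get_device_name(0)}")
    else:
        lines.append("No CUDA device visible.")
    return "\n".join(lines)


print(describe_torch_install())
```

If the reported CUDA version is None or no device is visible, the installed build probably does not match the local driver setup.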

Dataset Preparation

To use the default dataloaders (kitti.py, queenscamp.py, tartanair.py), your datasets must be strictly organized in the following hierarchy inside the datasets/ directory:

datasets/
├── kitti/
│   ├── sequences/
│   │   ├── 00/
│   │   │   ├── image_2/
│   │   │   └── calib.txt
│   │   └── 01/
│   └── poses/
│       ├── 00.txt
│       └── 01.txt
├── queenscamp/
│   ├── rgb_camera_info.txt
│   └── sequences/
│       ├── 01/
│       │   ├── images/
│       │   └── traj.txt
│       └── 02/
└── tartanair/
    ├── rgb_camera_info.txt
    └── abandonedfactory/
        ├── Easy/
        │   ├── P000/
        │   │   ├── image_left/
        │   │   └── pose_left.txt
        │   └── P001/
        └── Hard/

Model Weights

DDI-VO requires pre-trained weights to run. Download them with the provided script:

chmod +x download_weights.sh
./download_weights.sh

Usage

Training

To train or fine-tune the model on the supported datasets, configure your parameters in configs/train_example.yaml and run:

python train.py checkpoints/ddi_vo_experiment \
    --conf configs/train_example.yaml \
    --use_cuda

Testing / Inference

To run inference and generate trajectory files (traj.txt) for evaluation against ground truth, run:

python test.py \
    --dataset_config configs/ddi_vo.yaml \
    --model_config configs/ddi_vo_model.yaml \
    --model_path checkpoints/ddi_vo_experiment/best_model.tar \
    --output_path results \
    --trajectory_file traj.txt
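As a rough illustration of evaluating against ground truth, the sketch below computes the RMSE of the position error between two trajectories. It assumes both files use the KITTI pose format (one row-major 3x4 [R | t] matrix per line, 12 values); whether test.py writes traj.txt in that format is an assumption, so check the output before relying on it. The function names are hypothetical, and no alignment or scale correction is applied, which dedicated evaluation tools usually perform.

```python
import math


def load_positions(path):
    """Read camera positions (tx, ty, tz) from a KITTI-format pose file."""
    positions = []
    with open(path) as handle:
        for line in handle:
            values = [float(v) for v in line.split()]
            if not values:
                continue  # skip blank lines
            if len(values) != 12:
                raise ValueError(f"expected 12 values per line, got {len(values)}")
            # Row-major 3x4 [R | t]: the translation sits at indices 3, 7, 11.
            positions.append((values[3], values[7], values[11]))
    return positions


def translation_rmse(estimated, ground_truth):
    """RMSE of position error over the frames both trajectories share."""
    n = min(len(estimated), len(ground_truth))
    if n == 0:
        raise ValueError("no poses to compare")
    squared = [
        sum((e - g) ** 2 for e, g in zip(estimated[i], ground_truth[i]))
        for i in range(n)
    ]
    return math.sqrt(sum(squared) / n)


# Example usage (paths are assumptions based on the command above):
# est = load_positions("results/traj.txt")
# gt = load_positions("datasets/kitti/poses/00.txt")
# print(translation_rmse(est, gt))
```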

Citation

If you use this work, please cite our paper:

@article{Bruno2026,
  author  = {Bruno, Hudson and Cabral, Kleber and Givigi, Sidney and Colombini, Esther},
  title   = {Geometry-aware attention for robust visual odometry},
  journal = {SSRN Electronic Journal},
  doi     = {10.2139/ssrn.6732475},
  year    = {2026},
  url     = {https://ssrn.com/abstract=6732475}
}

Other publications

Check out our deep homography estimation visual transformer, which is available here.
