Depth-Estimation

Monocular Depth Estimation Based on Depth Anything

I. Project Introduction

This project performs monocular depth estimation on video frames. It:

1. Includes a video frame extraction module that automatically converts video files (.mp4, .avi, .mov, etc.) into image datasets;

2. Follows the approach of the Depth Anything paper and uses a teacher-student model architecture:

Teacher model: uses Depth Anything V2 to generate pseudo depth labels for unlabeled images, which addresses the difficulty of obtaining ground-truth depth values;

Student model: consists of a backbone and a head. Four backbones are supported: ResNet18, ResNet50, MobileNetv4, and DINOv2. The head is distilled from the teacher model;

3. Supports 4 loss functions, MSE, MAE, Smooth L1, and MiDaS style (scale-and-shift-invariant + gradient matching), and 2 optimizers, SGD and AdamW;

4. Allows centralized configuration of data paths, model parameters, training hyperparameters, etc.;

5. Supports 2 model testing modes:

Automatic testing after training;
Evaluation of pre-trained weights.
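
The MiDaS-style loss in item 3 is scale-and-shift-invariant: the prediction is first aligned to the label by a scale and a shift, and the error is measured only after that alignment. The sketch below illustrates this with a closed-form least-squares alignment on flattened depth values in plain Python. The function names are hypothetical, the gradient-matching term is omitted, and the project's own loss.py may well be implemented differently (for example on tensors).

```python
from statistics import fmean


def align_scale_shift(pred, target):
    """Return (s, t) minimizing sum((s * p + t - d) ** 2) over paired values."""
    mean_p, mean_t = fmean(pred), fmean(target)
    var = sum((p - mean_p) ** 2 for p in pred)
    if var == 0.0:
        # Constant prediction: the scale is undetermined, so only shift to the target mean.
        return 0.0, mean_t
    cov = sum((p - mean_p) * (d - mean_t) for p, d in zip(pred, target))
    s = cov / var
    return s, mean_t - s * mean_p


def ssi_mse(pred, target):
    """Mean squared error after least-squares scale-and-shift alignment."""
    s, t = align_scale_shift(pred, target)
    return fmean((s * p + t - d) ** 2 for p, d in zip(pred, target))
```

Under this sketch, a prediction that differs from the label only by an affine map (for example target = 2 * pred + 5) receives a loss of approximately zero.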

II. Project Structure

|——— Depth-Estimation
|   |——— config
|   |   |——— config.json                 # Configuration center
|   |   |——— load_config.py              # Load configuration file
|   |   |——— train_sets.txt              # Train set video list
|   |   |——— test_sets.txt               # Test set video list
|   |——— dataset
|   |   |——— data_processing.py          # Data preprocessing
|   |   |——— test_data_processing.py     # Data preprocessing validation
|   |——— model
|   |   |——— teacher
|   |   |   |——— checkpoints
|   |   |   |   |——— *.pth               # Teacher model weights
|   |   |   |——— teacher_run.py          # Run teacher model
|   |   |   |——— ...
|   |   |——— backbones                   # Student model backbones
|   |   |   |——— resnet18.py
|   |   |   |——— resnet50.py
|   |   |   |——— mobilenetv4.py
|   |   |   |——— dinov2.py
|   |   |   |——— ...
|   |   |——— model.py                    # Student model
|   |   |——— train.py                    # Model training
|   |   |——— loss.py                     # Loss functions
|   |   |——— utils.py                    # Utility functions
|   |——— input
|   |   |——— videos                      # Video files
|   |   |——— data
|   |   |   |——— ...
|   |   |   |   |——— images              # Video frames
|   |   |   |   |——— depth               # Depth maps
|   |——— output                          # Output data
|   |——— debug                           # Output of data preprocessing validation
|   |——— run.py                          # Run the entire project
|   |——— test.py                         # Test the model manually

III. Instructions

1. Create and activate environment

conda create -n depth-estimation python=3.10
conda activate depth-estimation

2. Install dependencies

pip install -r requirements.txt

3. Configure the project

All project configuration lives in config/config.json. The editable fields are described below (the 含义 key means "meaning" and describes its section as a whole); do not modify any other field:

{
    "seed": "Random seed",

    "data_split": {
        "含义": "Dataset split",
        "train": "Train set video list",
        "test": "Test set video list",
        "train_pairs_path": "Output path for train set image-depth pairs",
        "test_pairs_path": "Output path for test set image-depth pairs"
    },

    "dataloader": {
        "含义": "DataLoader parameters"
    },

    "extract_frame": {
        "含义": "Video frame extraction parameters",
        "video_dir": "Video directory",
        "output_dir": "Video frame output directory",
        "output_subdir": "Video frame output subdirectory",
        "delta": "Extraction interval"
    },

    "teacher_model": {
        "含义": "Teacher model configuration",
        "enable": "Whether to run the teacher model",
        "weight_dir": "Model weight directory",
        "name": "Model name; supports vits (default)/vitb/vitl",
        "output_dir": "Depth map output directory",
        "pseudo_label_subdir": "Depth map output subdirectory",
        "pred_only": "Whether to generate the depth map only",
        "grayscale": "Whether to generate a grayscale image",
        "pair_json_path": "Output path for image-depth pairs"
    },

    "backbone": {
        "name": "Student model backbone name; supports ResNet18 (default)/ResNet50/MobileNetv4/Dinov2"
    },

    "distillation": {
        "temperature": "Model distillation temperature"
    },

    "train": {
        "epochs": "Number of training epochs"
    },

    "loss_and_optimizer": {
        "含义": "Loss function and optimizer",
        "loss": {
            "name": "Loss function type; supports L1 (default)/L2(MSE)/smooth_L1/MiDaS",
            "lambda_val": "Regularization coefficient"
        },
        "optimizer": {
            "name": "Optimizer type; supports AdamW (default)/SGD",
            "learning_rate": "Learning rate",
            "weight_decay": "Weight decay",
            "sgd_momentum": "SGD momentum"
        }
    },

    "test": {
        "含义": "Model test parameters",
        "automatic": "Whether to automatically test the model after training",
        "weight": "Model weight path"
    },

    "debug": {
        "含义": "Data preprocessing validation parameters",
        "enable": "Whether to enable data preprocessing validation",
        "sample_num": "Number of samples to validate",
        "output_dir": "Validation result output directory"
    },

    "output": {
        "含义": "Output configuration",
        "weights_dir": "Model weight output directory",
        "output_train_dir": "Train set output data directory",
        "output_test_dir": "Test set output data directory",
        "pred_only": "Whether to generate the depth map only"
    }
}
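
Because several fields accept only a fixed set of names, it can help to check an edited config.json before starting a long run. The following is a minimal sketch, not the project's load_config.py: the helper names check_config and load_and_check are hypothetical, and the exact spellings the code accepts (for instance "L2" rather than "L2(MSE)") are assumptions taken from the list above.

```python
import json

# Allowed values as listed in this README; the exact spelling and case
# that the project accepts are assumptions.
SUPPORTED = {
    ("teacher_model", "name"): {"vits", "vitb", "vitl"},
    ("backbone", "name"): {"ResNet18", "ResNet50", "MobileNetv4", "Dinov2"},
    ("loss_and_optimizer", "loss", "name"): {"L1", "L2", "smooth_L1", "MiDaS"},
    ("loss_and_optimizer", "optimizer", "name"): {"AdamW", "SGD"},
}


def check_config(cfg):
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    for path, allowed in SUPPORTED.items():
        node = cfg
        for key in path:
            # Walk down the nested dicts; a missing level yields None.
            node = node.get(key) if isinstance(node, dict) else None
        if not isinstance(node, str) or node not in allowed:
            problems.append(f"{'.'.join(path)}={node!r} not in {sorted(allowed)}")
    return problems


def load_and_check(path="config/config.json"):
    """Load the JSON config and raise ValueError if a checked field is unsupported."""
    with open(path, encoding="utf-8") as f:
        cfg = json.load(f)
    problems = check_config(cfg)
    if problems:
        raise ValueError("; ".join(problems))
    return cfg
```

Calling load_and_check() from the repository root would then fail early with a message naming the offending field, rather than partway through training.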

4. Download videos

A Bilibili video downloader is suggested here: 唧唧Down (installing 唧唧1 is sufficient).

Download at: http://client.jijidown.com/
Follow the software's instructions to download Bilibili videos, and place them in the input/videos directory.
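
To confirm that the downloaded files will be picked up, and to get a feel for how many frames an extraction interval keeps, something like the sketch below can be used. It is illustrative only: list_videos and frame_indices are hypothetical helpers, the extension set contains only the types named in this README, and treating delta as "keep every delta-th frame" is an assumption about the extract_frame setting.

```python
from pathlib import Path

# Types named in this README; the "etc." there implies the project may accept more.
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov"}


def list_videos(video_dir="input/videos"):
    """Return video files directly inside video_dir, sorted by path."""
    return sorted(
        p for p in Path(video_dir).iterdir()
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    )


def frame_indices(total_frames, delta):
    """Indices of the frames kept when sampling every delta-th frame from 0."""
    if delta < 1:
        raise ValueError("delta must be a positive integer")
    return list(range(0, total_frames, delta))
```

For example, under this assumption a 10-frame clip with delta = 3 keeps frames 0, 3, 6, and 9.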

5. Download teacher model weights

Teacher model weights can be downloaded from the Depth Anything V2 repository (see the end of this document) and should be placed in the model/teacher/checkpoints directory.

6. Run the project

python run.py

IV. Referenced Papers and Repositories

MiDaS

Depth Anything V1

Depth Anything V2
