Bobby202608/Depth-Estimation

Depth-Estimation

Monocular Depth Estimation Based on Depth Anything

I. Project Introduction

This project performs monocular depth estimation on video frames. It:

1. Includes a video frame extraction module that automatically converts video files (.mp4, .avi, .mov, etc.) into image datasets;

2. Follows the approach of the Depth Anything paper, using a teacher-student model architecture:

  • Teacher model: Uses Depth Anything V2 to generate pseudo depth labels for unlabeled images, which addresses the difficulty of obtaining ground-truth depth values;

  • Student model: Consists of a backbone and a head. The backbone supports 4 types: ResNet18/ResNet50/MobileNetv4/DINOv2. The head is distilled from the teacher model;

3. Supports 4 loss functions: MSE/MAE/Smooth L1/MiDaS style (scale-and-shift-invariant + gradient matching), and 2 optimizers: SGD/AdamW;

4. Allows centralized configuration of data paths, model parameters, training hyperparameters, etc.;

5. Supports 2 model testing modes:

  • Automatic testing after training;
  • Evaluation of pre-trained weights.
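
To make the MiDaS-style option in item 3 more concrete, here is a minimal sketch of a scale-and-shift-invariant error, assuming the common least-squares formulation: a scale and a shift are fitted so that the prediction best matches the pseudo label, and the remaining absolute error is averaged. It is written in plain Python for readability and is not the code in model/loss.py. The function names are hypothetical, and the gradient-matching term is omitted.

```python
from typing import Sequence, Tuple


def align_scale_shift(pred: Sequence[float], target: Sequence[float]) -> Tuple[float, float]:
    """Least-squares scale s and shift t minimizing sum((s * pred + t - target) ** 2)."""
    n = len(pred)
    if n == 0 or n != len(target):
        raise ValueError("pred and target must be non-empty and of equal length")
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    var_p = sum((p - mean_p) ** 2 for p in pred)
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    # A constant prediction carries no scale information; use a shift-only fit.
    scale = cov / var_p if var_p > 0 else 0.0
    shift = mean_t - scale * mean_p
    return scale, shift


def ssi_mae(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error after aligning pred to target in scale and shift."""
    scale, shift = align_scale_shift(pred, target)
    return sum(abs(scale * p + shift - t) for p, t in zip(pred, target)) / len(pred)


if __name__ == "__main__":
    prediction = [1.0, 2.0, 3.0, 4.0]
    pseudo_label = [2.0 * p + 5.0 for p in prediction]  # same depth up to scale and shift
    print(ssi_mae(prediction, pseudo_label))
```

Because the scale and shift are re-fitted for every sample, a prediction that differs from its label only by an affine transform incurs no error, which is the property that makes this kind of loss suitable when the absolute scale of the labels is unknown.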

II. Project Structure

|——— Depth-Estimation
|   |——— config
|   |   |——— config.json                 # Configuration center
|   |   |——— load_config.py              # Load configuration file
|   |   |——— train_sets.txt              # Train set video list
|   |   |——— test_sets.txt               # Test set video list
|   |——— dataset
|   |   |——— data_processing.py          # Data preprocessing
|   |   |——— test_data_processing.py     # Data preprocessing validation
|   |——— model
|   |   |——— teacher
|   |   |   |——— checkpoints
|   |   |   |   |——— *.pth               # Teacher model weights
|   |   |   |——— teacher_run.py          # Run teacher model
|   |   |   |——— ...
|   |   |——— backbones                   # Student model backbones
|   |   |   |——— resnet18.py
|   |   |   |——— resnet50.py
|   |   |   |——— mobilenetv4.py
|   |   |   |——— dinov2.py
|   |   |   |——— ...
|   |   |——— model.py                    # Student model
|   |   |——— train.py                    # Model training
|   |   |——— loss.py                     # Loss functions
|   |   |——— utils.py                    # Utility functions
|   |——— input
|   |   |——— videos                      # Videos
|   |   |——— data
|   |   |   |——— ...
|   |   |   |   |——— images              # Video frames
|   |   |   |   |——— depth               # Depth maps
|   |——— output                          # Output data
|   |——— debug                           # Output of data preprocessing validation
|   |——— run.py                          # Run the entire project
|   |——— test.py                         # Test the model manually

III. Instructions

1. Create and activate the environment

conda create -n depth-estimation python=3.10
conda activate depth-estimation

2. Install dependencies

pip install -r requirements.txt

3. Configure the project

All project configuration lives in config/config.json. The fields that may be edited are described below; do not modify any other fields:

{
    "seed": "Random seed",

    "data_split": {
        "meaning": "Dataset split",
        "train": "Train set video list",
        "test": "Test set video list",
        "train_pairs_path": "Output path for train set image-depth pairs",
        "test_pairs_path": "Output path for test set image-depth pairs"
    },

    "dataloader": {
        "meaning": "DataLoader parameters"
    },

    "extract_frame": {
        "meaning": "Video frame extraction parameters",
        "video_dir": "Video directory",
        "output_dir": "Video frame output directory",
        "output_subdir": "Video frame output subdirectory",
        "delta": "Extraction interval"
    },

    "teacher_model": {
        "meaning": "Teacher model configuration",
        "enable": "Whether to run the teacher model",
        "weight_dir": "Model weight directory",
        "name": "Model name; supports vits (default)/vitb/vitl",
        "output_dir": "Depth map output directory",
        "pseudo_label_subdir": "Depth map output subdirectory",
        "pred_only": "Whether to generate the depth map only",
        "grayscale": "Whether to generate a grayscale image",
        "pair_json_path": "Output path for image-depth pairs"
    },

    "backbone": {
        "name": "Student model backbone name; supports ResNet18 (default)/ResNet50/MobileNetv4/Dinov2"
    },

    "distillation": {
        "temperature": "Model distillation temperature"
    },

    "train": {
        "epochs": "Number of training epochs"
    },

    "loss_and_optimizer": {
        "meaning": "Loss function and optimizer",
        "loss": {
            "name": "Loss function type; supports L1 (default)/L2 (MSE)/smooth_L1/MiDaS",
            "lambda_val": "Regularization coefficient"
        },
        "optimizer": {
            "name": "Optimizer type; supports AdamW (default)/SGD",
            "learning_rate": "Learning rate",
            "weight_decay": "Weight decay",
            "sgd_momentum": "SGD momentum"
        }
    },

    "test": {
        "meaning": "Model test parameters",
        "automatic": "Whether to test the model automatically after training",
        "weight": "Model weight path"
    },

    "debug": {
        "meaning": "Data preprocessing validation parameters",
        "enable": "Whether to enable data preprocessing validation",
        "sample_num": "Number of samples to validate",
        "output_dir": "Validation result output directory"
    },

    "output": {
        "meaning": "Output configuration",
        "weights_dir": "Model weight output directory",
        "output_train_dir": "Train set output data directory",
        "output_test_dir": "Test set output data directory",
        "pred_only": "Whether to generate the depth map only"
    }
}
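
As an illustration of how these fields might be read, the sketch below loads the JSON file and falls back to the defaults documented above when a field is absent. The helper names `load_config` and `selected_components` are hypothetical and are not taken from config/load_config.py, whose actual behavior may differ.

```python
import json
from pathlib import Path
from typing import Any, Dict

# Defaults stated in the field descriptions above.
DOCUMENTED_DEFAULTS = {
    "teacher_model": "vits",
    "backbone": "ResNet18",
    "loss": "L1",
    "optimizer": "AdamW",
}


def load_config(path: str = "config/config.json") -> Dict[str, Any]:
    """Read the JSON configuration file into a dictionary."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def selected_components(config: Dict[str, Any]) -> Dict[str, str]:
    """Return the chosen teacher, backbone, loss and optimizer names."""
    loss_opt = config.get("loss_and_optimizer", {})
    return {
        "teacher_model": config.get("teacher_model", {}).get(
            "name", DOCUMENTED_DEFAULTS["teacher_model"]
        ),
        "backbone": config.get("backbone", {}).get(
            "name", DOCUMENTED_DEFAULTS["backbone"]
        ),
        "loss": loss_opt.get("loss", {}).get("name", DOCUMENTED_DEFAULTS["loss"]),
        "optimizer": loss_opt.get("optimizer", {}).get(
            "name", DOCUMENTED_DEFAULTS["optimizer"]
        ),
    }


if __name__ == "__main__":
    # An in-memory example; call load_config() instead to read the real file.
    example = {
        "backbone": {"name": "ResNet50"},
        "loss_and_optimizer": {"loss": {"name": "MiDaS"}},
    }
    print(selected_components(example))
```

Printing the selection before a long run is a cheap way to confirm that an edit to the file was picked up as intended.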

4. Download videos

  • A Bilibili video downloader is suggested here: 唧唧Down (installing 唧唧1 is sufficient).

  • Download at: http://client.jijidown.com/

  • Follow the software's instructions to download Bilibili videos, and place them in the input/videos directory.
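
Before running the pipeline, it can help to check which files in input/videos would be picked up and how many frames an extraction interval would yield. The sketch below is only an approximation of the project's frame extraction module: it assumes that `delta` means "keep every delta-th frame, starting at frame 0" and that only the three extensions named in the introduction are matched. Both function names are hypothetical, and video decoding itself is left out.

```python
from pathlib import Path
from typing import List

# Extensions named in the project introduction; the real module may accept more.
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov"}


def find_videos(video_dir: str = "input/videos") -> List[Path]:
    """List video files in video_dir, sorted by name; empty if the directory is missing."""
    root = Path(video_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    )


def frames_to_keep(total_frames: int, delta: int) -> List[int]:
    """Indices of the frames kept when sampling every delta-th frame (assumed meaning)."""
    if delta < 1:
        raise ValueError("delta must be a positive integer")
    return list(range(0, total_frames, delta))


if __name__ == "__main__":
    for video in find_videos():
        print(video.name)
    # A 300-frame clip sampled every 30 frames keeps 10 frames.
    print(len(frames_to_keep(300, 30)))
```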

5. Download teacher model weights

Teacher model weights can be downloaded from the Depth Anything V2 repository (listed at the end of this document) and should be placed in the model/teacher/checkpoints directory.
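
A quick check that the weights are where the project expects them can save a failed run. The sketch below assumes the downloaded file keeps the name used in the Depth Anything V2 repository, `depth_anything_v2_<name>.pth`, where `<name>` is the `teacher_model.name` value from the configuration. Whether teacher_run.py looks for exactly this file name is an assumption, and `teacher_weight_path` is a hypothetical helper.

```python
from pathlib import Path

# Teacher variants listed in the configuration section.
SUPPORTED_TEACHERS = ("vits", "vitb", "vitl")


def teacher_weight_path(
    name: str = "vits", weight_dir: str = "model/teacher/checkpoints"
) -> Path:
    """Build the expected checkpoint path for a teacher variant (file name is assumed)."""
    if name not in SUPPORTED_TEACHERS:
        raise ValueError(f"unsupported teacher model: {name!r}")
    return Path(weight_dir) / f"depth_anything_v2_{name}.pth"


if __name__ == "__main__":
    path = teacher_weight_path("vits")
    status = "found" if path.is_file() else "missing"
    print(f"{path}: {status}")
```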

6. Run the project

python run.py

IV. Referenced Papers and Repositories

MiDaS

Depth Anything V1

Depth Anything V2
