
1st Place on KITTI Depth Completion Leaderboard, Official Code of "[CVPR 2025] Distilling Monocular Foundation Model for Fine-grained Depth Completion"


DMD³C: Distilling Monocular Foundation Model for Fine-grained Depth Completion

Official code for the CVPR 2025 paper
"Distilling Monocular Foundation Model for Fine-grained Depth Completion"

📄 Paper on arXiv


🆕 Update Log

  • [2025.04.23] We have released the 2nd-stage training code! 🎉
  • [2025.04.11] We have released the inference code! 🎉

✅ To Do

  • 📦 Easy-to-use data generation pipeline

DMD3C Results

🔍 Overview

DMD³C introduces a novel framework for fine-grained depth completion by distilling knowledge from monocular foundation models. This approach significantly enhances depth estimation accuracy in sparse data, especially in regions without ground-truth supervision.
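Distilling from an affine-invariant monocular model is typically grounded by fitting a per-image scale and shift against the sparse metric depth before supervision. The snippet below is a generic least-squares sketch of that alignment, not code from this repository (`fit_scale_shift` is a hypothetical name):

```python
def fit_scale_shift(pred, target):
    """Least-squares scale s and shift t so that s * pred + t ≈ target.

    pred, target: depths at pixels where valid sparse metric depth exists.
    """
    n = len(pred)
    sum_p = sum(pred)
    sum_t = sum(target)
    sum_pp = sum(p * p for p in pred)
    sum_pt = sum(p * t for p, t in zip(pred, target))
    denom = n * sum_pp - sum_p * sum_p  # assumes non-constant predictions
    s = (n * sum_pt - sum_p * sum_t) / denom
    t = (sum_t - s * sum_p) / n
    return s, t

# Align a relative monocular prediction to metric sparse LiDAR points.
pred = [1.0, 2.0, 3.0]
sparse = [3.0, 5.0, 7.0]  # metric depths equal to 2 * pred + 1
s, t = fit_scale_shift(pred, sparse)
```

The aligned map `s * pred + t` can then supervise regions that the sparse ground truth never covers.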




🚀 Getting Started

1. Clone Base Repository

git clone https://github.com/kakaxi314/BP-Net.git

2. Copy This Repo into the BP-Net Directory

cp -r DMD3C BP-Net/
cd BP-Net/DMD3C/

3. Prepare KITTI Raw Data

Download any sequence from the KITTI Raw dataset, which includes:

  • Camera intrinsics
  • Velodyne point cloud
  • Image sequences

Make sure the structure follows the standard KITTI format.
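A quick way to sanity-check the layout before running inference is to probe for the expected sub-directories. A minimal sketch (the root path and sequence name are placeholders; adjust to your download):

```python
import os

# One KITTI raw sequence in the standard layout; paths are placeholders.
SEQ = "2011_09_26_drive_0001_sync"

expected = [
    "calib_cam_to_cam.txt",                        # camera intrinsics
    os.path.join(SEQ, "velodyne_points", "data"),  # Velodyne point clouds
    os.path.join(SEQ, "image_02", "data"),         # left color images
    os.path.join(SEQ, "image_03", "data"),         # right color images
]

def check_sequence(root):
    """Return the expected entries missing under root (empty list = OK)."""
    return [p for p in expected if not os.path.exists(os.path.join(root, p))]
```

Running `check_sequence("/path/to/kitti/raw/2011_09_26")` should return an empty list for a correctly downloaded sequence.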

4. Modify the Sequence in demo.py for Inference

Open demo.py and go to line 338, where you can modify the input sequence path according to your downloaded KITTI data.

# demo.py (Line 338)
sequence = "/path/to/your/kitti/sequence"

Download pre-trained weights:

wget https://github.com/Sharpiless/DMD3C/releases/download/pretrain-checkpoints/dmd3c_distillation_depth_anything_v2.pth
mkdir -p checkpoints
mv dmd3c_distillation_depth_anything_v2.pth checkpoints/

Run inference:

bash demo.sh

You will get results like this:

[Demo video: inference results on a KITTI sequence]

5. Train on KITTI

Run monocular depth estimation on all KITTI raw images. Expected data structure:

├── datas/kitti/raw/
│   ├── 2011_09_26
│   │   ├── 2011_09_26_drive_0001_sync
│   │   │   ├── image_02
│   │   │   │   ├── data/*.png
│   │   │   │   ├── disp/*.png
│   │   │   ├── image_03
│   │   ├── 2011_09_26_drive_0002_sync
│   │   ├── ...

Disparity images are stored as grayscale images.
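KITTI-style depth and disparity maps are conventionally saved as 16-bit grayscale PNGs with values scaled by 256 and 0 marking invalid pixels; whether this pipeline uses exactly that scale is an assumption here. A sketch of the encoding step:

```python
def depth_to_uint16(depth_m, scale=256.0):
    """Encode metric depths (meters) as KITTI-style 16-bit pixel values.

    Convention assumed here: value = depth * 256, clipped to uint16 range,
    with 0 reserved for invalid (missing) pixels.
    """
    out = []
    for d in depth_m:
        if d <= 0:
            out.append(0)  # invalid pixel
        else:
            out.append(min(int(round(d * scale)), 65535))
    return out
```

For example, a 10.5 m depth encodes to pixel value 2688; saving the resulting array with a PNG writer in 16-bit mode yields the grayscale maps described above.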

Download pre-trained checkpoints:

wget https://github.com/Sharpiless/DMD3C/releases/download/pretrain-checkpoints/pretrained_mixed_singleview_256.pth
mkdir -p checkpoints
mv pretrained_mixed_singleview_256.pth checkpoints/

Zero-shot performance on the KITTI validation set:

| Training Data      | RMSE   | MAE    | iRMSE  | REL    |
| ------------------ | ------ | ------ | ------ | ------ |
| Single-view Images | 1.4251 | 0.3722 | 0.0056 | 0.0235 |
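For reference, these four metrics can be computed as follows (a minimal sketch over flat depth lists; note the official KITTI benchmark reports iRMSE on inverse depth in 1/km, while this sketch uses the raw reciprocal):

```python
import math

def depth_metrics(pred, gt):
    """RMSE, MAE, iRMSE, and REL over pixels with valid ground truth (> 0)."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    n = len(pairs)
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    mae = sum(abs(p - g) for p, g in pairs) / n
    irmse = math.sqrt(sum((1.0 / p - 1.0 / g) ** 2 for p, g in pairs) / n)
    rel = sum(abs(p - g) / g for p, g in pairs) / n
    return rmse, mae, irmse, rel
```

Ground-truth pixels with value 0 are treated as invalid and excluded, matching the sparse-supervision setting described above.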

Run metric-finetuning on KITTI dataset:

torchrun --nproc_per_node=4 --master_port 4321 train_distill.py \
    gpus=[0,1,2,3] num_workers=4 name=DMD3D_BP_KITTI \
    ++chpt=checkpoints/pretrained_mixed_singleview_256.pth \
    net=PMP data=KITTI \
    lr=5e-4 train_batch_size=2 test_batch_size=1 \
    sched/lr=NoiseOneCycleCosMo sched.lr.policy.max_momentum=0.90 \
    nepoch=30 test_epoch=25 ++net.sbn=true 
