Duolikun Danier, Mehmet Aygün, Changjian Li, Hakan Bilen, Oisin Mac Aodha
University of Edinburgh
TL;DR: We developed DepthCues, a benchmark for evaluating human-like monocular depth cues in large vision models, and show that these cues emerge in more recent, larger models.
- Environment Setup
- Downloading Data
- Probing on DepthCues
- Fine-tuning on DepthCues
- Acknowledgements
- BibTeX
To install the conda environment, run
conda env create -f environment.yaml
Note: when cloning this repo, also remember to clone the submodules.
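If you have already cloned the repo without the submodules, they can be fetched with the standard git commands:

# initialise and fetch all submodules in an existing clone
git submodule update --init --recursive

Alternatively, pass --recurse-submodules to git clone when cloning for the first time.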
Download the DepthCues data (including images and annotations) from here, following the instructions provided on that page. The data folder should look like:
<your data dir>/
├── elevation_v1/
│   ├── images/
│   ├── train_data.pkl
│   ├── val_data.pkl
│   └── test_data.pkl
├── lightshadow_v1/
│   ├── images/
│   ├── train_annotations.pkl
│   ├── val_annotations.pkl
│   └── test_annotations.pkl
├── occlusion_v4/
│   ├── images_BSDS/
│   ├── images_COCO/
│   ├── train_data.pkl
│   ├── val_data.pkl
│   └── test_data.pkl
├── perspective_v1/
│   ├── images/
│   └── train_val_test_split.json
├── size_v2/
│   ├── images_indoor/
│   ├── images_outdoor/
│   ├── train_data_indoor.pkl
│   ├── train_data_outdoor.pkl
│   ├── val_data_indoor.pkl
│   ├── val_data_outdoor.pkl
│   ├── test_data_indoor.pkl
│   └── test_data_outdoor.pkl
└── texturegrad_v1/
    ├── images/
    ├── train_data.pkl
    ├── val_data.pkl
    └── test_data.pkl
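As a quick sanity check after downloading, you can list one of the task folders and compare it against the structure above, e.g. for the elevation task:

# should show: images/  train_data.pkl  val_data.pkl  test_data.pkl
ls <your data dir>/elevation_v1/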
Follow the instructions here to download and pre-process the dataset.
Download the dataset from here, and our annotation files from here.
The configs used to probe all 20 vision models on all tasks are provided in configs. Each config specifies
- the task (data, loss, metric, and probe training parameters),
- the vision backbone (the model and its data transform),
- the feature to evaluate (feature type and layer, where layer $\in [0, \text{num layers of backbone})$), and
- the probe model, whose input dimension depends on the backbone and feature type, and whose output dimension depends on the task.
The provided configs correspond to those used to probe the best layer of each model on each task. For example, this config probes layer 11 (the final layer) of DINOv2-b14 on the elevation task, using an attentive probe with a two-dimensional output (corresponding to the two regression targets defining the horizon line).
Note: you need to change the data dirs in the configs.
For most of the models, the backbone classes will download the checkpoints on the first run and load them directly. For DepthAnythingv2, please download the checkpoint (the "base" version) from the official repo and save it to submodules/depth_anything_v2/checkpoints/depth_anything_v2_vitb.pth.
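For example, assuming the checkpoint has been downloaded to the current working directory under its default name, it can be moved into place with:

# create the expected checkpoint directory and move the downloaded file into it
mkdir -p submodules/depth_anything_v2/checkpoints
mv depth_anything_v2_vitb.pth submodules/depth_anything_v2/checkpoints/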
To train the probe in this setting, run
python run_probe.py -b configs/elevation/dinov2-b14.yaml -n <run_name> --gpu_id 0
where run_name is an optional name for this run. This will create a log folder logs/elevation_v1/dinov2-b14/patch_attn/<run_name>/, in which the config will be written and the training results of the probe will be saved. A random seed can be specified via the -s argument. For the paper, we used the random seeds 23, 1, 2, 3, and 4.
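For example, to repeat the training run above with one of these seeds (assuming -s takes a single integer value):

python run_probe.py -b configs/elevation/dinov2-b14.yaml -n <run_name> --gpu_id 0 -s 23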
If you would like to customize the probing config, instead of creating a new config file it is also possible to pass options via command-line arguments, which will override what is specified in the config. For example, to probe the first layer of DINOv2 on elevation using concatenated class and patch tokens, while also specifying a different training data dir, you can run
python run_probe.py -b configs/elevation/dinov2-b14.yaml -n <run_name> --gpu_id 0 --feature_extractor_config.params.layer=0 --feature_extractor_config.params.feat_type=cls-patch --data_config.train.params.data_path=<path/to/elevation_v1>
To test the trained probe, run the following (omit <run_name> from the path if it was not specified above):
python run_probe.py -b logs/elevation_v1/dinov2-b14/patch_attn/<run_name>/config.yaml -t --gpu_id 0
We also provide configs used for the fine-tuning experiments. To fine-tune DINOv2 on DepthCues, run
python run_finetune.py -b configs/finetune/dinov2_lora/all_cues.yaml --gpu_id 0 -n <run_name>
This will create a log folder logs_finetune/finetune/dinov2lora-b14/all_cues/<run_name>/, where the fine-tuning config and model checkpoints will be saved.
The fine-tuned model can then be probed as described above. We also provide the config files used to probe the fine-tuned models on NYUv2 and DIW depth estimation. For example, if you used the provided fine-tuning config to fine-tune DINOv2, then this config can be used directly to probe the fine-tuned model on NYUv2. Note that you may need to update the checkpoint location in the config if you specified <run_name> when fine-tuning.
This repository is adapted from VPT and Probe3D. We thank the authors for open-sourcing their code.
If you find our work helpful, please consider citing:
@inproceedings{danier2024depthcues,
  title     = {DepthCues: Evaluating Monocular Depth Perception in Large Vision Models},
  author    = {Danier, Duolikun and Aygün, Mehmet and Li, Changjian and Bilen, Hakan and Mac Aodha, Oisin},
  booktitle = {CVPR},
  year      = {2025},
}