From Sparse to Dense: Camera Relocalization with Scene-Specific Detector from Feature Gaussian Splatting
Zhiwei Huang1,2
Hailin Yu2†
Yichun Shentu2
Jin Yuan2
Guofeng Zhang1†
1State Key Lab of CAD & CG, Zhejiang University 2SenseTime Research
† Corresponding Authors
CVPR 2025
The code in this repository performs better than the results reported in our paper, thanks to a few adjustments:

- Set `align_corners=False` in interpolation.
- Use a smaller learning rate for outdoor datasets.
- Use the anti-aliasing feature of gsplat.
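To illustrate why the `align_corners` setting matters, here is a minimal pure-Python sketch (ours, not the repository's code) of the coordinate mapping that bilinear interpolation uses in the two modes:

```python
def src_coord(x_dst, in_size, out_size, align_corners):
    """Map a destination pixel index to its source-space coordinate,
    following the convention used by bilinear interpolation."""
    if align_corners:
        # Corner pixels of input and output are aligned exactly.
        return x_dst * (in_size - 1) / (out_size - 1)
    # Pixel centers sit at half-integer positions, which keeps the
    # sampling grid consistent with how the rasterizer places pixels.
    scale = in_size / out_size
    return (x_dst + 0.5) * scale - 0.5

# Upsampling a 4-pixel row to 8 pixels: the two modes sample
# different source locations for the same destination pixel.
print(src_coord(7, 4, 8, align_corners=True))   # 3.0
print(src_coord(7, 4, 8, align_corners=False))  # 3.25
```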
| Method | Chess | Fire | Heads | Office | Pumpkin | RedKitchen | Stairs | Avg.↓ [cm/°] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| STDLoc (paper) | 0.46/0.15 | 0.57/0.24 | 0.45/0.26 | 0.86/0.24 | 0.93/0.21 | 0.63/0.19 | 1.42/0.41 | 0.76/0.24 |
| STDLoc (repo) | 0.43/0.13 | 0.49/0.20 | 0.41/0.24 | 0.72/0.21 | 0.91/0.23 | 0.59/0.14 | 1.19/0.36 | 0.67/0.22 |
| Method | Court | King’s | Hospital | Shop | St. Mary’s | Avg.↓ [cm/°] |
| --- | --- | --- | --- | --- | --- | --- |
| STDLoc (paper) | 15.7/0.06 | 15.0/0.17 | 11.9/0.21 | 3.0/0.13 | 4.7/0.14 | 10.1/0.14 |
| STDLoc (repo) | 11.5/0.06 | 14.7/0.15 | 11.3/0.21 | 2.5/0.12 | 3.5/0.12 | 8.7/0.13 |
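The tables report translation/rotation errors in cm and degrees. As a reference for how such pose errors are typically computed (a generic sketch, not the repository's evaluation code):

```python
import math

def pose_errors(R_pred, t_pred, R_gt, t_gt):
    """Translation error (same unit as t) and rotation error in degrees.
    R_* are 3x3 rotation matrices as nested lists, t_* are 3-vectors."""
    t_err = math.sqrt(sum((p - g) ** 2 for p, g in zip(t_pred, t_gt)))
    # trace(R_pred^T @ R_gt) equals the elementwise dot product of the matrices
    trace = sum(R_pred[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    r_err = math.degrees(math.acos(cos_angle))
    return t_err, r_err

# Identity vs. a 90-degree rotation about z, 5 cm apart (in meters):
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
Rz90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
print(pose_errors(I, [0, 0, 0], Rz90, [0.03, 0.04, 0]))  # (0.05, 90.0)
```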
- Clone this repository:

```shell
git clone --recursive https://github.com/zju3dv/STDLoc.git
```

- Install packages:

```shell
conda create -n stdloc python=3.8 -y
conda activate stdloc
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt
# install submodules
pip install submodules/simple-knn
pip install submodules/gsplat
```
We use two public datasets: 7-Scenes and Cambridge Landmarks.

**7-Scenes**

- Download the images following HLoc:
```shell
export dataset=datasets/7scenes
for scene in chess fire heads office pumpkin redkitchen stairs; \
do wget http://download.microsoft.com/download/2/8/5/28564B23-0828-408F-8631-23B1EFF1DAC8/$scene.zip -P $dataset \
&& unzip $dataset/$scene.zip -d $dataset && unzip $dataset/$scene/'*.zip' -d $dataset/$scene; done
```
- Download the full reconstructions from visloc_pseudo_gt_limitations:

```shell
pip install gdown
gdown 1ATijcGCgK84NKB4Mho4_T-P7x8LSL80m -O $dataset/7scenes_reference_models.zip
unzip $dataset/7scenes_reference_models.zip -d $dataset
# move sfm_gt into each scene's directory
for scene in chess fire heads office pumpkin redkitchen stairs; \
do mkdir -p $dataset/$scene/sparse && cp -r $dataset/7scenes_reference_models/$scene/sfm_gt $dataset/$scene/sparse/0 ; done
```
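After the copy, each scene should contain a COLMAP model under `sparse/0`. A quick sanity check (our own helper, not part of the repo; it assumes the reference models are in COLMAP's binary format):

```python
import os

dataset = "datasets/7scenes"
scenes = ["chess", "fire", "heads", "office", "pumpkin", "redkitchen", "stairs"]
# COLMAP binary model files; switch to cameras.txt etc. if the model is text-based
model_files = ["cameras.bin", "images.bin", "points3D.bin"]

missing = [
    f"{scene}/sparse/0/{name}"
    for scene in scenes
    for name in model_files
    if not os.path.isfile(os.path.join(dataset, scene, "sparse", "0", name))
]
print("all scenes ready" if not missing else f"{len(missing)} files missing")
```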
**Cambridge Landmarks**

- Download the images from PoseNet's project page:

```shell
export dataset=datasets/cambridge
# plain assignments: bash cannot export arrays
scenes=( "KingsCollege" "OldHospital" "StMarysChurch" "ShopFacade" "GreatCourt" )
IDs=( "251342" "251340" "251294" "251336" "251291" )
for i in "${!scenes[@]}"; do
  wget https://www.repository.cam.ac.uk/bitstream/handle/1810/${IDs[i]}/${scenes[i]}.zip -P $dataset \
  && unzip $dataset/${scenes[i]}.zip -d $dataset ; done
```
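Each Cambridge scene ships with `dataset_train.txt` / `dataset_test.txt` split files in PoseNet's text format. A loader sketch (ours, not the repository's code; the assumed layout — three header lines, then `image_path tx ty tz qw qx qy qz` per line — should be checked against the downloaded files):

```python
import tempfile

def parse_posenet_split(path):
    """Parse a Cambridge Landmarks split file (dataset_train.txt / dataset_test.txt).

    Assumed layout: three header lines, then one image per line as
    'image_path tx ty tz qw qx qy qz' (position + orientation quaternion).
    """
    poses = {}
    with open(path) as f:
        for line in list(f)[3:]:
            parts = line.split()
            if len(parts) == 8:  # skip blank or malformed lines
                poses[parts[0]] = [float(v) for v in parts[1:]]
    return poses

# Minimal demo on a synthetic file in the assumed format:
sample = (
    "Visual Landmark Dataset\n\n"
    "ImageFile, Camera Position [X Y Z W P Q R]\n"
    "seq1/frame00001.png 1.0 2.0 3.0 1.0 0.0 0.0 0.0\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write(sample)
poses = parse_posenet_split(f.name)
print(poses["seq1/frame00001.png"])  # [1.0, 2.0, 3.0, 1.0, 0.0, 0.0, 0.0]
```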
- Install Mask2Former:

```shell
cd submodules/Mask2Former
pip install -r requirements.txt
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
cd mask2former/modeling/pixel_decoder/ops
sh make.sh
cd ../../../..
# download the pretrained model
wget https://dl.fbaipublicfiles.com/maskformer/mask2former/coco/panoptic/maskformer2_swin_large_IN21k_384_bs16_100ep/model_final_f07440.pkl
cd ../..
```
- Preprocess the data:

```shell
bash scripts/dataset_preprocess.sh
```
```shell
# For 7-Scenes:
bash scripts/train_7scenes.sh
# For Cambridge Landmarks:
bash scripts/train_cambridge.sh
```
We also provide pretrained models for the 7-Scenes and Cambridge Landmarks datasets here.

```shell
gdown 1gxmmpYD-XjYT01cu0flfNHf0CuJDIXsh
gdown 1EbKx9NY2cgtIxkQQ7Spjpl90PG68GKgu
unzip map_7scenes.zip
unzip map_cambridge.zip
```
Reproduce the experimental results:

```shell
# For 7-Scenes:
bash scripts/evaluate_7scenes.sh
# For Cambridge Landmarks:
bash scripts/evaluate_cambridge.sh
```
```bibtex
@inproceedings{huang2025stdloc,
  title={From Sparse to Dense: Camera Relocalization with Scene-Specific Detector from {Feature Gaussian Splatting}},
  author={Huang, Zhiwei and Yu, Hailin and Shentu, Yichun and Yuan, Jin and Zhang, Guofeng},
  booktitle={CVPR},
  pages={},
  year={2025}
}
```
- Feature 3DGS: Our codebase is built upon Feature 3DGS.
- gsplat: We use gsplat as our rasterization backend.