This repository contains the official implementation of the ECCV 2024 paper:
SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields
Yu Liu*, Baoxiong Jia*, Yixin Chen, Siyuan Huang
We provide all environment configurations in requirements.txt. To install all packages, create a conda environment and install them as follows:
conda create -n slotlifter python=3.8
conda activate slotlifter
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r requirements.txt
In our experiments, we used NVIDIA CUDA 12.1 on Ubuntu 22.04. Similar CUDA versions should also work, provided you install matching versions of torch and torchvision.
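After installation, it can help to confirm that the installed packages match the versions pinned in the conda command above. The following is a minimal sketch, not part of the repository: the helper name `check_versions` and the `EXPECTED` table are hypothetical, and the comparison assumes that build suffixes such as `+cu121` can be ignored.

```python
from importlib import metadata

# Versions taken from the conda install command above.
# Adjust this table if you install a different CUDA/PyTorch combination.
EXPECTED = {"torch": "2.1.0", "torchvision": "0.16.0", "torchaudio": "2.1.0"}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, expected_version, matches)}."""
    report = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None  # package is not installed in this environment
        # Local builds may carry a suffix such as "+cu121"; compare the public part only.
        ok = have is not None and have.split("+")[0] == want
        report[name] = (have, want, ok)
    return report


if __name__ == "__main__":
    for name, (have, want, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: installed={have} expected={want} {status}")
```

Run it inside the activated `slotlifter` environment. A mismatch is not necessarily an error if you deliberately chose a different CUDA build.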
CLEVR567, Room-Chair, and Room-Diverse datasets are provided by uORF.
Room-Texture, Kitchen-Matte, and Kitchen-Shiny datasets are provided by uOCF.
Download the ScanNet dataset here and process it with the official code to obtain images, poses, and intrinsics. We use 100 scenes for training (scene0001_00 to scene0101_00, excluding scene0079_00, which is in the test split). We sample about 400 views and resize each image to a resolution of 640 × 480. We organize the dataset as follows:
├──scannet/
│   ├──scene0001_00/
│   │   ├──color_480640/
│   │   │   ├──0.jpg
│   │   ├──pose/
│   │   │   ├──0.txt
│   │   ├──intrinsic/
│   │   │   ├──intrinsic_color.txt
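Before training, you may want to verify that your processed ScanNet folder follows the layout above. The sketch below is not part of the repository; `check_scene` and `check_dataset` are hypothetical helpers. It assumes that every image `color_480640/<i>.jpg` should have a matching `pose/<i>.txt`, which is inferred from the `0.jpg` / `0.txt` pair in the tree rather than stated by the authors.

```python
from pathlib import Path


def check_scene(scene_dir):
    """Return a list of layout problems for one scene directory (empty if none)."""
    scene = Path(scene_dir)
    problems = []

    # Per-scene color intrinsics file shown in the layout above.
    if not (scene / "intrinsic" / "intrinsic_color.txt").is_file():
        problems.append("missing intrinsic/intrinsic_color.txt")

    color_dir = scene / "color_480640"
    pose_dir = scene / "pose"
    frames = {p.stem for p in color_dir.glob("*.jpg")} if color_dir.is_dir() else set()
    poses = {p.stem for p in pose_dir.glob("*.txt")} if pose_dir.is_dir() else set()

    if not frames:
        problems.append("no images in color_480640/")
    # Assumption: image <i>.jpg pairs with pose <i>.txt.
    for stem in sorted(frames - poses):
        problems.append(f"image {stem}.jpg has no pose/{stem}.txt")
    return problems


def check_dataset(root):
    """Check every scene* directory under root; return {scene_name: problems}."""
    root = Path(root)
    return {s.name: check_scene(s) for s in sorted(root.glob("scene*")) if s.is_dir()}


if __name__ == "__main__":
    # "scannet" is a placeholder path; point it at your own dataset root.
    for name, problems in check_dataset("scannet").items():
        print(name, "OK" if not problems else problems)
```

The check only looks at file names and does not open images or parse pose matrices, so it is quick to run across all scenes.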
Following previous works, we use the test data provided by NerfingMVS.
Download the DTU dataset provided by PixelNeRF here.
We provide training and testing scripts under scripts/ for all datasets.
- train_uorf_data.sh and eval_uorf_data.sh: CLEVR567, Room-Chair, Room-Diverse, Room-Texture, Kitchen-Matte, and Kitchen-Shiny datasets
- train_scannet.sh and eval_scannet.sh: ScanNet dataset
- train_dtu.sh and eval_dtu.sh: DTU dataset
Download pre-trained checkpoints here.
If you find our paper and/or code helpful, please consider citing:
@inproceedings{Liu2024slotlifter,
title={SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields},
author={Liu, Yu and Jia, Baoxiong and Chen, Yixin and Huang, Siyuan},
booktitle={European Conference on Computer Vision (ECCV)},
year={2024}
}
This code makes heavy use of resources from PanopticLifting, BO-QSA, SLATE, OSRT, IBRNet, and uORF. We thank the authors for open-sourcing their awesome projects.