Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views (ECCV 2024)

Project Page | Arxiv Paper | Video

Cascade-Zero123 progressively extracts 3D information from a single image via self-prompted nearby views. Organizing the model as a cascade allows it to generate view-consistent images.

Cascade-Zero123 can be divided into two parts. The left part is Base-0123, which takes a set of R and T values as input to generate corresponding multi-view images. These output images are concatenated with the input condition image and its corresponding camera pose, forming a self-prompted input denoted as a set of c(xc, ∆R, ∆T) for the right part Refiner-0123.
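
As a rough illustration of how such a self-prompted set could be assembled, the sketch below pairs each available view (the input image plus the Base-0123 outputs) with its pose relative to the target view. It assumes R and T are a camera rotation matrix and translation vector in the world-to-camera convention `x_cam = R @ x_world + T`; the helper names `relative_pose` and `build_self_prompt` are hypothetical, and the repository may encode poses differently (for example as spherical-coordinate offsets).

```python
from typing import Any, List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def transpose(m: Matrix) -> Matrix:
    return [list(row) for row in zip(*m)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def relative_pose(r_src: Matrix, t_src: Vector,
                  r_tgt: Matrix, t_tgt: Vector) -> Tuple[Matrix, Vector]:
    """Return (dR, dT) such that x_tgt = dR @ x_src + dT.

    Assumption: both poses are world-to-camera, x_cam = R @ x_world + T.
    """
    d_r = matmul(r_tgt, transpose(r_src))
    d_t = [t - s for t, s in zip(t_tgt, matvec(d_r, t_src))]
    return d_r, d_t


def build_self_prompt(views: List[Tuple[Any, Matrix, Vector]],
                      target_pose: Tuple[Matrix, Vector]) -> List[Tuple[Any, Matrix, Vector]]:
    """Build a list of (image, dR, dT) conditions, one per available view.

    `views` holds (image, R, T) for the input image and each nearby view.
    """
    r_tgt, t_tgt = target_pose
    prompt = []
    for image, r, t in views:
        d_r, d_t = relative_pose(r, t, r_tgt, t_tgt)
        prompt.append((image, d_r, d_t))
    return prompt
```

Each tuple in the returned list plays the role of one c(xc, ∆R, ∆T) element; how the actual pipeline batches or concatenates these conditions is not shown here.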

## 🦾 Updates

7/2/2024: Accepted by ECCV 2024.
10/16/2023: A preliminary version of the code has been released. It may still contain bugs, so please feel free to open an issue.

## Requirements

PyTorch 2.0 is recommended for faster training and inference.

conda env create -f environment.yml

or

conda create -n cascade-zero123 python=3.9
conda activate cascade-zero123
pip install -r requirements.txt

Install xformers to enable memory-efficient attention.

conda install xformers -c xformers
# or build from source
pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers


## Data Preparation
Download Zero123's Objaverse Renderings data:
```commandline
wget https://tri-ml-public.s3.amazonaws.com/datasets/views_release.tar.gz
```
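
The downloaded archive then needs to be unpacked so that the training script can read it. A minimal sketch using Python's standard `tarfile` module follows; the helper name `extract_renderings` is hypothetical, and it assumes the archive contains a top-level `views_release/` folder, matching the `--train_data_dir /data/zero123/views_release` path used in the training command.

```python
import tarfile
from pathlib import Path


def extract_renderings(archive: str, dest_root: str) -> Path:
    """Unpack the gzip-compressed archive under dest_root and return the data directory."""
    dest = Path(dest_root)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    # Assumption: the archive unpacks into a top-level `views_release/` folder.
    return dest / "views_release"


# Example call (paths are illustrative):
# data_dir = extract_renderings("views_release.tar.gz", "/data/zero123")
```

Running `tar -xzf views_release.tar.gz -C /data/zero123` from the shell is an equivalent alternative.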

Configure Accelerate by running:

accelerate config

## Training

Launch training with the command below. Following the original Zero123, it uses fp32 precision with gradient checkpointing and EMA enabled.

accelerate launch train_cascade0123.py \
--train_data_dir /data/zero123/views_release \
--pretrained_model_name_or_path lambdalabs/sd-image-variations-diffusers \
--train_batch_size 192 \
--dataloader_num_workers 16 \
--output_dir logs \
--use_ema \
--gradient_checkpointing \
--mixed_precision no
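
The `--use_ema` flag keeps an exponential moving average (EMA) of the model weights alongside the trained weights. The sketch below only illustrates the update rule on scalar "weights"; the `ema_update` helper and the decay value are illustrative assumptions and are not taken from `train_cascade0123.py`.

```python
from typing import Dict


def ema_update(ema: Dict[str, float], params: Dict[str, float],
               decay: float = 0.999) -> Dict[str, float]:
    """Blend the current parameters into the running average.

    Update rule: ema = decay * ema + (1 - decay) * param
    """
    return {name: decay * ema[name] + (1.0 - decay) * params[name] for name in ema}


# Toy usage: the EMA copy drifts toward the current weights over several steps.
ema = {"w": 0.0}
for _ in range(3):
    params = {"w": 1.0}  # stand-in for the weights after an optimizer step
    ema = ema_update(ema, params, decay=0.5)
```

A decay close to 1 makes the averaged weights change slowly, which tends to smooth out step-to-step noise in the trained weights.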

bf16 or fp16 mixed precision is also supported by changing the `--mixed_precision` flag, for example:

accelerate launch train_cascade0123.py \
--train_data_dir /data/zero123/views_release \
--pretrained_model_name_or_path lambdalabs/sd-image-variations-diffusers \
--train_batch_size 192 \
--dataloader_num_workers 16 \
--output_dir logs \
--use_ema \
--gradient_checkpointing \
--mixed_precision bf16

To monitor training progress, we recommend wandb for its simplicity and powerful features.

wandb login

## Acknowledgement

This repository is based on the original Zero-1-to-3 and its diffusers implementation, zero123-hf. Thanks to the authors for their excellent work.

## Citation

If you find this repository or work helpful in your research, please consider citing the paper and giving the repository a ⭐:

@article{Cascadezero123,
  author = {Yabo Chen and Jiemin Fang and Yuyang Huang and Taoran Yi and Xiaopeng Zhang and Lingxi Xie and Xinggang Wang and Wenrui Dai and Hongkai Xiong and Qi Tian},
  title = {Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views},
  year = {2023},
  journal = {arXiv preprint arXiv:2312.04424}
}

## Upcoming

- Scripts to convert diffusers checkpoints back to the Zero123 format
- Release of the checkpoints
- Novel view synthesis testing code
- Single-image-to-3D testing code

