TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models

Project Page | Paper | Model | Online Demo

By Tripo

TripoSG is an image-to-3D foundation model built for high-fidelity shape generation and strong generalization across input styles. It combines large-scale rectified flow transformers, hybrid supervised training, and a high-quality dataset to achieve state-of-the-art performance in 3D shape generation.

✨ Key Features

  • High-Fidelity Generation: Produces meshes with sharp geometric features, fine surface details, and complex structures
  • Semantic Consistency: Generated shapes accurately reflect input image semantics and appearance
  • Strong Generalization: Handles diverse input styles including photorealistic images, cartoons, and sketches
  • Robust Performance: Creates coherent shapes even for challenging inputs with complex topology

🔬 Technical Highlights

  • Large-Scale Rectified Flow Transformer: Combines RF's linear trajectory modeling with transformer architecture for stable, efficient training
  • Advanced VAE Architecture: Uses Signed Distance Functions (SDFs) with hybrid supervision combining SDF loss, surface normal guidance, and eikonal loss
  • High-Quality Dataset: Trained on 2 million meticulously curated Image-SDF pairs, ensuring superior output quality
  • Efficient Scaling: Implements architecture optimizations for high performance even at smaller model scales
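
To make the rectified-flow bullet above more concrete, here is a minimal pure-Python sketch of the straight-line path between a noise sample and a data sample, the constant velocity a model would be trained to regress along that path, and fixed-step Euler sampling. It is an illustration only, not TripoSG's training or inference code: the function names, the toy 3-vector standing in for a latent, and the convention that t=0 is noise and t=1 is data are all assumptions.

```python
import random


def interpolate(noise, data, t):
    """Point at time t on the straight line from noise (t=0) to data (t=1)."""
    return [(1.0 - t) * n + t * d for n, d in zip(noise, data)]


def target_velocity(noise, data):
    """Velocity along that line; it is constant in t and equals data - noise."""
    return [d - n for n, d in zip(noise, data)]


def euler_sample(velocity_fn, start, num_steps=8):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed steps."""
    x = list(start)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


random.seed(0)
data = [0.5, -1.0, 2.0]                         # toy stand-in for a latent (assumption)
noise = [random.gauss(0.0, 1.0) for _ in data]  # Gaussian starting point

# An "oracle" velocity field that already knows the true direction.
# A trained network would approximate this from (x_t, t, image condition).
def oracle(x, t):
    return target_velocity(noise, data)

sample = euler_sample(oracle, noise)
error = max(abs(s - d) for s, d in zip(sample, data))
print(f"max abs error after Euler sampling: {error:.2e}")
```

With the oracle field the path is exactly straight, so Euler integration lands on the data point up to floating-point error. A learned field is only approximately straight, which is why practical samplers usually take more steps.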

🔥 Updates

  • [2025-03] Release of TripoSG 1.5B parameter rectified flow model and VAE trained on 2048 latent tokens, along with inference code and interactive demo

🔨 Installation

Clone the repo:

git clone https://github.com/VAST-AI-Research/TripoSG.git
cd TripoSG

Create a conda environment (optional):

conda create -n tripoSG python=3.10
conda activate tripoSG

Install dependencies:

# pytorch (select correct CUDA version)
pip install torch torchvision --index-url https://download.pytorch.org/whl/{your-cuda-version}

# other dependencies
pip install -r requirements.txt
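
After installing, it can help to confirm that the PyTorch wheel you picked matches your machine. The sketch below is not part of the repository. The helper name `describe_torch_install` is hypothetical, and it only uses standard PyTorch attributes (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`).

```python
def describe_torch_install():
    """Return a small report about the installed PyTorch build, if any."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in the active environment.
        return {"installed": False}
    return {
        "installed": True,
        "torch_version": torch.__version__,
        # CUDA version the wheel was built against; None for CPU-only wheels.
        "built_for_cuda": torch.version.cuda,
        # True only if a usable GPU and driver are actually present.
        "cuda_available": torch.cuda.is_available(),
    }


report = describe_torch_install()
for key, value in report.items():
    print(f"{key}: {value}")
```

If `built_for_cuda` is `None` or `cuda_available` is `False`, the wheel index chosen in the `pip install torch` step likely does not match your driver.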

💡 Quick Start

Generate a 3D mesh from an image:

python -m scripts.inference_triposg --image-input assets/example_data/hjswed.png
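
To process several images, one option is to wrap the same command in a small driver script. This is a sketch rather than a tool shipped with the repository: `build_command` and `run_folder` are hypothetical helpers, it passes only the `--image-input` flag shown above, and it assumes it is run from the repository root so that `scripts.inference_triposg` can be resolved.

```python
import subprocess
import sys
from pathlib import Path


def build_command(image_path):
    """Build the documented inference command for one image."""
    return [
        sys.executable, "-m", "scripts.inference_triposg",
        "--image-input", str(image_path),
    ]


def run_folder(folder, pattern="*.png", dry_run=True):
    """Run (or, with dry_run=True, just print) one command per matching image."""
    folder = Path(folder)
    images = sorted(folder.glob(pattern)) if folder.is_dir() else []
    commands = [build_command(path) for path in images]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the batch at the first failing image.
            subprocess.run(cmd, check=True)
    return commands


# Dry run by default: prints the commands without launching inference.
commands = run_folder("assets/example_data")
```

Pass `dry_run=False` once the printed commands look right.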

The required model weights are downloaded automatically.

💻 System Requirements

  • CUDA-enabled GPU with at least 8GB VRAM
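
A quick way to compare your GPU against this requirement is to query the total device memory through PyTorch. The sketch below is an assumption-laden convenience check, not part of the repository: the helper names are hypothetical, and "8GB" is interpreted as 8 × 10^9 bytes because cards marketed as 8 GB usually report slightly less than 8 GiB.

```python
MIN_VRAM_BYTES = 8 * 10**9  # assumed reading of "8GB" (decimal gigabytes)


def has_enough_vram(total_bytes, minimum=MIN_VRAM_BYTES):
    """True if the reported total device memory meets the threshold."""
    return total_bytes >= minimum


def gpu_report():
    """List (name, total_bytes, ok) per visible CUDA device; empty if none."""
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    report = []
    for index in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(index)
        report.append(
            (props.name, props.total_memory, has_enough_vram(props.total_memory))
        )
    return report


for name, total, ok in gpu_report():
    status = "ok" if ok else "below 8 GB"
    print(f"{name}: {total / 10**9:.1f} GB total ({status})")
```

This reports total memory, not free memory, so other processes using the GPU can still cause out-of-memory errors.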

📝 Tips

  • To use the full VAE module (including the encoder), uncomment line 15 in triposg/models/autoencoders/autoencoder_kl_triposg.py, install torch-cluster, and then run:
python -m scripts.inference_vae --surface-input assets/example_data_point/surface_point_demo.npy
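
Before running the VAE script, you may want to confirm that torch-cluster is importable in the active environment. The check below is a minimal sketch using only the standard library. `module_available` is a hypothetical helper, and `torch_cluster` is the import name of the torch-cluster package. It does not verify that the line in autoencoder_kl_triposg.py has been uncommented.

```python
import importlib.util


def module_available(name):
    """True if a top-level module with this name can be found for import."""
    return importlib.util.find_spec(name) is not None


if module_available("torch_cluster"):
    print("torch_cluster found: the full VAE (with encoder) can be used.")
else:
    print("torch_cluster not found: install torch-cluster before running inference_vae.")
```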

🤝 Community & Support

  • Interactive Demo: Try TripoSG on Hugging Face Spaces
  • Issues & Discussions: Use GitHub Issues for bug reports and feature requests
  • Contributing: We welcome contributions!

📚 Citation

@article{li2025triposg,
  title={TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models},
  author={Li, Yangguang and Zou, Zi-Xin and Liu, Zexiang and Wang, Dehu and Liang, Yuan and Yu, Zhipeng and Liu, Xingchao and Guo, Yuan-Chen and Liang, Ding and Ouyang, Wanli and others},
  journal={arXiv preprint arXiv:2502.06608},
  year={2025}
}

⭐ Acknowledgements

We would like to thank the open-source projects and research works that made TripoSG possible.

We are grateful to the broader research community for their open exploration and contributions to the field of 3D generation.
