Adaptive Tensor Parallelism for Large Model Training and Inference
ATP provides a high-performance implementation of Topology-aware Tensor Parallelism with the following features:
- Two-Level Search Space for Tensor Parallelism.
- Adaptive Tensor Parallelism with Hierarchical Communication Matrix.
- Chunk-based Communication-Computation Overlapping.
- An estimator that helps study the performance of ATP on networks with different topologies.
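The idea behind chunk-based communication-computation overlapping can be sketched in plain Python. This is only an illustration and not ATP's implementation: it assumes a row-parallel linear layer whose per-device partial outputs are summed by an all-reduce, simulates the devices inside one process, and uses a single worker thread as a stand-in for a communication stream. The names `matmul`, `all_reduce_sum`, and `overlapped_row_parallel_linear` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul(x, w):
    """Plain-Python matrix product of x (rows x k) and w (k x cols)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def all_reduce_sum(partials):
    """Stand-in for an all-reduce: element-wise sum of same-shaped matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*partials)]


def overlapped_row_parallel_linear(x_shards, w_shards, chunk_rows):
    """Compute sum_d(x_shards[d] @ w_shards[d]) one row chunk at a time.

    The reduction of each chunk is submitted to a worker thread, so the
    main thread can start computing the next chunk before the previous
    reduction has finished.
    """
    n_rows = len(x_shards[0])
    pending = []
    with ThreadPoolExecutor(max_workers=1) as comm:
        for start in range(0, n_rows, chunk_rows):
            stop = start + chunk_rows
            # "Computation": every simulated device multiplies its shard.
            partials = [matmul(x[start:stop], w) for x, w in zip(x_shards, w_shards)]
            # "Communication": reduce this chunk in the background.
            pending.append(comm.submit(all_reduce_sum, partials))
        output = []
        for future in pending:
            output.extend(future.result())
    return output


if __name__ == "__main__":
    x = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
    w = [[1, 0], [0, 1], [1, 1], [2, -1]]
    # Two simulated devices: split x by columns and w by rows.
    x_shards = [[row[:2] for row in x], [row[2:] for row in x]]
    w_shards = [w[:2], w[2:]]
    result = overlapped_row_parallel_linear(x_shards, w_shards, chunk_rows=2)
    print(result == matmul(x, w))
```

Because of Python's global interpreter lock, this sketch shows only the scheduling pattern and gives no real speedup. On GPUs the same pattern would typically rely on asynchronous collectives and separate streams.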
To install ATP, you will need:
- Python 3.8 or 3.9.
- PyTorch 1.13.
- SPMD from pytorch/tau.
Install PyTorch and SPMD with:
conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
pip install git+https://github.com/pytorch/tau.git@89700fd
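After installation, a small script can confirm that the environment matches the versions listed above. This is a minimal sketch and not part of ATP: the helper names `parse_version` and `check_requirements` are hypothetical, and only the Python and PyTorch versions are checked.

```python
import re
import sys


def parse_version(text):
    """Return the leading (major, minor) pair of a version string such as '1.13.1+cu116'."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def check_requirements(python_version, torch_version):
    """Return a list of problems; an empty list means both versions match."""
    problems = []
    if tuple(python_version[:2]) not in {(3, 8), (3, 9)}:
        problems.append("Python 3.8 or 3.9 is required")
    if parse_version(torch_version) != (1, 13):
        problems.append("PyTorch 1.13 is required")
    return problems


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        issues = check_requirements(sys.version_info, torch.__version__)
        print("\n".join(issues) if issues else "Environment looks compatible")
```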