Official PyTorch implementation of the ECCV'24 paper LPViT: Low-Power Semi-structured Pruning for Vision Transformers (Kaixin Xu*, Zhe Wang*, Chunyun Chen, Xue Geng, Jie Lin, Xulei Yang, Min Wu, Xiaoli Li, Weisi Lin).
- *Equal contribution
- Clone this repository
git clone https://github.com/Akimoto-Cris/LPViT.git
cd LPViT
- Run the NVIDIA PyTorch container with GPU support:
docker run -it --name lpvit --gpus all --shm-size=64g -v /path/to/LPViT/:/LPViT -v /path/to/imagenet/:/imagenet -w /LPViT nvcr.io/nvidia/pytorch:23.10-py3
- Inside the container, install the required timm version:
pip install timm==0.6.13
- Install NVIDIA DALI, e.g. with the command shown below.
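A typical way to install DALI from NVIDIA's package index is shown below; the CUDA-specific package name (nvidia-dali-cuda120) is an assumption matching the CUDA 12.x toolchain in the 23.10 NGC image, so adjust it to your container if needed.
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist --upgrade nvidia-dali-cuda120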
LPViT consists of two stages:
- Pruning: perform empirical output-distortion sampling on a calibration set, solve the layer-wise sparsity allocation under the given FLOPs budget, and mask the layer weights accordingly (a simplified sketch of block-wise masking is given after this list).
- Finetuning: finetune the pruned model on the original dataset (ImageNet).
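For intuition only, here is a minimal, self-contained sketch of block-wise (semi-structured) weight masking. The block scoring below is plain L2 magnitude, an illustrative stand-in for the paper's second-order output-distortion criterion, and the function and tensor names are hypothetical rather than taken from this repo.

```python
import torch

def block_mask(weight: torch.Tensor, block=(32, 32), sparsity=0.5) -> torch.Tensor:
    """Return a 0/1 mask that zeroes the lowest-scoring (bh x bw) blocks of `weight`."""
    rows, cols = weight.shape
    bh, bw = block
    assert rows % bh == 0 and cols % bw == 0, "choose a block size that divides the weight"
    # Score each block; L2 magnitude is a stand-in for the paper's distortion-based score.
    blocks = weight.reshape(rows // bh, bh, cols // bw, bw)
    scores = blocks.pow(2).sum(dim=(1, 3))               # shape: (rows//bh, cols//bw)
    k = int(scores.numel() * sparsity)                   # number of blocks to prune
    if k == 0:
        return torch.ones_like(weight)
    threshold = scores.flatten().kthvalue(k).values
    keep = (scores > threshold).float()                  # 1 = keep block, 0 = prune block
    # Expand per-block decisions back to element granularity.
    return keep.repeat_interleave(bh, dim=0).repeat_interleave(bw, dim=1)

# Example: mask a DeiT-B-sized MLP weight (3072 x 768) at 50% block sparsity.
w = torch.randn(3072, 768)
m = block_mask(w, block=(32, 32), sparsity=0.5)
print(f"element sparsity: {((w * m) == 0).float().mean():.2f}")   # ~0.50
```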
Step 1: Pruning
- DeiT-B at 50% FLOPs with block size 32x32:
python prune_vit.py --model deit_base_distilled_patch16_224 --blocksize 32 32 --data_path /imagenet --amount 0.5 --lambda_power 1 --flop_budget --second_order --smooth_curve
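Optionally, you can sanity-check the saved pruning artifact before finetuning. This is our suggestion rather than part of the repo's workflow, and the snippet assumes the .pt file is a dict of tensors (its actual key layout may differ).

```python
import torch

path = ("rd_curves/0/sp0.50_deit_base_distilled_patch16_224_ndz_0100_rdcurves_"
        "block32x32_ranking_taylor_secondorderapprox_opt_dist_mask_.pt")
ckpt = torch.load(path, map_location="cpu")

# Report the overall element sparsity across every tensor found in the file.
tensors = ckpt.values() if isinstance(ckpt, dict) else [ckpt]
zeros = total = 0
for t in tensors:
    if torch.is_tensor(t):
        zeros += (t == 0).sum().item()
        total += t.numel()
print(f"overall sparsity: {zeros / max(total, 1):.3f}")
```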
Step 2: Finetuning
Run bash scripts/taylor_flops5071_base_block32x32_secondorder_power.sh, which runs the following command:
python -m torch.distributed.launch \
--nproc_per_node=4 \
--use_env finetune_vit.py \
--model deit_base_distilled_patch16_224 \
--batch-size 128 \
--data-path /imagenet \
--init_mask rd_curves/0/sp0.50_deit_base_distilled_patch16_224_ndz_0100_rdcurves_block32x32_ranking_taylor_secondorderapprox_opt_dist_mask_.pt \
--init_weight rd_curves/0/sp0.50_deit_base_distilled_patch16_224_ndz_0100_rdcurves_block32x32_ranking_taylor_secondorderapprox_opt_dist_mask_.pt \
--output_dir experiment/taylor1score_4gpus_deit_base_flop5071_block32x32_approx_taylorrank_derivative_secondorder_smooth0.9h4_min_sp0.05_power_distilled \
--dist_url 'tcp://127.0.0.1:33251' --distillation-type soft
More pruning and finetuning scripts are provided in the scripts folder for reference.
Hardware benchmarking is not available yet; we plan to support it in the near future.
LPViT achieves state-of-the-art performance in ViT pruning.
We adopt part of the finetuning code from UVC. We appreciate their great work!
If you find this implementation useful for your work, please consider citing the following paper:
@article{xu2024lpvit,
title={LPViT: Low-Power Semi-structured Pruning for Vision Transformers},
author={Xu, Kaixin and Wang, Zhe and Chen, Chunyun and Geng, Xue and Lin, Jie and Yang, Xulei and Wu, Min and Li, Xiaoli and Lin, Weisi},
journal={arXiv preprint arXiv:2407.02068},
year={2024}
}