Controlling Vision-Language Models for Universal Image Restoration
Deploy and run DA-CLIP on Windows.
[2024.05.09] update the daclip-IRS code and instructions.
This project does not change the restoration model architecture of the original project, and it does not select new datasets for training or testing. Instead, it modifies and refines the interface and its features to meet the requirements of an undergraduate graduation project. The following interface scripts are provided in `config/daclip-sde/` to help you understand and extend the project:
- `detect_clip.py` and `detect_daclip.py` are interfaces for degradation type detection using CLIP and DA-CLIP, respectively. Related articles can be found at http://t.csdnimg.cn/U0zLM and http://t.csdnimg.cn/zl2Ei.
- `interface_v1.py` and `interface_v2.py` implement the restoration features required by the undergraduate thesis. They show the automatic detection results and allow manual selection of the desired degradation type. With manual selection, the degradation embedding from the image controller is replaced by the degradation embedding from the text encoder. The difference in v2 is that when the manually selected type matches the top-scoring automatic detection result, the degradation embedding from the image controller is used, to achieve the best restoration quality.
- `testsingle.py` computes PSNR, SSIM, and LPIPS for a single image, following `test.py`.
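
The embedding selection rule described for `interface_v1.py` and `interface_v2.py` can be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's code: the function names, the short label list, and the use of plain lists in place of tensors are all assumptions made for the example.

```python
import math

# Hypothetical label set for illustration; the project's own list may differ.
DEGRADATIONS = ["motion-blurry", "hazy", "rainy", "noisy"]


def softmax(scores):
    """Turn raw similarity scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def detect(similarities):
    """Return (best_label, probabilities) for image-text similarity scores."""
    probs = softmax(similarities)
    best = max(range(len(probs)), key=probs.__getitem__)
    return DEGRADATIONS[best], probs


def choose_embedding(manual_label, auto_label, image_embedding,
                     text_embeddings, version=2):
    """Pick the degradation embedding handed to the restoration model.

    manual_label=None means the user kept the automatic detection result.
    Returns a (source, embedding) tuple.
    """
    if manual_label is None:
        return "image-controller", image_embedding
    if version == 2 and manual_label == auto_label:
        # v2 only: the manual choice agrees with the top automatic result,
        # so the image-controller embedding is kept.
        return "image-controller", image_embedding
    # Otherwise the text-encoder embedding of the chosen label is used.
    return "text-encoder", text_embeddings[manual_label]


if __name__ == "__main__":
    auto_label, _ = detect([0.2, 2.5, 0.1, 0.4])  # "hazy" scores highest
    image_emb = [0.1, 0.9]
    text_embs = {name: [float(i), 1.0] for i, name in enumerate(DEGRADATIONS)}
    print(choose_embedding("hazy", auto_label, image_emb, text_embs, version=1)[0])
    print(choose_embedding("hazy", auto_label, image_emb, text_embs, version=2)[0])
```

With the same manual choice, the v1 rule falls back to the text-encoder embedding while the v2 rule keeps the image-controller embedding, which is the only behavioral difference the description above mentions.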
- OS: Windows 11
- NVIDIA:
  - CUDA: 12.1
- Python: 3.9
The requirements specify Python 3.9 because of how they were generated. When the dependencies were exported with pipreqs, the tool failed to correctly identify some packages that were present in the local environment, along with packages that were imported but not actually used; these were looked up on PyPI and set to the latest version. The resulting requirements have been tested and confirmed to work on the local machine. They cover only what is needed for running and testing, not training.
Python 3.8 is also acceptable. In that case you can use the requirements from the original project, but you do not need to install all of the cudnn11-related dependencies listed there. Install the versions of cuDNN, the CUDA toolkit, and GPU-enabled PyTorch that match your own local environment.
This project has only 16 dependencies, compared with 60 in the original project.
Gradio is used in the version bundled with the project, with a few Chinese-language and CSS changes. It still needs its own dependencies, so install Gradio in the environment as well.
If you use Anaconda, first create a virtual environment:

    conda create --name daclip-ISR python=3.9
    conda activate daclip-ISR
    cd yourprojdir
    pip install -r requirements.txt

Install the PyTorch build that matches your GPU. For example, on my machine:

    pip install torch==2.2.1+cu121 -f https://download.pytorch.org/whl/torch_stable.html
    pip install torchvision==0.17.1+cu121 -f https://download.pytorch.org/whl/torch_stable.html

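
After installing, it can help to confirm which builds actually ended up in the environment. The following is a small sketch using only the standard library; the helper names `split_build`, `installed_version`, and `report` are made up for this example and are not part of the project.

```python
import importlib.metadata
import sys


def split_build(version):
    """Split a wheel version such as '2.2.1+cu121' into (release, build tag)."""
    release, _, tag = version.partition("+")
    return release, tag or None


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return None


def report(packages=("torch", "torchvision")):
    """Print the interpreter version and the build tag of each package."""
    print("python", ".".join(map(str, sys.version_info[:3])))
    for name in packages:
        version = installed_version(name)
        if version is None:
            print(name, "not installed")
        else:
            release, tag = split_build(version)
            print(name, release, tag or "no build tag")


if __name__ == "__main__":
    report()
```

If both packages show the expected CUDA tag (for example `cu121`), calling `torch.cuda.is_available()` in a Python session is a quick way to check that the GPU is visible.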
Download the DA-CLIP and Universal-IR weights as a matching pair; mixing weights from different pairs does not work as well.
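Model Name | Description | GoogleDrive | HuggingFace |
---|---|---|---|
DA-CLIP | Degradation-aware CLIP model | download | download |
Universal-IR | DA-CLIP-based universal image restoration model | download | download |
DA-CLIP-mix | Degradation-aware CLIP model (adds Gaussian blur + face restoration and Gaussian blur + image deraining) | download | download |
Universal-IR-mix | DA-CLIP-based universal image restoration model (adds robust training and mixed degradations) | download | download |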
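
As a rough illustration of the pairing rule, a script could refuse to start when the two selected checkpoints do not belong together. The pairing below is an assumption read off the table (plain with plain, mix with mix), and `is_matching_pair` is a hypothetical helper, not something the project provides.

```python
# Assumed pairing, read off the table above: each DA-CLIP checkpoint goes
# with the Universal-IR checkpoint of the same variant.
PAIRS = {
    "DA-CLIP": "Universal-IR",
    "DA-CLIP-mix": "Universal-IR-mix",
}


def is_matching_pair(clip_name, ir_name):
    """Return True when the two checkpoint names belong together."""
    return clip_name in PAIRS and PAIRS[clip_name] == ir_name


if __name__ == "__main__":
    print(is_matching_pair("DA-CLIP", "Universal-IR"))
    print(is_matching_pair("DA-CLIP", "Universal-IR-mix"))
```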
To evaluate our method on image restoration, please modify the benchmark path and model path.
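
For readers unfamiliar with the metrics, PSNR (one of the values `testsingle.py` reports) follows the standard definition 10 · log10(MAX² / MSE). The sketch below applies that definition to flat sequences of pixel values; it is a generic illustration, and the project's `test.py` may differ in details such as color space, cropping, or value range.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Every pixel is off by one gray level, so MSE is 1.
    print(round(psnr([0, 0, 0, 0], [1, 1, 1, 1]), 2))  # 48.13
```

Higher is better, and identical images give an infinite PSNR, which is why the zero-error case is handled separately.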
Here we provide an `app.py` file for testing your own images. Before that, you need to download the pretrained weights (DA-CLIP and UIR) and modify the model path in `options/test.yml`. Then, by simply running `python app.py`, you can open `http://localhost:7860` to test the model. (We also provide several images with different degradations in the `images` dir.) We also provide more examples from our test dataset in the google drive.
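
If the page does not load, a quick check is whether anything is listening on the port at all. The helper below is a hypothetical convenience, not part of the project; it only tests that a TCP connection to `localhost:7860` succeeds and says nothing about whether the model loaded correctly.

```python
import socket


def port_is_open(host="localhost", port=7860, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


if __name__ == "__main__":
    print("app reachable:", port_is_open())
```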
If you have any inquiries, please feel free to reach out to the author of the daclip project at [email protected], or alternatively, you may contact the author of this project at [email protected].
If this code helps your research or work, please consider citing the paper. The BibTeX reference is:

    @article{luo2023controlling,
      title={Controlling Vision-Language Models for Universal Image Restoration},
      author={Luo, Ziwei and Gustafsson, Fredrik K and Zhao, Zheng and Sj{\"o}lund, Jens and Sch{\"o}n, Thomas B},
      journal={arXiv preprint arXiv:2310.01018},
      year={2023}
    }