
Commit 5ce5be0

Merge pull request #1 from saxenabhishek/dev
merge dev to main
2 parents d67e60f + 9f2553f · commit 5ce5be0


66 files changed: +3,234 additions, −2,910 deletions

.gitignore

Lines changed: 4 additions & 1 deletion
@@ -1,5 +1,5 @@
 # result files
-*.png
+results/
 *.txt
 
 # ipython notebook files should be manually added after clearning output and kernel
@@ -137,3 +137,6 @@ dmypy.json
 
 # Pyre type checker
 .pyre/
+
+testphotos
+checkpoints
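
The new ignore rules line up with the directory layout the updated README assumes: generated outputs under `results/`, sample images under `testphotos`, and downloaded models under `checkpoints`. A quick way to confirm the rules match as intended — the paths below are hypothetical examples, not files from this commit:

```shell
# -v prints which .gitignore rule matched each path
$ git check-ignore -v results/swap.png testphotos/lake.jpg checkpoints/mountain_pretrained
```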

.vscode/settings.json

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+{
+"python.formatting.provider": "black"
+}
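
This workspace setting pins VS Code's Python formatter to Black, matching the code-style badge the README diff below adds. The same formatting can be applied headlessly — a minimal sketch, assuming `black` is available in the project's environment (this diff does not show it being added as a dependency):

```shell
# format the whole repository in place with Black's defaults
$ poetry run black .
```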

LICENSE.md

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2021 Abhishek Saxena
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

README.md

Lines changed: 54 additions & 134 deletions
@@ -1,167 +1,87 @@
-# Swapping Autoencoder for Deep Image Manipulation
+# Swapping Autoencoder Interface
 
-[Taesung Park](http://taesung.me/), [Jun-Yan Zhu](https://www.cs.cmu.edu/~junyanz/), [Oliver Wang](http://www.oliverwang.info/), [Jingwan Lu](https://research.adobe.com/person/jingwan-lu/), [Eli Shechtman](https://research.adobe.com/person/eli-shechtman/), [Alexei A. Efros](http://www.eecs.berkeley.edu/~efros/), [Richard Zhang](https://richzhang.github.io/)
-
-UC Berkeley and Adobe Research
-
-![teaser](https://taesung.me/SwappingAutoencoder/index_files/teaser_v3.jpg)
-<p float="left">
-<img src="https://taesung.me/SwappingAutoencoder/index_files/church_style_swaps.gif" height="190" />
-<img src="https://taesung.me/SwappingAutoencoder/index_files/tree_smaller.gif" height="190" />
-<img src="https://taesung.me/SwappingAutoencoder/index_files/horseshoe_bend_evensmaller.gif" height="190" />
+<p align="center">
+<b> An interactive interface for swapping autoencoder</b>
 </p>
 
-### [Project page](https://taesung.me/SwappingAutoencoder/) | [Paper](https://arxiv.org/abs/2007.00653) | [3 Min Video](https://youtu.be/0elW11wRNpg)
-
-
-## Overview
-<img src='imgs/overview.jpg' width="1000px"/>
-
-**Swapping Autoencoder** consists of autoencoding (top) and swapping (bottom) operation.
-**Top**: An encoder E embeds an input (Notre-Dame) into two codes. The structure code is a tensor with spatial dimensions; the texture code is a 2048-dimensional vector. Decoding with generator G should produce a realistic image (enforced by discriminator D matching the input (reconstruction loss).
-**Bottom**: Decoding with the texture code from a second image (Saint Basil's Cathedral) should look realistic (via D) and match the texture of the image, by training with a patch co-occurrence discriminator Dpatch that enforces the output and reference patches look indistinguishable.
-
-## Installation / Requirements
-
-- CUDA 10.1 or newer is required because it uses a custom CUDA kernel of [StyleGAN2](https://github.com/NVlabs/stylegan2/), ported by [@rosinality](https://github.com/rosinality/stylegan2-pytorch)
-- The author used PyTorch 1.7.1 on Python 3.6
-- Install dependencies with `pip install dominate torchgeometry func-timeout tqdm matplotlib opencv_python lmdb numpy GPUtil Pillow scikit-learn visdom ninja`
-
-## Testing and Evaluation.
-
-We provide the pretrained models and also several images that reproduce the figures of the paper. Please download and unzip them [here (2.1GB)](http://efrosgans.eecs.berkeley.edu/SwappingAutoencoder/swapping_autoencoder_models_and_test_images.zip). The scripts assume that the checkpoints are at `./checkpoints/`, and the test images at `./testphotos/`, but they can be changed by modifying `--checkpoints_dir` and `--dataroot` options.
-
-### Swapping and Interpolation of the mountain model using sample images
-
-<img src='imgs/interpolation.png' width="1000px"/>
-
-To run simple swapping and interpolation, specify the two input reference images, change `input_structure_image` and `input_texture_image` fields of
-`experiments/mountain_pretrained_launcher.py`, and run
-```bash
-python -m experiments mountain_pretrained test simple_swapping
-python -m experiments mountain_pretrained test simple_interpolation
-```
-
-The provided script, `opt.tag("simple_swapping")` and `opt.tag("simple_interpolation")` in particular of `experiments/mountain_pretrained_launcher.py`, invokes a terminal command that looks similar to the following one.
-
-```bash
-python test.py --evaluation_metrics simple_swapping \
---preprocess scale_shortside --load_size 512 \
---name mountain_pretrained \
---input_structure_image [path_to_sample_image] \
---input_texture_image [path_to_sample_image] \
---texture_mix_alpha 0.0 0.25 0.5 0.75 1.0
-```
-
-In other words, feel free to use this command if that feels more straightforward.
+<p align="center">
+<img src="https://img.shields.io/badge/code%20style-black-000000.svg"/>
+<img src="https://img.shields.io/github/issues/saxenabhishek/swapping-autoencoder-pytorch"/>
+<img src="https://img.shields.io/badge/license-MIT-blue" alt="license MIT"/>
+</p>
 
-The output images are saved at `./results/mountain_pretrained/simpleswapping/`.
+## 💡 Project Description
 
-### Texture Swapping
+Our project build on the paper [Swapping Autoencoder for Deep Image Manipulation](https://arxiv.org/abs/2007.00653) by [Taesung Park](http://taesung.me/), [Jun-Yan Zhu](https://www.cs.cmu.edu/~junyanz/), [Oliver Wang](http://www.oliverwang.info/), [Jingwan Lu](https://research.adobe.com/person/jingwan-lu/), [Eli Shechtman](https://research.adobe.com/person/eli-shechtman/), [Alexei A. Efros](http://www.eecs.berkeley.edu/~efros/), [Richard Zhang](https://richzhang.github.io/). Our goal with this project was to make it easier for artists to use it as a tool. In that effort, we have introduced 3 interfaces to interact with a pre-trained model and edit images.
 
-<img src='imgs/swapping.jpg' width="1000px"/>
-Our Swapping Autoencoder learns to disentangle texture from structure for image editing tasks such as texture swapping. Each row shows the result of combining the structure code of the leftmost image with the texture code of the top image.
+## 📺 Preview
 
-To reproduce this image (Figure 4) as well as Figures 9 and 12 of the paper, run
-the following command:
-```bash
+<div align="center">
+<img alt="Screenshot" src="imgs/demo2.png" />
+</div>
+<p float="left">
+<img src="imgs/lake-3.jpg.jpg" height="190" />
+<img src="imgs/aniket-deole-T-tOgjWZ0fQ-unspl.jpg" height="190" />
+<img src="imgs/night-snow.jpg" height="190" />
+<img src="imgs/desert.jpeg.jpg" height="190" />
+<img src="imgs/lake-2.jpg" height="190" />
+</p>
 
-# Reads options from ./experiments/church_pretrained_launcher.py
-python -m experiments church_pretrained test swapping_grid
+<p align="center">
+<h6>Some images we genarated with the streamlit inerface</h6>
+</p>
 
-# Reads options from ./experiments/bedroom_pretrained_launcher.py
-python -m experiments bedroom_pretrained test swapping_grid
+## 📌 Prerequisites
 
-# Reads options from ./experiments/mountain_pretrained_launcher.py
-python -m experiments mountain_pretrained test swapping_grid
+### 💻 System requirement :
 
-# Reads options from ./experiments/ffhq512_pretrained_launcher.py
-python -m experiments ffhq512_pretrained test swapping_grid
-```
+1. Nvidia GPU with + CUDA.
+2. Operating System : Any (Windows / Linux / Mac).
 
-Make sure the `dataroot` and `checkpoints_dir` paths are correctly set in
-the respective `./experiments/xx_pretrained_launcher.py` script.
+### 💿 Software requirement :
 
-### Quantitative Evaluations
+1. python 3.8
+2. poetry (Check out poetry [here](https://python-poetry.org/))
 
-To perform quantitative evaluation such as FID in Table 1, Fig 5, and Table 2, we first need to prepare image pairs of input structure and texture references images.
+## 🔧 Installation
 
-The reference images are randomly selected from the val set of LSUN, FFHQ, and the Waterfalls dataset. The pairs of input structure and texture images should be located at `input_structure/` and `input_style/` directory, with the same file name. For example, `input_structure/001.png` and `input_style/001.png` will be loaded together for swapping.
+### Step One - install python dependencies
 
-Replace the path to the test images at `dataroot="./testphotos/church/fig5_tab2/"` field of the script `experiments/church_pretrained_launcher.py`, and run
-```bash
-python -m experiments church_pretrained test swapping_for_eval
-python -m experiments ffhq1024_pretrained test swapping_for_eval
+```shell
+$ poetry install
 ```
 
-The results can be viewed at `./results` (that can be changed using `--result_dir` option).
-
-The FID is then computed between the swapped images and the original structure images, using https://github.com/mseitzer/pytorch-fid.
+### Step Two - Download pretrained models
 
-## Model Training.
+Head over to the [Testing and Evaluation section](https://github.com/taesungp/swapping-autoencoder-pytorch#testing-and-evaluation) of the official implementation of the paper and download the pretrained models and unzip them, put the checkpoints at `./checkpoints/`, you can change this location by specifying it at [`api/const.py:7`](https://github.com/saxenabhishek/swapping-autoencoder-pytorch/blob/febc81d644847324fb78a3414b97f330bfe84021/api/const.py#L7)
 
-### Datasets
+## 🏁 Quick Start
 
-- *LSUN Church and Bedroom* datasets can be downloaded [here](https://github.com/fyu/lsun). Once downloaded and unzipped, the directories should contain `[category]_[train/val]_lmdb/`.
-- [*FFHQ datasets*](https://github.com/NVlabs/ffhq-dataset) can be downloaded using this [link](https://drive.google.com/file/d/1WvlAIvuochQn_L_f9p3OdFdTiSLlnnhv/view?usp=sharing). This is the zip file of 70,000 images at 1024x1024 resolution. Unzip the files, and we will load the image files directly.
-- The *Flickr Mountains* dataset and the *Flickr Waterfall* dataset are not sharable due to license issues. But the images were scraped from [Mountains Anywhere](https://flickr.com/groups/62119907@N00/) and [Waterfalls Around the World](https://flickr.com/groups/52241685729@N01/), using the [Python wrapper for the Flickr API](https://github.com/alexis-mignon/python-flickr-api). Please contact [Taesung Park](http://taesung.me/) with title "Flickr Dataset for Swapping Autoencoder" for more details.
-
-### Training Scripts
-
-The training configurations are specified using the scripts in `experiments/*_launcher.py`. Use the following commands to launch various trainings.
-
-```bash
-# Modify |dataroot| and |checkpoints_dir| at
-# experiments/[church,bedroom,ffhq,mountain]_launcher.py
-python -m experiments church train church_default
-python -m experiments bedroom train bedroom_default
-python -m experiments ffhq train ffhq512_default
-python -m experiments ffhq train ffhq1024_default
-
-# By default, the script uses GPUtil to look at available GPUs
-# on the machine and sets appropriate GPU IDs. To specify specific set of GPUs,
-# use the |--gpu| option. Be sure to also change |num_gpus| option in the corresponding script.
-python -m experiments church train church_default --gpu 01234567
+**Streamlit Interface**
 
+```sh
+$ streamlit run streamlit_interface.py
 ```
 
-The training progress can be monitored using `visdom` at the port number specified by `--display_port`. The default is https://localhost:2004. For reference, the training takes 14 days on LSUN Church 256px, using 4 V100 GPUs.
+## 📦 Inside the box
 
-Additionally, a few swapping grids are generated using random samples of the training set.
-They are saved as webpages at `[checkpoints_dir]/[expr_name]/snapshots/`.
-The frequency of the grid generation is controlled using `--evaluation_freq`.
+Checkout our [wiki](https://github.com/saxenabhishek/swapping-autoencoder-pytorch/wiki) for more details
 
-All configurable parameters are printed at the beginning of training. These configurations are spreaded throughout the codes in `def modify_commandline_options` of relevant classes, such as `models/swapping_autoencoder_model.py`, `util/iter_counter.py`, or `models/networks/encoder.py`. To change these configuration, simply modify the corresponding option in `opt.specify` of the training script.
+## 📜 License
 
-The code for parsing and configurations are at `experiments/__init__.py, experiments/__main__.py, experiments/tmux_launcher.py`.
+`saxenabhishek/swapping-autoencoder-pytorch` is available under the MIT license. See the LICENSE file for more info.
 
-### Continuing training.
+## 🤝 Contributing
 
-The training continues by default from the last checkpoint, because the `--continue_train` option is set True by default.
-To start from scratch, remove the checkpoint, or specify `continue_train=False` in the training script (e.g. `experiments/church_launcher.py`).
+Please read [`Contributing.md`](https://github.com/SRM-IST-KTR/template/blob/main/Contributing.md) for details on our code of conduct, and the process for submitting pull requests to us.
 
-## Code Structure (Main Functions)
+## ⚙️ Maintainers
 
-- `models/swapping_autoencoder_model.py`: The core file that defines losses, produces visuals.
-- `optimizers/swapping_autoencoder_optimizer.py`: Defines the optimizers and alternating training of GAN.
-- `models/networks/`: contains the model architectures `generator.py`, `discriminator.py`, `encoder.py`, `patch_discrimiantor.py`, `stylegan2_layers.py`.
-- `options/__init__.py`: contains basic option flags. BUT many important flags are spread out over files, such as `swapping_autoencoder_model.py` or `generator.py`. When the program starts, these options are all parsed together. The best way to check the used option list is to run the training script, and look at the console output of the configured options.
-- `util/iter_counter.py`: contains iteration counting.
+| <p align="center">![Abhishek Saxena](https://github.com/saxenabhishek.png?size=128)<br>[Abhishek Saxena](https://github.com/saxenabhishek)</p> |
+| ---------------------------------------------------------------------------------------------------------------------------------------------- |
 
-## Change Log
-
-- 4/14/2021: The configuration to train the pretrained model on the Mountains dataset had not been set correctly, and was updated accordingly.
-
-## Bibtex
-If you use this code for your research, please cite our paper:
-```
-@inproceedings{park2020swapping,
-title={Swapping Autoencoder for Deep Image Manipulation},
-author={Park, Taesung and Zhu, Jun-Yan and Wang, Oliver and Lu, Jingwan and Shechtman, Eli and Efros, Alexei A. and Zhang, Richard},
-booktitle={Advances in Neural Information Processing Systems},
-year={2020}
-}
-```
-## Acknowledgment
+## 💥 Contributors
 
-The StyleGAN2 layers heavily borrows (or rather, directly copies!) the PyTorch implementation of [@rosinality](https://github.com/rosinality/stylegan2-pytorch). We thank Nicholas Kolkin for the helpful discussion on the automated content and style evaluation, Jeongo Seo and Yoseob Kim for advice on the user interface, and William T. Peebles, Tongzhou Wang, and Yu Sun for the discussion on disentanglement.
+<a href="https://github.com/saxenabhishek/swapping-autoencoder-pytorch/graphs/contributors">
+<img src="https://contrib.rocks/image?repo=saxenabhishek/swapping-autoencoder-pytorch" alt="Contributors">
+</a>
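
Taken together, the new README reduces setup to a short shell session. A sketch of the full sequence, assuming the pretrained checkpoints from the upstream repository have already been unzipped into `./checkpoints/`:

```shell
# install dependencies into a poetry-managed virtual environment
$ poetry install

# launch the interactive editor; streamlit serves on http://localhost:8501 by default
$ poetry run streamlit run streamlit_interface.py
```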

0 commit comments
