Model Zoo

A collection of deep learning models, with tutorials on how to run them on Paperspace.


  1. AttnGAN
  2. Google's Neuron-based Deep Dream
  3. Class-Based Deep Dream
  4. 2D Neural Style Transfer
  5. 2D to 3D Deep Dreaming
  6. 2D to 3D Style Transfer
  7. 2D to 3D Vertex Optimization
  8. Pix2PixHD
  9. Pix2Pix
  10. CycleGAN
  11. PG-GAN
  12. StyleGAN2
  13. Tree-GAN

AttnGAN for Image generation from Text

PyTorch implementation for reproducing the AttnGAN results in the paper AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks by Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, Xiaodong He. (This work was performed when Tao was an intern with Microsoft Research.) Currently, it uses weights trained on the COCO dataset.

Running AttnGAN

  • Upload and unzip the file in the storage folder using Jupyter notebook
  • In the coco folder modify captions.txt file for the desired text/caption input
  • Start an experiment with the following parameters in paperspace:
    • Container: brannondorsey/docker-stackgan
    • Workspace:
    • Command: bash
  • Generated images are written to the storage/coco/coco_AttnGAN2/captions directory, one set per input sentence in the captions.txt file. Each sentence generates 5 image files; the file whose name ends in _g2.png is the final output.
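Once a run has finished, the final images can be gathered with a few lines of Python. This is a minimal sketch that relies only on the naming rule described above (final outputs end in _g2.png); the helper name `final_outputs` and the absolute /storage path are assumptions for illustration.

```python
from pathlib import Path


def final_outputs(captions_dir):
    """Return the sorted final-stage images (file names ending in _g2.png)."""
    return sorted(Path(captions_dir).rglob("*_g2.png"))


if __name__ == "__main__":
    # Assumed location, based on the output directory named above.
    for path in final_outputs("/storage/coco/coco_AttnGAN2/captions"):
        print(path)
```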

Google Deep Dream

DeepDream is an experiment that visualizes the patterns learned by a neural network. Similar to when a child watches clouds and tries to interpret random shapes, DeepDream over-interprets and enhances the patterns it sees in an image. It does so by forwarding an image through the network, then calculating the gradient of the activations of a particular layer with respect to the image. The image is then modified to increase these activations, enhancing the patterns seen by the network, and resulting in a dream-like image. This process was dubbed "Inceptionism". This is Google's implementation of deep dream that is used for their web UI.
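The core loop is gradient ascent on the image. The NumPy sketch below illustrates the idea on a toy problem: a single linear layer with a ReLU stands in for the real network, and the weights, image size, and step size are all made-up values. It is not the code used by this container.

```python
import numpy as np


def layer_activation(image, weights):
    """Toy 'layer': a linear map followed by ReLU (stand-in for a real network)."""
    return np.maximum(weights @ image, 0.0)


def dream_step(image, weights, step_size=0.1):
    """One gradient-ascent step on the objective 0.5 * sum(activation ** 2)."""
    act = layer_activation(image, weights)
    # Gradient of 0.5 * sum(relu(W x) ** 2) with respect to x is W^T relu(W x).
    grad = weights.T @ act
    # Normalise so the step size is comparable across iterations.
    grad = grad / (np.abs(grad).mean() + 1e-8)
    return image + step_size * grad


rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 16))  # hypothetical layer weights
image = rng.normal(size=16)         # flattened toy "image"

before = np.sum(layer_activation(image, weights) ** 2)
for _ in range(30):                 # plays the role of the iteration count
    image = dream_step(image, weights)
after = np.sum(layer_activation(image, weights) ** 2)
print(before, after)                # the activation energy grows as the image "dreams"
```

More iterations push the activations further, which is why a larger iteration count gives a stronger dreaming effect.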

This code allows you to augment an input image with the learned features for a specific neuron or every neuron in the final layer of a classification neural network. Thus, for a single input image, this code will either generate a single output image, or generate 143 output images, each using the learned features from a different neuron. When using a pre-trained network, you first need to upload the file folder to /storage using the notebook tool in the web GUI, and unzip it. A nice advantage of using this code in lieu of the Google web UI is that it accepts an input image of any resolution.

2D deep dream Docker container image:



Command Format:


Where IMAGE_DATA is the location of your input image on paperspace, WHICH_NEURON is a number 0-143 to specify which neuron you would like to use for dreaming. If you would like to use all of them, specify 'all' instead of a number. MODEL_DIR is the location of inception5_weights in paperspace, RESULTS_DIR is the location in paperspace where the output image(s) will be saved, and NUM_ITERS determines how many times the algorithm will perform the dreaming operation (more iterations will have a stronger dreaming effect in the input image).

Command Example:

bash /storage/2Dmodels/tree1.jpg /storage/inception5h_weights/tensorflow_inception_graph.pb /storage/test_dream 30

Class-based Deep Dream

This code allows you to augment an input image with the learned features of a specific class from a trained classification neural network, specifically a VGG network. This technique is used for visualising the class models learnt by image classification ConvNets. Given a trained classification ConvNet and a class of interest from the network's training dataset, this visualisation method generates an image with features that are representative of what the ConvNet has learned to detect for that class. In other words, it shows which features the network expects in an input image in order to maximize the score of the output node for that class.

Evaluating 2D class-based deep dream

When using a pre-trained network, you first need to upload the pretrained VGG neural network weights folder to /storage using the notebook tool in the web GUI. The next subsection details how to train a VGG network on your own dataset. Note that the input image must be in RGB format (i.e., three channels).

2D deep dream Docker container image:



Command Format:


Where IMAGE_DATA is the location of your input image on paperspace, WEIGHTS_DIR is the location of the VGG network weights in paperspace, and DREAM_CLASS is the class you would like to use for dreaming. If you would like to use all of them, specify 'all' instead of a number. RESULTS_DIR is the location in paperspace where the output image(s) will be saved, NUM_ITERS determines how many times the algorithm will perform the dreaming operation (more iterations will have a stronger dreaming effect in the input image), and IMAGE_H and IMAGE_W are the output image height and width, respectively.

Command Example:

bash /storage/2Dmodels/scene0_camloc_0_5_-20_rgb.png /storage/acadia_general_arch_styles_netweights gothic /storage/test 500 720 1280

Training a classification network on your dataset to use for 2D class-based deep dream

This code produces a trained VGG network that can be used to perform 2D class-based dreaming (as described in the previous section). Note that the weights directory in the below command is where the weights will be saved. You will need to upload your classification dataset to /storage before running this code. It assumes that your dataset contains a folder for each different class of images, and that each of these folders contains images only for that class.
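Before starting a training job, it can help to confirm that the dataset follows the one-folder-per-class layout described above. The sketch below counts images per class folder; the helper name `images_per_class` and the accepted file extensions are assumptions, not part of the training code.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumed; adjust to your data


def images_per_class(dataset_dir):
    """Map each class folder name to the number of image files it contains."""
    counts = {}
    for class_dir in sorted(Path(dataset_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts


if __name__ == "__main__":
    # Dataset path taken from the command example in this section.
    print(images_per_class("/storage/classificationdataset"))
```

A class with a count of zero, or an unexpected folder name, usually points to an upload or unzip problem.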

2D deep dream Docker container image:



Command Format:


Where TRAIN_IMAGE_DIR is the location of your classification training dataset on paperspace, TRAIN_EPOCHS determines how long your classification network will train for, WEIGHTS_DIR is the location where the finetuned VGG network weights will be saved in paperspace, and IMAGE_H and IMAGE_W are the desired image height and width that your trained VGG network will operate on, respectively. Note that the network can only operate on images that are the same size as those it was trained on.

Command Example:

bash /storage/classificationdataset /storage/acadia_general_arch_styles_netweights gothic /storage/test 500

Neural Style Transfer

Neural style transfer is an optimization technique used to take two images—a content image and a style reference image (such as an artwork by a famous painter)—and blend them together so the output image looks like the content image, but “painted” in the style of the style reference image. This is implemented by optimizing the output image to match the content statistics of the content image and the style statistics of the style reference image. These statistics are extracted from the images using a convolutional network. You can run this code with no mask, which means that the style will be transferred to the entire content image, or with a mask, which will apply the style to only the locations in the content image specified by the mask. Currently, the mask is a binary mask the same size as the content image, where a pixel value of 1 indicates a region to transfer style to and 0 indicates where the content image will be unaffected by the style transfer.
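The meaning of the binary mask can be illustrated with a simple compositing step. This NumPy sketch only demonstrates the mask convention (1 = take the styled pixel, 0 = keep the content pixel); the actual code may apply the mask inside the optimization rather than as a blend afterwards, and `apply_style_mask` is a hypothetical name.

```python
import numpy as np


def apply_style_mask(content, stylized, mask):
    """Blend two HxWx3 images with an HxW binary mask (1 = styled, 0 = content)."""
    if mask.shape != content.shape[:2]:
        raise ValueError("mask must have the same height and width as the content image")
    mask3 = mask[..., np.newaxis].astype(content.dtype)  # broadcast over the RGB channels
    return mask3 * stylized + (1.0 - mask3) * content


# Tiny example: only the top-left pixel receives the style.
content = np.zeros((2, 2, 3))
stylized = np.ones((2, 2, 3))
mask = np.array([[1, 0], [0, 0]])
blended = apply_style_mask(content, stylized, mask)
```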

Running 2D style transfer

First, you will need to create a notebook via the web GUI and upload the VGG weights, called imagenet-vgg-verydeep-19.mat, to /storage. You will also need to upload a content image and a style guide image, and if desired the mask image, to /storage. You can download the vgg weights from

2D style transfer Docker container image:



Command Format for running with no mask:


Command Example for running with no mask:

bash /storage/2Dmodels/robotics_building_satellite.png /storage/2Dmodels/new000343.png /artifacts 500 5.0 1.0 100

Command Format for running with mask:


Command Example for running with mask:

bash /storage/2Dmodels/robotics_building_satellite.png /storage/2Dmodels/new000343.png /storage/2Dmodels/new000343_mask.png /artifacts/styleoutputdir 500 5.0 1.0 100

Running 2D to 3D neural renderer for 3D deep dreaming

This project uses the neural 3D mesh renderer (CVPR 2018) by H. Kato, Y. Ushiku, and T. Harada to achieve dreaming and style transfer in 3D. It builds upon the code in (

Note that before running any jobs in this project, you will need to upload the desired 3D models (in .obj format) to the paperspace /storage space. Add each 3D model to /storage/3Dmodels. You do not need to worry about uploading pretrained weights; the code handles this under the hood.

Neural Renderer Docker container image:



Command Format:


Command Example:

bash /storage/3Dmodels/bench.obj 3Ddreamed_bench.gif /artifacts/results_3D_dream 512 300

Running 2D to 3D neural renderer for 2D to 3D style transfer

This project uses the neural 3D mesh renderer (CVPR 2018) by H. Kato, Y. Ushiku, and T. Harada to achieve dreaming and style transfer in 3D. It builds upon the code in (

Note that before running any jobs in this project, you will need to upload the desired 3D models (in .obj format) to the paperspace /storage space. Add each 3D model to /storage/3Dmodels and any 2D models (i.e., images) to /storage/2Dmodels. You do not need to worry about uploading pretrained weights; the code handles this under the hood.

Neural Renderer Docker container image:



Command Format:


Command Example:

bash /storage/3Dmodels/TreeCartoon1_OBJ.obj /storage/2Dmodels/new000524.png 2Dgeo_3Dtree.gif /artifacts/results_2D_to_3D_styletransfer 1.0 2e9 1000

Running 2D to 3D neural renderer for 2D to 3D vertex optimization

This project uses the neural 3D mesh renderer (CVPR 2018) by H. Kato, Y. Ushiku, and T. Harada to achieve dreaming and style transfer in 3D. It builds upon the code in (

Note that before running any jobs in this project, you will need to upload the desired 3D models (in .obj format) to the paperspace /storage space. Add each 3D model to /storage/3Dmodels and any 2D models (i.e., images) to /storage/2Dmodels. You do not need to worry about uploading pretrained weights; the code handles this under the hood.

Neural Renderer Docker container image:



Command Format:


Command Example:

bash /storage/3Dmodels/TreeCartoon1_OBJ.obj /storage/2Dmodels/new000524.png 2Dgeo_3Dtree /artifacts/results_vertoptim 250

Pix2pixHD for paired image-to-image translation

We will step through how to train and test the high-resolution GAN model pix2pixHD in the Paperspace Experiment Builder.

First, pix2pixHD is a generative adversarial neural network that transforms one dataset of high-resolution images, which we refer to as data domain A, into the style of a different high-resolution dataset, which we refer to as data domain B. Note that the data in domain A must be paired with the data in domain B: each image in domain A must correspond to an image in domain B that has the same spatial structure. A good example is having domain A be a collection of semantic segmentation maps (each object class is a different pixel color) and domain B be the corresponding RGB images. Effectively, pix2pixHD paints the RGB color palette/textures of the shapes in domain B onto the appropriate shapes in domain A.

We use the term domain to describe a dataset because each dataset captures a limited amount of visual information, and that information can vary greatly between datasets. Thus, in a sense, each dataset is its own little world, or domain! The easiest way to understand this is to compare a dataset of purely daytime images with a dataset of purely nighttime images. While both datasets may capture similar structures (buildings, roads, cars, people, etc.), the overall appearance/style is drastically different. pix2pixHD was originally developed to transform semantic segmentation maps into corresponding images, but it can be trained to transfer any one dataset into the style of a different dataset.

The pix2pixHD Docker container you can use for both training and testing your model:


The workspace you can use for both training and testing your model:

Training pix2pixHD

For training pix2pixHD, you will need to upload your input data domain A to /storage/train_A, and your output data domain B to /storage/train_B. As a reminder, the pix2pixHD model requires that the images in domain A are paired with images in domain B; this means that the spatial structure for a pair of images should be similar. For example, domain A could be semantic segmentation maps and domain B would be the corresponding RGB images, and a pair of images would be the semantic segmentation map of a specific scene and the corresponding RGB image. Because of this requirement, the filenames need to be the same for image pairs. For example, an image pair would be /storage/train_A/0001.png and /storage/train_B/0001.png. Note that you will also need to create a folder, /storage/checkpoints_dir, where your model weights and intermediate generated images will be saved during training. For more information, please visit the pix2pix GitHub repository, which includes instructions for training and testing.
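Because pairing is done by filename, a mismatch between the two folders is an easy mistake to make. The following sketch lists files that appear in only one of the two folders; `unpaired_files` is a hypothetical helper, and the paths are the ones named above.

```python
from pathlib import Path


def unpaired_files(dir_a, dir_b):
    """Return (only_in_a, only_in_b): filenames present in just one folder."""
    names_a = {p.name for p in Path(dir_a).iterdir() if p.is_file()}
    names_b = {p.name for p in Path(dir_b).iterdir() if p.is_file()}
    return sorted(names_a - names_b), sorted(names_b - names_a)


if __name__ == "__main__":
    only_a, only_b = unpaired_files("/storage/train_A", "/storage/train_B")
    print("missing from train_B:", only_a)
    print("missing from train_A:", only_b)
```

Both lists should be empty before training starts.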

Command Format:

python --name <RUNNAME> --dataroot /storage/example_dataset --checkpoints_dir /storage/checkpoints --label_nc 0 --no_instance

Testing pix2pixHD

For testing pix2pixHD, you will need to upload your input data domain A to /storage/test_A. You will also need trained network weights, which should be stored in /storage/checkpoints_dir.

Command Format:

python --name <RUNNAME_OF_TRAINED_NETWORK> --dataroot /storage/example_dataset --checkpoints_dir /storage/checkpoints_from_training --results_dir /artifacts/pix2pixhd_testoutputs --resize_or_crop none $@

Pix2pix for paired image-to-image translation

pix2pix is a generative adversarial neural network that transforms one dataset of images, which we refer to as data domain A, into the style of a different dataset, which we refer to as data domain B. Note that the data in domain A must be paired with the data in domain B: each image in domain A must correspond to an image in domain B that has the same spatial structure. A good example is having domain A be a collection of semantic segmentation maps (each object class is a different pixel color) and domain B be the corresponding RGB images.

The workspace you can use for both training and testing your model:

The pix2pixHD Docker container you can use for processing data:


Processing Data

For training, create a folder /path/to/data with subfolders A and B. A and B should each have their own subfolders train, etc. In /path/to/data/A/train, put training images in style A. In /path/to/data/B/train, put the corresponding images in style B. Corresponding images in a pair {A,B} must be the same size and have the same filename, e.g., /path/to/data/A/train/1.jpg is considered to correspond to /path/to/data/B/train/1.jpg.

Change --fold_A, --fold_B and --fold_AB to your own domain A path, domain B path and output directory path, respectively.

Command Format:

python datasets/ --fold_A /storage/example_dataset/A --fold_B /storage/example_dataset/B --fold_AB /storage/example_dataset/data

Note: this step only needs to be rerun when the dataset changes.
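Conceptually, the aligned pix2pix dataset format stores each {A,B} pair as a single image with A and B placed side by side. The NumPy sketch below illustrates that idea on arrays; it is not the processing script itself, and `combine_pair` is a hypothetical name.

```python
import numpy as np


def combine_pair(image_a, image_b):
    """Concatenate two equally sized HxWxC images along the width axis."""
    if image_a.shape != image_b.shape:
        raise ValueError("paired images must have the same size")
    return np.concatenate([image_a, image_b], axis=1)


# Two toy 4x6 RGB images become one 4x12 image: A on the left, B on the right.
a = np.zeros((4, 6, 3), dtype=np.uint8)
b = np.full((4, 6, 3), 255, dtype=np.uint8)
ab = combine_pair(a, b)
```

This is also why the two images in a pair must be the same size.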

The pix2pixHD Docker container you can use for both training and testing your model:


Training pix2pix

Change --dataroot, --name and --checkpoints_dir to your own dataset path, model name and checkpoint directory, respectively.

Command Format:

python --dataroot /storage/example_dataset/data --name experiment --checkpoints_dir /storage/ckp --n_epochs 3 --batch_size 1 --load_size 286 --crop_size 256 --model pix2pix --netG unet_256 --direction AtoB --lambda_L1 100 --dataset_mode aligned --norm batch --pool_size 0

Testing pix2pix

For testing pix2pix, you will need to upload your input data domain A to /storage/example_dataset/test_A. You will also need trained network weights, which should be stored in /storage/checkpoints_dir.

Command Format:

python --dataroot /storage/example_dataset/test_A --name experiment --checkpoints_dir /storage/ckp --results_dir /artifacts --batch_size 1 --load_size 256 --crop_size 256 --model test --netG unet_256 --direction AtoB --dataset_mode single --norm batch

Progressive growing of GANs (PG-GAN)

PG-GAN functions like a standard GAN framework: the Generator neural network takes in a latent noise vector and projects it into the pixel space of RGB images that constitute the 'real' dataset you wish the GAN to model. The Discriminator network determines if its input image is real or fake (i.e., rendered by the Generator network). Each network is influenced by the other's error, which trains the Generator to produce highly realistic images. After training, the Discriminator network is discarded and the Generator is used to produce novel images that would reasonably come from the training dataset, but do not exist in the training dataset.

In the paperspace persistent storage (using the Jupyter notebook) you will need to create the folder /storage/pggan_dataset and upload your training image dataset there. This training folder should contain your training images (jpg or png); the naming convention of the images does not matter. For example, a given image named train_img-1.png should be located at /storage/pggan_dataset/train_img-1.png. All images should be resized and/or cropped to 1024 by 1024 (you can also use 512x512 or attempt 2048x2048; note that larger images take much longer to run).
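For the cropping step, one common choice is to take the largest centered square and then resize it to the target resolution. The pure-Python sketch below computes that crop box; the helper name is hypothetical, and center cropping is only one reasonable option.

```python
def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


# A 1920x1080 image keeps its full height and loses 420 pixels on each side.
box = center_square_box(1920, 1080)
print(box)
```

The box uses the (left, upper, right, lower) convention, so it can be passed to an image library that accepts such boxes (for example Pillow's `Image.crop`) before resizing.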

The PG-GAN Docker container that you can use for both training and testing the model:


The workspace you can use for both training and testing your model:

Training PG-GAN

The command format used for training a model:

python PGAN -c dataset.json -d OUTPUT_DIR -n EXPERIMENT_NAME --no_vis

where EXPERIMENT_NAME is a name you create for your model, and /storage/OUTPUT_DIR is where the training outputs are stored. We recommend that you create the OUTPUT_DIR directory in storage so you will be able to access intermediate images produced during the training process, which can take days to weeks depending upon the image size. The dataset.json file is part of the PG-GAN training framework; it specifies that your dataset is located at /storage/pggan_dataset, and thus needs to be included in the training command.

Testing PG-GAN

The command format used for testing an already-trained model; note that EXPERIMENT_NAME should match the one you used to train the model.


EXPERIMENT_NAME is the same name you used in training, PATH_TO_THE_OUTPUT_DATASET is where your output images will be saved, NUMBER_IMAGES_IN_THE_OUTPUT_DATASET is the number of images you would like to output, and CHECKPOINT_LOCATION is the location of your network checkpoints/weights in paperspace storage.

CycleGAN for unpaired image-to-image translation:

The paper proposes a method that can capture the characteristics of one image domain and figure out how these characteristics could be translated into another image domain, all in the absence of any paired training examples. CycleGAN uses a special cycle consistency loss to enable training without the need for paired data. In other words, it can translate from one domain to another without a one-to-one mapping between the source and target domain. This opens up the possibility of many interesting tasks such as photo enhancement, image colorization, style transfer, etc. All you need is the source and the target dataset (each of which is simply a directory of images). Check out their GitHub page to see some cool examples of what this framework can do!

Your training dataset should have the following subfolders: trainA, which holds your training images for dataset/domain A; trainB, which holds your training images for dataset/domain B; and testA, which holds the test images from dataset/domain A that will be transferred to the style of domain B after training. You will also need to create a folder in /storage to store the model weights; this is the CHECKPT_DIR below.
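The cycle consistency idea can be written down in a few lines: translating A to B and back to A should return the original image, and likewise for B. The NumPy sketch below uses toy functions in place of the two generator networks and omits the loss weighting and adversarial terms used in real training; all names are illustrative.

```python
import numpy as np


def cycle_consistency_loss(real_a, real_b, g_ab, g_ba):
    """L1 cycle loss: A -> B -> A and B -> A -> B should reproduce the input."""
    rec_a = g_ba(g_ab(real_a))
    rec_b = g_ab(g_ba(real_b))
    return np.mean(np.abs(rec_a - real_a)) + np.mean(np.abs(rec_b - real_b))


def toy_g_ab(x):
    """Stand-in for the A -> B generator."""
    return 2.0 * x + 1.0


def toy_g_ba(y):
    """Stand-in for the B -> A generator (exact inverse of toy_g_ab)."""
    return (y - 1.0) / 2.0


rng = np.random.default_rng(0)
real_a = rng.uniform(size=(4, 4, 3))
real_b = rng.uniform(size=(4, 4, 3))

consistent = cycle_consistency_loss(real_a, real_b, toy_g_ab, toy_g_ba)
inconsistent = cycle_consistency_loss(real_a, real_b, toy_g_ab, lambda y: y)
```

When the two mappings invert each other the loss is essentially zero; when they do not, the loss is positive, and that is the signal that lets training proceed without paired examples.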

The cycleGAN Docker container you can use for both training and testing your model:


The workspace you can use for both training and testing your model:

Training cycleGAN

The command format used for training a model:

python --dataroot DATASET_PATH --name EXPERIMENT_NAME --checkpoints_dir CHECKPT_DIR --load_size RESIZE --crop_size CROP_SIZE --model cycle_gan

where DATASET_PATH is the location of your dataset on paperspace, EXPERIMENT_NAME is a name you create for your model, CHECKPT_DIR is the location where the weights and output images will be saved, RESIZE is the size that your images will be scaled to, and CROP_SIZE is the size the images will be cropped to after scaling. The --load_size and --crop_size flags are optional.

Command Example:

python --dataroot /storage/cyclegan-dataset --name train_cyclegan --checkpoints_dir /storage/cgan_checkpts --load_size 256 --crop_size 128 --model cycle_gan --display_id 0

To see intermediate training image results, check out CHECKPT_DIR/EXPERIMENT_NAME/web/index.html

Testing cycleGAN

The command format used for testing an already-trained model:

python --dataroot DATASET_PATH --name EXPERIMENT_NAME --checkpoints_dir CHECKPT_DIR --model cycle_gan

Command Example:

python --dataroot /storage/cyclegan-dataset --name mapstest_cyclegan --checkpoints_dir /storage/cgan_checkpts --model cycle_gan --results_dir /artifacts

Note that EXPERIMENT_NAME needs to be the same one you used to train the model/generate the weights, and likewise for CHECKPT_DIR. The test results will be saved to an HTML file here: /artifacts/results/EXPERIMENT_NAME/latest_test/index.html

StyleGAN 2

In the paperspace persistent storage (using the jupyter notebook) you will need to create the folder /storage/your_stylegan_dataset (though you could name it whatever name you'd like) and upload your training image dataset there. This training folder should contain your training images (jpg or png); the naming convention of the images does not matter. For example, a given image named train_img-1.png should be located at /storage/your_stylegan_dataset/train_img-1.png. They should all be resized and/or cropped to 1024 by 1024 (you can also do 512x512 or attempt 2048x2048; note that the larger images take much longer to run and will require heftier computing resources/memory).

The StyleGAN Docker container that you can use for both training and testing the model:


The workspace you can use for both training and testing your model:

Training StyleGAN

The command format used for training a model:


where DATASET_DIR is where the input/training data is stored, EXPERIMENT_NAME is a name you create for your model, RESULTS_DIR is where the training outputs are stored, and MODEL_DIR is where the model checkpoints/weights are stored. We recommend that you create the RESULTS_DIR and MODEL_DIR directories in storage so you will be able to access intermediate images produced during the training process, which can take days to weeks depending upon the image size. IMG_SIZE is the size of the images you want to train on (e.g., 512 or 1024) and BATCH_SIZE is the batch size of images for training (decrease it if you run into an OOM error).

A command example for training could be:

bash myFirstStyleganExperiment /storage/your_stylegan_dataset /storage/your_stylegan_resultsdir /storage/your_stylegan_modeldir 512 4

Testing StyleGAN

Once you have trained your StyleGAN2 model, use the following command to generate images with it:


where RESULTS_DIR is the location to save the generated images and MODEL_DIR is the location of the trained weights.

Example command:

bash /storage/your_stylegan_outputdir/ /storage/stylegan_model


Tree-GAN

PyTorch implementation of the paper 3D Point Cloud Generative Adversarial Network Based on Tree Structured Graph Convolutions by Dong Wook Shu, Sung Woo Park, Junseok Kwon.

The Docker container that you can use for both training and testing the model:


The workspace you can use for both training and testing your model:

Training Tree-GAN

Instructions before training:

  1. Upload your data (.pts files) to Paperspace storage; the full path to this folder is the DATASET_PATH.
  2. Regarding checkpoints:
  • Create a folder in persistent storage to store intermediate checkpoints; the full path to this folder is the CKPT_PATH.
  • If you are training a model from scratch (starting at the 0th epoch), set CKPT_LOAD=None.
  • If you want to resume training a model, set CKPT_PATH to the full path of the checkpoint folder and CKPT_LOAD to the name of the checkpoint from which you want to resume training.
  3. Create a folder in storage to save results; the full path to this folder is the RESULT_PATH. When you start training, the intermediate .pts files will be found under RESULT_PATH/Points, and RESULT_PATH/Matplot_Images will contain plotted images of the pointclouds.
  4. Set BATCH_SIZE=20. If you get an out-of-memory (OOM) error, reduce BATCH_SIZE until the error goes away.
  5. Set POINT_NUM=4096, the maximum number of points used to generate a pointcloud for your data models.
  6. Train for EPOCHS=1000, view the intermediate results, and select the best model from the saved checkpoints.
  7. SAVE_AT_EPOCH creates a checkpoint every SAVE_AT_EPOCH epochs. For example, if EPOCHS=1000 and SAVE_AT_EPOCH=10, a checkpoint will be created at every 10th epoch (a total of 100 checkpoints will be created).
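The checkpoint arithmetic above can be checked with a short sketch, which is useful for estimating how much storage a run will need. It assumes epochs are counted from 1 and that a checkpoint is written whenever the epoch number is a multiple of SAVE_AT_EPOCH; the training code's exact indexing may differ.

```python
def checkpoint_epochs(epochs, save_at_epoch):
    """Epochs at which a checkpoint would be written (1-based, assumed)."""
    return [e for e in range(1, epochs + 1) if e % save_at_epoch == 0]


saved = checkpoint_epochs(1000, 10)
print(len(saved), saved[:3], saved[-1])
```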

Command Format for Training:


A command example for training from scratch could be:

bash /storage/Church/TreeGAN/Point 15 4096 /storage/Church/TreeGAN/Checkpoint None /storage/Church/TreeGAN/Result 1000 10

A command example for resuming training could be:

bash /storage/Church/TreeGAN/Point 15 4096 /storage/Church/TreeGAN/Checkpoint /storage/Church/TreeGAN/Result 1000 10

The arguments are:

  • DATASET_PATH is where the input/training data is stored.
  • BATCH_SIZE is the batch size for training (decrease it if you run into an OOM error).
  • POINT_NUM=4096 is the maximum number of points used to generate a pointcloud for your data models.
  • CKPT_PATH is the path to the checkpoint directory where intermediate model checkpoints are stored.
  • CKPT_LOAD is the intermediate checkpoint to load if you want to resume training. It is a .pt file located in the CKPT_PATH directory. If you are training from scratch, set CKPT_LOAD=None.
  • RESULT_PATH is where results are saved. When you begin training, the intermediate .pts files will be found under RESULT_PATH/Points, and RESULT_PATH/Matplot_Images will contain plotted images of the pointclouds.
  • EPOCHS is the number of epochs to train the model.
  • SAVE_AT_EPOCH creates a checkpoint every SAVE_AT_EPOCH epochs.

Testing Tree-GAN

Instructions before testing

  1. CKPT_PATH should contain the full path to the checkpoint folder, CKPT_LOAD should contain the name of the checkpoint file present in CKPT_PATH
  2. Create a folder in storage or artifacts in which the generated images will be saved. These 3D images give you a quick check on how the model is performing. The full path to this folder should be provided in SAVE_IMAGES.
  3. Create a folder in storage to save the generated .pts files. This folder can be zipped and downloaded onto your local system. The full path to this folder should be provided in SAVE_PTS_FILES.
  4. SEED is a random integer; change this number to get different results from the trained model.
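The role of SEED can be illustrated with a seeded random number generator: the same seed always yields the same latent noise, so the same output is reproduced, while a different seed yields different noise. This NumPy sketch is only an analogy; the testing script seeds its own generator, and the vector size used here is a placeholder.

```python
import numpy as np


def sample_latent(seed, dim=8):
    """Draw a latent noise vector; `dim` is a placeholder, not the model's real size."""
    return np.random.default_rng(seed).normal(size=dim)


same = np.array_equal(sample_latent(52), sample_latent(52))       # identical noise
different = np.array_equal(sample_latent(52), sample_latent(53))  # new noise
print(same, different)
```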

Once you have trained your Tree-GAN model, use the following command to generate pointclouds with it:

Command Format for Testing:


Example of a Command used for Testing:

bash 4096 /storage/TreeGAN_dataset/Paper_checkpoints/model/checkpoints /storage/TreeGAN_dataset/Testing_images /storage/TreeGAN_dataset/Testing_Pts_files 52