resort
yzy1996 committed Oct 25, 2022
1 parent 1648ae4 commit 3827cec
Showing 33 changed files with 84 additions and 188 deletions.
File renamed without changes.
File renamed without changes.
33 changes: 33 additions & 0 deletions 1-Variational AutoEncoder (VAE)/README.md
@@ -1,5 +1,29 @@
# Variational Auto-Encoder (VAE)

The majority of research efforts on improving VAEs are dedicated to statistical challenges, such as:

- reducing the gap between approximate and true posterior distribution
- formulating tighter bounds
- reducing the gradient noise
- extending VAEs to discrete variables
- tackling posterior collapse
- designing special network architectures
- previous work largely borrows architectures from classification tasks



VAEs maximize the mutual information between the input and latent variables, requiring the networks to retain the information content of the input data as much as possible.

Information maximization in noisy channels: A variational approach
**[`NeurIPS 2017`]**

Deep variational information bottleneck
**[`ICLR 2017`]**
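
As a concrete reference point, here is a minimal PyTorch sketch of the standard VAE objective (a reconstruction term plus the KL between the approximate posterior and a unit-Gaussian prior). The network sizes and variable names are illustrative assumptions, not taken from any paper listed above.

```python
# Minimal VAE sketch (illustrative sizes; Bernoulli decoder).
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=32, h_dim=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)       # posterior mean
        self.logvar = nn.Linear(h_dim, z_dim)   # posterior log-variance
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.dec(z), mu, logvar

def neg_elbo(recon_logits, x, mu, logvar):
    # Reconstruction term + KL(q(z|x) || N(0, I)), averaged over the batch.
    recon = F.binary_cross_entropy_with_logits(recon_logits, x, reduction='sum')
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return (recon + kl) / x.size(0)

# Usage (illustrative): x is a batch of flattened inputs in [0, 1].
model = VAE()
x = torch.rand(8, 784)
logits, mu, logvar = model(x)
neg_elbo(logits, x, mu, logvar).backward()
```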





Learning resources

https://jaan.io/what-is-variational-autoencoder-vae-tutorial/
@@ -78,3 +102,12 @@ VQVAE learns an intermediate code through the encoder and then, via nearest-neighbor search, maps the intermediate

[This effectively achieves a form of compression.]





References:

https://www.jeremyjordan.me/variational-autoencoders/

https://www.jeremyjordan.me/autoencoders/
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
217 changes: 29 additions & 188 deletions README.md
@@ -4,14 +4,10 @@
[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](https://GitHub.com/Naereen/StrapDown.js/graphs/commit-activity)
[![PR's Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat)](http://makeapullrequest.com)

A collection of resources on 2D Generative Model.
A collection of resources on 2D generative models, which utilize generator functions that map low-dimensional latent codes to high-dimensional data outputs.

A collection of resources on generative models which utilize generator functions that map low-dimensional latent codes to high-dimensional data outputs.



We would define a prior distribution over the latent space; however, this prior may not match the true, unknown data manifold, which is an obstacle to accurate generation.

## Contributing

Feedback and contributions are welcome! If you think I have missed something or have any suggestions (papers, implementations, and other resources), feel free to open a pull request or leave an issue. I will release the [latex-pdf version]() in the future. :arrow_down: markdown format:
@@ -24,46 +20,58 @@ Feedback and contributions are welcome! If you think I have missed out on someth

:smile: Now you can use this [script](https://github.com/yzy1996/Python-Code/tree/master/Python%2BarXiv) to automatically generate the above text.

## Category

**3D-Aware Generation** has been moved to **[Learn 3D from 2D](https://github.com/yzy1996/Awesome-Learn-3D-From-2D)**

## Contents

**GAN-related sources** have been moved to **[GAN](https://github.com/yzy1996/Awesome-GANs)**

**3D-Aware Generation** has been moved to **[Learn 3D from 2D](https://github.com/yzy1996/Awesome-Learn-3D-From-2D)**



1. [Variational AutoEncoder (VAE)](./1-Variational-AutoEncoder-(VAE))
2. [Diffusion Model](./2-Diffusion-Model)
3. [Energy-Based Model (EBM)](./3-Energy-Based-Model-(EBM))
4. [Flow](./4-Flow)
5. [Representation Learning](./5-Representation-Learning)
6. [Disentangled Representation](./6-Disentangled-Representation)
7. [Text-to-Image](./7-Text-to-Image)
8. [Evaluation & Loss](./8-Evaluation-&-Loss)
9. [Others](./Others)



## Introduction

photorealistic image synthesis
![img](https://raw.githubusercontent.com/yzy1996/Image-Hosting/master/generative-overview.png)

- high resolution
- content controllable


<details><summary>Introduction (translated from Chinese)</summary><p>

compositional nature of scenes
Representation and reconstruction have always been two inseparable research topics.

- individual objects' shapes
- appearances
- background
The core goal is reconstruction. It is like seeing a picture and wanting to describe it to someone else so that they can imagine the same scene: a person abstracts the picture into a few features, for example that it shows natural scenery with many trees and lots of green, and the listener, drawing on their own prior experience, can then reconstruct the picture from these descriptions. Or it is like the police building a portrait of a suspect from a witness's description: the person is characterized through a set of features.

Machines need a similar paradigm, though perhaps without human-like semantic understanding. For interpretability and controllability, we want machines to work with a set of features that humans can understand.

</p>
</details>

Modern computer graphics (CG) techniques have achieved impressive results and are the industry standard in gaming and movie production. However, they are expensive in hardware and computation and require substantial repetitive labor.

Therefore, the ability to generate and manipulate photorealistic image content is a long-standing goal of computer vision and graphics.

These models try to model the real world by generating realistic samples from latent representations.
The ability to generate and manipulate photorealistic image content (**high resolution** & **content controllable**) is a long-standing goal of computer vision and graphics. We try to model the real world by generating realistic samples from latent representations.



<Generating images with sparse representations> divides deep generative models broadly into three categories:
Deep generative models can be divided broadly into three categories:

- Generative Adversarial Networks
- **Generative Adversarial Networks**

> use discriminator networks trained to distinguish samples produced by the generator network from real examples
- Likelihood-based Model
- **Likelihood-based Model**

> directly optimize the model log-likelihood or the evidence lower bound.
@@ -75,178 +83,11 @@ There models try to model the real world by generating realistic samples from la

- autoregressive models

- Energy-based Models
- **Energy-based Models**

> estimate a scalar energy for each example that corresponds to an unnormalized log-probability


### VAE

The majority of research efforts on improving VAEs are dedicated to statistical challenges, such as:

- reducing the gap between approximate and true posterior distribution
- formulating tighter bounds
- reducing the gradient noise
- extending VAEs to discrete variables
- tackling posterior collapse
- designing special network architectures
- previous work largely borrows architectures from classification tasks



VAEs maximize the mutual information between the input and latent variables, requiring the networks to retain the information content of the input data as much as possible.

Information maximization in noisy channels: A variational approach
**[`NeurIPS 2017`]**

Deep variational information bottleneck
**[`ICLR 2017`]**





Representation and reconstruction have always been two inseparable research topics.

The core goal is reconstruction. It is like seeing a picture and wanting to describe it to someone else so that they can imagine the same scene: a person abstracts the picture into a few features, for example that it shows natural scenery with many trees and lots of green, and the listener, drawing on their own prior experience, can then reconstruct the picture from these descriptions.

Or it is like the police building a portrait of a suspect from a witness's description: the person is characterized through a set of features.

Machines need a similar paradigm, though perhaps without human-like semantic understanding.

For interpretability and controllability, we want machines to work with a set of features that humans can understand.

![image-20220612154943172](https://raw.githubusercontent.com/yzy1996/Image-Hosting/master/image-20220612154943172.png)



AutoDecoder





Here we also need to mention the reconstruction loss.



## Introduction

Generative models can be divided into two classes:

- implicit generative models (IGMs)
- explicit generative models (EGMs)



Our goal is to train a model $\mathbb{Q}_{\theta}$ which aims to approximate a target distribution $\mathbb{P}$ over a space $\mathcal{X} \subseteq \mathbb{R}^{d}$.

Normally we define $\mathbb{Q}_{\theta}$ by a generator function $G_{\theta}: \mathcal{Z} \rightarrow \mathcal{X}$, implemented as a deep network with parameters $\theta$, where $\mathcal{Z}$ is a space of latent vectors, say $\mathbb{R}^{128}$. We assume a fixed Gaussian distribution on $\mathcal{Z}$, and call $\mathbb{Q}_{\theta}$ the distribution of $G_{\theta}(Z)$.

The model is learned by minimizing a discrepancy $\mathcal{D}$ between distributions, with the properties $\mathcal{D}(\mathbb{P}, \mathbb{Q}_{\theta}) \geq 0$ and $\mathcal{D}(\mathbb{P}, \mathbb{P})=0$.



We can build the loss $\mathcal{D}$ based on the Maximum Mean Discrepancy (MMD),
$$
\operatorname{MMD}_{k}(\mathbb{P}, \mathbb{Q})=\sup _{f:\|f\|_{\mathcal{H}_{k}} \leq 1} \mathbb{E}_{X \sim \mathbb{P}}[f(X)]-\mathbb{E}_{Y \sim \mathbb{Q}}[f(Y)]
$$
where $\mathcal{H}_k$ is the reproducing kernel Hilbert space with a kernel $k$.
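
For illustration, here is a minimal NumPy sketch of the (biased) empirical estimate of $\operatorname{MMD}^2_k$ with an RBF kernel; the bandwidth $\sigma$ and the sample shapes are assumptions made for the example, not prescribed by the formula above.

```python
import numpy as np

def rbf_kernel(a, b, sigma=1.0):
    # k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)), computed pairwise.
    sq_dists = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
    return np.exp(-sq_dists / (2 * sigma**2))

def mmd2(x, y, sigma=1.0):
    # Biased empirical estimate of MMD^2_k(P, Q) from samples x ~ P, y ~ Q.
    kxx = rbf_kernel(x, x, sigma).mean()
    kyy = rbf_kernel(y, y, sigma).mean()
    kxy = rbf_kernel(x, y, sigma).mean()
    return kxx + kyy - 2 * kxy

# Example: samples from two Gaussians with different means.
x = np.random.randn(500, 2)
y = np.random.randn(500, 2) + 1.0
print(mmd2(x, y))
```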





Alternatively, we can use the Wasserstein distance,
$$
\mathcal{W}(\mathbb{P}, \mathbb{Q})=\sup _{f:\|f\|_{\text {Lip }} \leq 1} \mathbb{E}_{X \sim \mathbb{P}}[f(X)]-\mathbb{E}_{Y \sim \mathbb{Q}}[f(Y)]
$$
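
In one dimension, the Wasserstein-1 distance between two equal-size empirical samples has a closed form: sort both samples and average the absolute differences. A small NumPy sketch (the sample sizes and distributions are illustrative):

```python
import numpy as np

def wasserstein1_1d(x, y):
    # For equal-size 1-D empirical samples, W1 is the mean absolute
    # difference between the sorted samples (optimal transport is monotone).
    return np.mean(np.abs(np.sort(x) - np.sort(y)))

x = np.random.randn(1000)
y = np.random.randn(1000) + 2.0
print(wasserstein1_1d(x, y))  # roughly 2.0
```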





There are three main methods:

- VAE

- GAN
- Flow

They all learn from the training data and use the learned model to generate or predict new instances.



Similarity: all of them use random noise and measure the discrepancy between the noise-induced distribution and the real data distribution.

Differences: GANs aim to fit the data distribution, VAEs aim to find a latent representation of the data, and flows establish a relation between training data and generated data.

For GANs and flows, inputs and outputs are in one-to-one correspondence, whereas for VAEs they are not.



In terms of training loss functions:

VAEs maximize the ELBO, whose purpose is maximum likelihood estimation. Maximum likelihood is equivalent to minimizing a KL divergence, but this KL is not between the data and the noise; it is the KL between the $p(x)$ given by the model and the $p(x)$ exhibited by the data (see the derivation sketch below).

GANs minimize the JS divergence, which is likewise between the $p(x)$ given by the model and the $p(x)$ exhibited by the data.

Training flow models is also very direct and is likewise maximum likelihood estimation. However, because flow models use invertible neural networks, learning inference, i.e. learning the latent representation, is much easier than for the other two.
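
A short derivation sketch of the VAE claim above, using standard identities written in this document's notation:
$$
\log p_\theta(x)=\underbrace{\mathbb{E}_{q_\phi(z \mid x)}\left[\log \frac{p_\theta(x, z)}{q_\phi(z \mid x)}\right]}_{\operatorname{ELBO}(x)}+\operatorname{KL}\left(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\right) \geq \operatorname{ELBO}(x)
$$

$$
\mathbb{E}_{x \sim p_{\text{data}}}\left[\log p_\theta(x)\right]=-\operatorname{KL}\left(p_{\text{data}} \,\|\, p_\theta\right)-H\left(p_{\text{data}}\right)
$$

so maximizing the ELBO maximizes a lower bound on the data log-likelihood, which, up to the constant entropy $H(p_{\text{data}})$, is the same as minimizing $\operatorname{KL}(p_{\text{data}} \,\|\, p_\theta)$.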




## GAN 2014

Generative Adversarial Networks (GANs) have emerged as a powerful class of generative models. In particular, they are able to synthesize photorealistic images at high resolutions ($1024 \times 1024$ pixels) that can hardly be distinguished from real ones.



GANs and their variants



Trained with adversarial methods, they bypass the need to compute densities, at the expense of good density estimation.

Generative adversarial networks (GANs) represent a zero-sum game between two machine players, a generator and a discriminator, designed to learn the distribution of data.



> The generator only needs to fool the discriminator.
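
A minimal PyTorch sketch of one round of this two-player game on toy 2-D data, using the standard non-saturating generator loss; the network sizes, learning rates, and synthetic "real" data are illustrative assumptions.

```python
# Minimal GAN training loop sketch (toy 2-D data, illustrative sizes).
import torch
import torch.nn as nn
import torch.nn.functional as F

z_dim, x_dim = 16, 2
G = nn.Sequential(nn.Linear(z_dim, 64), nn.ReLU(), nn.Linear(64, x_dim))
D = nn.Sequential(nn.Linear(x_dim, 64), nn.ReLU(), nn.Linear(64, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

for step in range(200):
    x_real = torch.randn(128, x_dim) + 3.0   # stand-in "real" data
    z = torch.randn(128, z_dim)

    # Discriminator step: distinguish real samples from generated ones.
    x_fake = G(z).detach()
    d_loss = (F.binary_cross_entropy_with_logits(D(x_real), torch.ones(128, 1))
              + F.binary_cross_entropy_with_logits(D(x_fake), torch.zeros(128, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: fool the discriminator (non-saturating loss).
    g_loss = F.binary_cross_entropy_with_logits(D(G(z)), torch.ones(128, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```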


## VAE 2013

at the cost of learning two neural networks (an encoder and a decoder)





## VAE-GAN

combines a VAE with a GAN



## Bijective GNN



## Flow



## Inverse Rendering / Graphics

Given 2D image observations, these approaches aim to infer a 3D-structure-aware representation of the underlying scene that enables prior-based predictions about occluded parts.



References:

https://www.jeremyjordan.me/variational-autoencoders/

https://www.jeremyjordan.me/autoencoders/
22 changes: 22 additions & 0 deletions 结构.md
@@ -261,3 +261,25 @@ Neural Radiance Field (NeRF)







Our goal is to train a model $\mathbb{Q}_{\theta}$ which aims to approximate a target distribution $\mathbb{P}$ over a space $\mathcal{X} \subseteq \mathbb{R}^{d}$.

Normally we define $\mathbb{Q}_{\theta}$ by a generator function $G_{\theta}: \mathcal{Z} \rightarrow \mathcal{X}$, implemented as a deep network with parameters $\theta$, where $\mathcal{Z}$ is a space of latent vectors, say $\mathbb{R}^{128}$. We assume a fixed Gaussian distribution on $\mathcal{Z}$, and call $\mathbb{Q}_{\theta}$ the distribution of $G_{\theta}(Z)$.

The model is learned by minimizing a discrepancy $\mathcal{D}$ between distributions, with the properties $\mathcal{D}(\mathbb{P}, \mathbb{Q}_{\theta}) \geq 0$ and $\mathcal{D}(\mathbb{P}, \mathbb{P})=0$.



We can build the loss $\mathcal{D}$ based on the Maximum Mean Discrepancy (MMD),
$$
\operatorname{MMD}_{k}(\mathbb{P}, \mathbb{Q})=\sup _{f:\|f\|_{\mathcal{H}_{k}} \leq 1} \mathbb{E}_{X \sim \mathbb{P}}[f(X)]-\mathbb{E}_{Y \sim \mathbb{Q}}[f(Y)]
$$
where $\mathcal{H}_k$ is the reproducing kernel Hilbert space with a kernel $k$.

Wasserstein distance
$$
\mathcal{W}(\mathbb{P}, \mathbb{Q})=\sup _{f:\|f\|_{\text {Lip }} \leq 1} \mathbb{E}_{X \sim \mathbb{P}}[f(X)]-\mathbb{E}_{Y \sim \mathbb{Q}}[f(Y)]
$$
