capsule_nn

PyTorch implementation of Capsule Networks, an idea from the NIPS 2017 paper Dynamic Routing Between Capsules.

The calculation process

Here we discuss only the calculation between the primary capsule layer and the dense capsule layer.

For example, suppose the primary capsule layer outputs $10$ feature vectors, each of dimension $8$, so the input tensor has shape $(1, 10, 8)$. The leading $1$ is the batch dimension, since we assume a single example is fed to the network. Suppose further that the output consists of $5$ feature vectors, each of dimension $4$, so the output tensor has shape $(1, 5, 4)$.

Next we apply an affine transformation to the data. A single input feature vector, viewed as a column vector, has shape $(8, 1)$. The first step is to multiply it by a matrix of shape $(4, 8)$, which changes the dimension of the feature vector: after the multiplication it has shape $(4, 1)$. Since there are $10$ input feature vectors, the affine weight tensor has shape $(10, 4, 8)$. To make the matrix multiplication work, we reshape the input feature tensor from $(1, 10, 8)$ to $(1, 10, 8, 1)$, i.e. we represent each feature vector as a column vector.

These ten transformed features together vote for one category. But there are five categories, so we need $5$ such affine transformations, and the full affine weight tensor therefore has shape $(5, 10, 4, 8)$. Because we use PyTorch for the matrix multiplication, we reshape the input feature tensor to $(1, 1, 10, 8, 1)$; PyTorch then broadcasts it across the category axis, and the multiplication yields a tensor of shape $(1, 5, 10, 4, 1)$.
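The broadcast matrix multiplication above can be sketched as follows. This is a minimal shape-checking example, not the repository's actual code; the variable names (`u`, `W`, `u_hat`) are our own, and the weights are random placeholders:

```python
import torch

# Sizes from the text: batch 1, 10 input capsules of dim 8,
# 5 output capsules of dim 4.
batch, in_caps, in_dim = 1, 10, 8
out_caps, out_dim = 5, 4

u = torch.randn(batch, in_caps, in_dim)              # (1, 10, 8)
W = torch.randn(out_caps, in_caps, out_dim, in_dim)  # (5, 10, 4, 8)

# Add a broadcast axis for the output capsules and turn each
# feature vector into a column vector: (1, 10, 8) -> (1, 1, 10, 8, 1).
u_col = u.unsqueeze(1).unsqueeze(-1)

# Broadcast matmul: (5, 10, 4, 8) @ (1, 1, 10, 8, 1) -> (1, 5, 10, 4, 1).
# The matrix product acts on the last two axes, (4, 8) @ (8, 1) = (4, 1),
# and the leading batch axes are broadcast together.
u_hat = torch.matmul(W, u_col)
print(u_hat.shape)  # torch.Size([1, 5, 10, 4, 1])
```

Keeping the trailing singleton axis makes the later routing steps read as plain matrix algebra on column vectors.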

To implement the dynamic routing algorithm, we initialize a logit tensor of shape $(1, 5, 10)$ and apply a softmax along the dimension with index $1$ to obtain the assignment (coupling) coefficients. We then update the logits using the dynamic routing algorithm introduced in the paper.
