# Formal Algorithms For Transformers

PyTorch implementation of the transformer algorithms described in "Formal Algorithms for Transformers" by Mary Phuong and Marcus Hutter: https://arxiv.org/abs/2207.09238

Algorithm 1: Token embedding
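
As a rough illustration (not necessarily the repository's actual module), token embedding is a learned lookup table mapping token ids to vectors; the names `vocab_size` and `d_model` below are assumptions for the sketch:

```python
import torch
import torch.nn as nn

class TokenEmbedding(nn.Module):
    """Look up a learned d_model-dimensional vector for each token id."""
    def __init__(self, vocab_size: int, d_model: int):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, d_model)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (..., seq_len) integer ids -> (..., seq_len, d_model)
        return self.embedding(token_ids)
```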

Algorithm 2: Positional embedding
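
A minimal sketch of the learned-positional-embedding variant (the paper also discusses fixed sinusoidal encodings); `max_len` and `d_model` are illustrative names:

```python
import torch
import torch.nn as nn

class PositionalEmbedding(nn.Module):
    """Learned embedding of each position 0 .. max_len-1."""
    def __init__(self, max_len: int, d_model: int):
        super().__init__()
        self.embedding = nn.Embedding(max_len, d_model)

    def forward(self, seq_len: int) -> torch.Tensor:
        positions = torch.arange(seq_len, device=self.embedding.weight.device)
        # (seq_len, d_model); typically added to the token embeddings
        return self.embedding(positions)
```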

Algorithm 3: Basic single-query attention
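
A sketch of single-query attention under assumed dimension names (`d_x`, `d_z`, `d_attn`, `d_out`): project the query token and the context tokens, score with a scaled dot product, and return the softmax-weighted sum of the values:

```python
import math
import torch
import torch.nn as nn

class SingleQueryAttention(nn.Module):
    """Attend a single query vector x to a sequence of context vectors z."""
    def __init__(self, d_x: int, d_z: int, d_attn: int, d_out: int):
        super().__init__()
        self.q = nn.Linear(d_x, d_attn)
        self.k = nn.Linear(d_z, d_attn)
        self.v = nn.Linear(d_z, d_out)

    def forward(self, x: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # x: (d_x,), z: (seq_len, d_z)
        q = self.q(x)                            # (d_attn,)
        k = self.k(z)                            # (seq_len, d_attn)
        v = self.v(z)                            # (seq_len, d_out)
        scores = k @ q / math.sqrt(q.shape[-1])  # (seq_len,)
        alpha = torch.softmax(scores, dim=-1)    # attention weights
        return alpha @ v                         # (d_out,)
```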

Algorithm 4: Ṽ ← Attention(X, Z | W_qkv, Mask)
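
A possible full-sequence version with an optional boolean mask (`True` = allowed to attend), again with illustrative names rather than the repo's API:

```python
import math
import torch
import torch.nn as nn

class Attention(nn.Module):
    """Scaled dot-product attention over whole sequences with an optional mask."""
    def __init__(self, d_x: int, d_z: int, d_attn: int, d_out: int):
        super().__init__()
        self.q = nn.Linear(d_x, d_attn)
        self.k = nn.Linear(d_z, d_attn)
        self.v = nn.Linear(d_z, d_out)

    def forward(self, x, z, mask=None):
        # x: (len_x, d_x) primary sequence, z: (len_z, d_z) context sequence
        q, k, v = self.q(x), self.k(z), self.v(z)
        scores = q @ k.T / math.sqrt(q.shape[-1])  # (len_x, len_z)
        if mask is not None:
            # mask[t_x, t_z] == False hides context position t_z from t_x
            scores = scores.masked_fill(~mask, float("-inf"))
        return torch.softmax(scores, dim=-1) @ v   # (len_x, d_out)
```

For causal self-attention (X and Z the same sequence of length T), the mask can be built with `torch.tril(torch.ones(T, T, dtype=torch.bool))`.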

Algorithm 5: Ṽ ← MHAttention(X, Z | W, Mask)
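
Multi-head attention runs several such attention heads in parallel and mixes their concatenated outputs with a final projection; a compact sketch, assuming `d_model` divides evenly into `n_heads`:

```python
import math
import torch
import torch.nn as nn

class MHAttention(nn.Module):
    """Multi-head attention: H heads in parallel, concatenate, output projection."""
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x, z, mask=None):
        # x: (len_x, d_model), z: (len_z, d_model)
        def split(t):  # (len, d_model) -> (n_heads, len, d_head)
            return t.view(t.shape[0], self.n_heads, self.d_head).transpose(0, 1)
        q, k, v = split(self.q(x)), split(self.k(z)), split(self.v(z))
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)  # (H, len_x, len_z)
        if mask is not None:
            scores = scores.masked_fill(~mask, float("-inf"))
        heads = torch.softmax(scores, dim=-1) @ v                  # (H, len_x, d_head)
        concat = heads.transpose(0, 1).reshape(x.shape[0], -1)     # (len_x, d_model)
        return self.out(concat)
```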

Algorithm 6: ê ← layer_norm(e | γ, β)
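
Layer normalization rescales each activation vector to zero mean and unit variance, then applies a learned gain γ and offset β; a minimal sketch:

```python
import torch
import torch.nn as nn

class LayerNorm(nn.Module):
    """Normalize each activation vector, then rescale with gamma and shift by beta."""
    def __init__(self, d_model: int, eps: float = 1e-5):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(d_model))
        self.beta = nn.Parameter(torch.zeros(d_model))
        self.eps = eps

    def forward(self, e: torch.Tensor) -> torch.Tensor:
        mean = e.mean(dim=-1, keepdim=True)
        var = e.var(dim=-1, unbiased=False, keepdim=True)
        return self.gamma * (e - mean) / torch.sqrt(var + self.eps) + self.beta
```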

Algorithm 7: Unembedding
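
Unembedding maps a final activation back to a probability distribution over the vocabulary, i.e. a linear projection followed by a softmax; sketched here with assumed names:

```python
import torch
import torch.nn as nn

class Unembedding(nn.Module):
    """Map a d_model activation to a probability distribution over tokens."""
    def __init__(self, d_model: int, vocab_size: int):
        super().__init__()
        self.proj = nn.Linear(d_model, vocab_size, bias=False)

    def forward(self, e: torch.Tensor) -> torch.Tensor:
        return torch.softmax(self.proj(e), dim=-1)
```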

Algorithm 8: P ← EDTransformer(z, x | θ)

Algorithm 9: P ← ETransformer(x | θ)

Algorithm 10: P ← DTransformer(x | θ)
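
A decoder-only (GPT-style) transformer stacks masked self-attention and MLP blocks between the embedding and unembedding steps. The sketch below leans on PyTorch's built-in `nn.MultiheadAttention` and `nn.LayerNorm` rather than the hand-rolled modules above, and all hyperparameter names are illustrative:

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """One decoder layer: masked self-attention + MLP, each with a residual
    connection and a pre-norm layer norm."""
    def __init__(self, d_model: int, n_heads: int, d_mlp: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(nn.Linear(d_model, d_mlp), nn.GELU(),
                                 nn.Linear(d_mlp, d_model))

    def forward(self, x, causal_mask):
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, attn_mask=causal_mask)
        x = x + a
        return x + self.mlp(self.ln2(x))

class DTransformer(nn.Module):
    """Decoder-only transformer: embeddings, decoder blocks, final layer norm,
    and an unembedding to per-position token probabilities."""
    def __init__(self, vocab_size, max_len, d_model=64, n_heads=4, n_layers=2, d_mlp=256):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        self.blocks = nn.ModuleList([DecoderBlock(d_model, n_heads, d_mlp)
                                     for _ in range(n_layers)])
        self.ln_f = nn.LayerNorm(d_model)
        self.unembed = nn.Linear(d_model, vocab_size, bias=False)

    def forward(self, token_ids):  # token_ids: (batch, seq_len) integer ids
        T = token_ids.shape[1]
        x = self.tok(token_ids) + self.pos(torch.arange(T, device=token_ids.device))
        # True above the diagonal = future positions are masked out
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool,
                                     device=token_ids.device), diagonal=1)
        for block in self.blocks:
            x = block(x, mask)
        return torch.softmax(self.unembed(self.ln_f(x)), dim=-1)  # (batch, T, vocab)
```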

Algorithm 11: θ̂ ← EDTraining(z_{1:N_data}, x_{1:N_data}, θ)

Algorithm 12: θ̂ ← ETraining(x_{1:N_data}, θ)

Algorithm 13: θ̂ ← DTraining(x_{1:N_data}, θ)
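
Decoder-only training is maximum-likelihood next-token prediction; a bare-bones loop, assuming the model returns per-position token probabilities as in the DTransformer sketch above (the function and argument names are illustrative):

```python
import torch

def train_decoder(model, sequences, epochs=10, lr=1e-3):
    """Minimal next-token prediction training loop (a sketch, not the repo's code).
    `sequences` is an iterable of (batch, seq_len) integer tensors."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for token_ids in sequences:
            probs = model(token_ids)                      # (batch, seq_len, vocab)
            # predict token t+1 from the prefix ending at t
            pred = probs[:, :-1].reshape(-1, probs.shape[-1])
            target = token_ids[:, 1:].reshape(-1)
            loss = -torch.log(pred.gather(1, target[:, None]).squeeze(1) + 1e-9).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```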

Algorithm 14: y ← DInference(x, θ̂)
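
Inference repeatedly feeds the growing sequence back into the trained model, samples the next token (optionally with a temperature), and appends it; a sketch under the same assumptions as above:

```python
import torch

@torch.no_grad()
def d_inference(model, prompt_ids, max_new_tokens=20, temperature=1.0):
    """Autoregressive sampling sketch. Assumes `model` returns probabilities
    of shape (batch, seq_len, vocab_size); `prompt_ids` is (1, prompt_len)."""
    ids = prompt_ids.clone()
    for _ in range(max_new_tokens):
        probs = model(ids)[0, -1]                 # distribution over the next token
        if temperature != 1.0:
            probs = torch.softmax(torch.log(probs + 1e-9) / temperature, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id[None, :]], dim=1)
    return ids
```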
