Weight initialization for neural nets
- Paper: Understanding the difficulty of training deep feedforward neural networks
- Blog: link
- Summary: Weights can be initialized from a zero-mean Gaussian distribution with variance 1/n, where n is the number of input dimensions. This keeps the variance of a layer's output roughly equal to the variance of its input.
- Paper: Han, Song, et al. "Learning both weights and connections for efficient neural networks." Advances in Neural Information Processing Systems. 2015.
- Comment: weights with large magnitude play a more important role than small ones, so the small ones can be pruned.
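
The 1/n variance rule from the summary can be checked numerically. The sketch below is only an illustration, not code from either paper: it assumes purely linear layers with no bias or activation, and the names `init_weights` and `variance_through_layers`, the layer width, depth, and batch size are all made up for this example.

```python
import numpy as np


def init_weights(n_in, n_out, rng):
    """Draw an (n_in, n_out) weight matrix from a zero-mean Gaussian with variance 1/n_in."""
    return rng.normal(loc=0.0, scale=np.sqrt(1.0 / n_in), size=(n_in, n_out))


def variance_through_layers(n=512, depth=5, batch=2000, seed=0):
    """Track the activation variance after each linear layer (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(batch, n))  # input with unit variance
    variances = [x.var()]
    for _ in range(depth):
        # Linear layer only: no bias and no nonlinearity (an assumption of this sketch).
        x = x @ init_weights(n, n, rng)
        variances.append(x.var())
    return variances


variances = variance_through_layers()
print([round(float(v), 2) for v in variances])
```

Each printed value should stay close to the input variance. Replacing the scale with, say, `1.0` makes the variance grow by roughly a factor of n per layer, which is the problem the 1/n rule is meant to avoid.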