Fix the data-dependent initialization. #37
Merged
See issue #36
In data-dependent initialization, the output of conv2d should be recomputed after g and b are updated, so that the layer returns activations that reflect the new g and b.
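To illustrate the ordering, here is a minimal sketch. It is not the code from this PR. It assumes the usual weight-normalization scheme (g set to the inverse of the per-unit standard deviation, b set to the negative scaled mean), and it uses a dense layer in pure Python rather than conv2d to stay short. The function name `weight_norm_dense_init` and its arguments are hypothetical.

```python
import math
import random
import statistics


def weight_norm_dense_init(x, v, eps=1e-8):
    """Data-dependent init for a weight-normalized dense layer (illustrative).

    x: list of input rows (batch x n_in)
    v: list of direction vectors (n_out x n_in)
    Returns (g, b, y), where y is computed with the *updated* g and b.
    """
    # Normalize each direction vector to unit length.
    v_unit = []
    for row in v:
        norm = math.sqrt(sum(c * c for c in row))
        v_unit.append([c / (norm + eps) for c in row])

    # Pre-activations with the initial parameters g = 1, b = 0.
    t = [[sum(xi * vi for xi, vi in zip(sample, row)) for row in v_unit]
         for sample in x]

    # Derive g and b from the batch statistics of each output unit.
    g, b = [], []
    for j in range(len(v_unit)):
        column = [out[j] for out in t]
        mean = statistics.fmean(column)
        std = statistics.pstdev(column)
        scale = 1.0 / (std + eps)
        g.append(scale)
        b.append(-mean * scale)

    # The step this PR is about: recompute the output with the updated
    # g and b instead of returning the pre-update activations t.
    y = [[g[j] * tj + b[j] for j, tj in enumerate(row)] for row in t]
    return g, b, y


random.seed(0)
x = [[random.gauss(3.0, 5.0) for _ in range(4)] for _ in range(64)]
v = [[random.gauss(0.0, 0.05) for _ in range(4)] for _ in range(3)]
g, b, y = weight_norm_dense_init(x, v)
```

If the function returned `t` instead of `y`, any layer stacked on top would compute its own initialization statistics from unnormalized activations, which is the kind of problem the recomputation is meant to avoid.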
On the CIFAR-10 dataset, the test loss (bits_per_dim) decreases faster with the new data-dependent initialization. I have only tested the first few epochs so far and will update the running log later.
new initialization:
Iteration 0, time = 3317s, train bits_per_dim = 4.1773, test bits_per_dim = 7.6512
Iteration 1, time = 3275s, train bits_per_dim = 3.7190, test bits_per_dim = 4.4990
original initialization:
Iteration 0, time = 3317s, train bits_per_dim = 4.1814, test bits_per_dim = 11.5115
Iteration 1, time = 3277s, train bits_per_dim = 3.6735, test bits_per_dim = 8.9926
Iteration 2, time = 3277s, train bits_per_dim = 3.5277, test bits_per_dim = 5.1907