Fix padding_idx logical error in Adaptive Input (facebookresearch#1629)
Summary:
# Before submitting

- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/master/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?

If we keep passing the **vocabulary's padding index** as `padding_idx` to every adaptive embedding band, some words can never be trained.

For example, if `cut_off` is (20000, 60000) and the vocabulary is larger than 60000, the [**20,000 + padding_idx**]th and [**60,000 + padding_idx**]th words cannot be learned: the subtraction that maps their ids into a band turns them into **padding_idx**, so they always come out as zero tensors.

So I set `self.padding_idx` to `None` right after the vocabulary's `padding_idx` has been used
**for the first (head) embedding band**.
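
A minimal sketch of the failure mode, assuming the band lookup subtracts the lower cutoff before indexing that band's embedding table (a simplified stand-in for the adaptive input forward pass, not the actual fairseq code); `padding_idx = 1` and the embedding size of 64 are made-up values:

```python
import torch
import torch.nn as nn

cutoff = (20000, 60000)
padding_idx = 1  # the vocabulary's padding index (assumed value)

# Second band built with the *vocabulary* padding_idx -- the buggy behaviour.
band1 = nn.Embedding(cutoff[1] - cutoff[0], 64, padding_idx=padding_idx)

token_id = cutoff[0] + padding_idx    # a real word, id 20001
band_local_id = token_id - cutoff[0]  # the subtraction maps it to 1 == padding_idx
vec = band1(torch.tensor([band_local_id]))
print(vec.abs().sum())                # sums to 0 -- the row is zeroed and its gradient stays zero
```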
Pull Request resolved: facebookresearch#1629

Differential Revision: D19557340

Pulled By: myleott

fbshipit-source-id: e0c3b38862374d422a46dc62c248b2ecfbf08fd2
Huffon authored and facebook-github-bot committed Jan 24, 2020
1 parent 0ff92b9 commit 4f71c63
Showing 1 changed file with 3 additions and 1 deletion.
fairseq/modules/adaptive_input.py: 3 additions & 1 deletion
@@ -39,10 +39,12 @@ def __init__(
             size = self.cutoff[i] - prev
             dim = int(initial_dim // (factor ** i))
             seq = nn.Sequential(
-                nn.Embedding(size, dim, padding_idx),
+                nn.Embedding(size, dim, self.padding_idx),
                 nn.Linear(dim, output_dim, bias=False)
             )
             self.embeddings.append(seq)
+            self.padding_idx = None
+        self.padding_idx = padding_idx

         def init_weights(m):
             if isinstance(m, nn.Embedding):
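
For reference, a self-contained sketch of the fixed construction loop, paraphrased from the diff above; the sizes (`vocab_size`, `initial_dim`, `output_dim`, `factor`) and `padding_idx = 1` are placeholder values, and the plain local variables stand in for the module's attributes:

```python
import torch.nn as nn

vocab_size, padding_idx = 70000, 1
cutoff = (20000, 60000, vocab_size)
initial_dim, output_dim, factor = 512, 512, 4

embeddings = nn.ModuleList()
current_padding_idx = padding_idx  # stands in for self.padding_idx
prev = 0
for i in range(len(cutoff)):
    size = cutoff[i] - prev
    dim = int(initial_dim // (factor ** i))
    embeddings.append(nn.Sequential(
        nn.Embedding(size, dim, current_padding_idx),
        nn.Linear(dim, output_dim, bias=False),
    ))
    current_padding_idx = None  # only the head band treats padding_idx as padding
    prev = cutoff[i]

# Head band keeps the padding index; tail bands can now train every row.
print([seq[0].padding_idx for seq in embeddings])  # [1, None, None]
```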
