sublayerout = layerNorm(x + sublayer(x))
The residual connection comes first, followed by layer normalization.
In your code, I think sublayer.py should be:
def forward(self, x, sublayer):
    "Apply residual connection to any sublayer with the same size."
    # return x + self.dropout(sublayer(self.norm(x)))
    return self.norm(x + self.dropout(sublayer(x)))
In transformer.py:
def forward(self, x, mask):
    x = self.input_sublayer(x, lambda _x: self.attention.forward(_x, _x, _x, mask=mask))
    x = self.output_sublayer(x, lambda _x: self.feed_forward.forward(_x))
    return self.dropout(x)
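To make the difference concrete, here is a minimal sketch of the two orderings side by side. The class names `PostNormSublayer` and `PreNormSublayer` are hypothetical, and `nn.LayerNorm` is assumed here as a stand-in for the repository's own LayerNorm module; this is only an illustration, not the repository's code.

```python
import torch
import torch.nn as nn


class PostNormSublayer(nn.Module):
    """Post-norm: LayerNorm(x + Dropout(sublayer(x))), the ordering proposed here."""

    def __init__(self, size, dropout):
        super().__init__()
        self.norm = nn.LayerNorm(size)  # assumption: stand-in for the repo's LayerNorm
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, sublayer):
        # Residual connection first, then layer normalization.
        return self.norm(x + self.dropout(sublayer(x)))


class PreNormSublayer(nn.Module):
    """Pre-norm: x + Dropout(sublayer(LayerNorm(x))), as in the commented-out line."""

    def __init__(self, size, dropout):
        super().__init__()
        self.norm = nn.LayerNorm(size)  # assumption: stand-in for the repo's LayerNorm
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, sublayer):
        # Normalize the sublayer input; the residual path itself is left unnormalized.
        return x + self.dropout(sublayer(self.norm(x)))


torch.manual_seed(0)
x = torch.randn(2, 4, 8)   # (batch, sequence, hidden), sizes chosen arbitrarily
ff = nn.Linear(8, 8)       # toy sublayer that preserves the hidden size

post = PostNormSublayer(8, dropout=0.0).eval()
pre = PreNormSublayer(8, dropout=0.0).eval()

y_post = post(x, ff)
y_pre = pre(x, ff)
```

With the default LayerNorm parameters, `y_post` is normalized along the last dimension while `y_pre` is not, so the two variants give different outputs for the same input and sublayer.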
My understanding of the paper differs from yours here; please correct me if I have made a mistake.