Rethinking Batch Normalization in Transformers

Mar 17, 2020
Figures 1 to 4 for Rethinking Batch Normalization in Transformers


View paper on arXiv
