Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size

Aug 16, 2020


View paper on arXiv
