David Ciprut

Memory-efficient Transformers via Top-$k$ Attention

Jun 13, 2021
Via arXiv