
Yin-Wen Chang

Leveraging redundancy in attention with Reuse Transformers

Oct 13, 2021

Demystifying the Better Performance of Position Encoding Variants for Transformer

Apr 18, 2021

$O(n)$ Connections are Expressive Enough: Universal Approximability of Sparse Transformers

Jun 08, 2020

Pre-training Tasks for Embedding-based Large-scale Retrieval

Feb 10, 2020