Title: FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention

Aug 05, 2021

Tan M. Nguyen, Vai Suliafu, Stanley J. Osher, Long Chen, Bao Wang

Abstract: We propose FMMformers, a class of efficient and flexible transformers inspired by the celebrated fast multipole method (FMM) for accelerating interacting particle simulation. FMM decomposes particle-particle interaction into near-field and far-field components and then performs direct and coarse-grained computation, respectively. Similarly, FMMformers decompose the attention into near-field and far-field attention, modeling the near-field attention by a banded matrix and the far-field attention by a low-rank matrix. Computing the attention matrix for FMMformers requires linear complexity in computational time and memory footprint with respect to the sequence length. In contrast, standard transformers suffer from quadratic complexity. We analyze and validate the advantage of FMMformers over the standard transformer on the Long Range Arena and language modeling benchmarks. FMMformers can even outperform the standard transformer in terms of accuracy by a significant margin. For instance, FMMformers achieve an average classification accuracy of $60.74\%$ over the five Long Range Arena tasks, which is significantly better than the standard transformer's average accuracy of $58.70\%$.
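
The decomposition described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract only states that near-field attention is a banded matrix and far-field attention is a low-rank matrix. The function names, the `elu(x) + 1` feature map used for the low-rank part, and the fixed blending weights `w_near` and `w_far` are all assumptions made here for illustration. Neither function materializes the full n-by-n attention matrix, so the cost grows linearly with the sequence length n for a fixed bandwidth and feature dimension.

```python
import numpy as np

def near_field_attention(Q, K, V, bandwidth):
    """Banded softmax attention: token i attends only to tokens j
    with |i - j| <= bandwidth. Cost is O(n * bandwidth * d)."""
    n, d = Q.shape
    out = np.zeros((n, V.shape[1]))
    for i in range(n):
        lo, hi = max(0, i - bandwidth), min(n, i + bandwidth + 1)
        scores = Q[i] @ K[lo:hi].T / np.sqrt(d)
        weights = np.exp(scores - scores.max())  # numerically stable softmax
        weights /= weights.sum()
        out[i] = weights @ V[lo:hi]
    return out

def feature_map(x):
    """elu(x) + 1, a positive feature map (an assumption of this sketch)."""
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def far_field_attention(Q, K, V):
    """Low-rank attention phi(Q) phi(K)^T V with row normalization.
    Multiplying phi(K)^T V first avoids forming an n-by-n matrix."""
    Qf, Kf = feature_map(Q), feature_map(K)
    KV = Kf.T @ V                   # (d, d_v), independent of n in size
    norm = Qf @ Kf.sum(axis=0)      # (n,), row sums of phi(Q) phi(K)^T
    return (Qf @ KV) / norm[:, None]

def fmm_style_attention(Q, K, V, bandwidth=2, w_near=0.5, w_far=0.5):
    """Blend near-field and far-field outputs.
    The fixed weights are hypothetical placeholders."""
    near = near_field_attention(Q, K, V, bandwidth)
    far = far_field_attention(Q, K, V)
    return w_near * near + w_far * far

rng = np.random.default_rng(0)
Q = rng.standard_normal((16, 4))
K = rng.standard_normal((16, 4))
V = rng.standard_normal((16, 3))
out = fmm_style_attention(Q, K, V, bandwidth=2)
print(out.shape)
```

Setting `bandwidth` to at least the sequence length makes the near-field term equal to ordinary dense softmax attention, which gives a convenient check on the banded loop.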

* 18 pages, 8 figures

View paper on OpenReview
