Title: Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing

Jun 05, 2020

Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le

Abstract: With the success of language pretraining, it is highly desirable to develop more efficient architectures of good scalability that can exploit the abundant unlabeled data at a lower cost. To improve efficiency, we examine the much-overlooked redundancy in maintaining a full-length token-level representation, especially for tasks that only require a single-vector representation of the sequence. With this intuition, we propose Funnel-Transformer, which gradually compresses the sequence of hidden states to a shorter one and hence reduces the computation cost. More importantly, by re-investing the FLOPs saved by length reduction in constructing a deeper or wider model, we further improve the model capacity. In addition, to perform token-level predictions as required by common pretraining objectives, Funnel-Transformer is able to recover a deep representation for each token from the reduced hidden sequence via a decoder. Empirically, with comparable or fewer FLOPs, Funnel-Transformer outperforms the standard Transformer on a wide variety of sequence-level prediction tasks, including text classification, language understanding, and reading comprehension. The code and pretrained checkpoints are available at https://github.com/laiguokun/Funnel-Transformer.
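
The compress-then-recover idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `mean_pool` and `upsample_repeat` are hypothetical, and the choice of strided mean pooling for compression and plain repetition for recovery is an assumption made here for illustration. The abstract itself only states that the hidden sequence is shortened and that a decoder recovers per-token representations.

```python
from typing import List

Vector = List[float]


def mean_pool(hidden: List[Vector], stride: int = 2) -> List[Vector]:
    """Shorten a sequence by averaging each window of `stride` vectors.

    Assumption: mean pooling stands in for the compression step; a final
    window shorter than `stride` is averaged over its actual length.
    """
    pooled: List[Vector] = []
    for start in range(0, len(hidden), stride):
        window = hidden[start:start + stride]
        dim = len(window[0])
        pooled.append([sum(v[d] for v in window) / len(window) for d in range(dim)])
    return pooled


def upsample_repeat(pooled: List[Vector], stride: int, target_len: int) -> List[Vector]:
    """Expand a pooled sequence back to `target_len` by repeating each vector.

    Assumption: repetition is a stand-in for the first step of a decoder
    that restores one vector per original token position.
    """
    restored: List[Vector] = []
    for vec in pooled:
        for _ in range(stride):
            restored.append(list(vec))  # copy so positions do not share state
    return restored[:target_len]


if __name__ == "__main__":
    hidden_states = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    compressed = mean_pool(hidden_states, stride=2)
    print(compressed)  # [[2.0, 3.0], [6.0, 7.0]]
    print(upsample_repeat(compressed, stride=2, target_len=len(hidden_states)))
```

In this toy setting, every layer applied after pooling sees half as many positions, which is where the saved computation comes from. A real model would add further processing on top of the repeated vectors to recover token-specific detail.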
