Title: HyperMixer: An MLP-based Green AI Alternative to Transformers

Mar 07, 2022

Florian Mai, Arnaud Pannatier, Fabio Fehr, Haolin Chen, Francois Marelli, Francois Fleuret, James Henderson

Abstract: Transformer-based architectures are the model of choice for natural language understanding, but they come at a significant cost, as they have quadratic complexity in the input length and can be difficult to tune. In the pursuit of Green AI, we investigate simple MLP-based architectures. We find that existing architectures such as MLPMixer, which achieves token mixing through a static MLP applied to each feature independently, are too detached from the inductive biases required for natural language understanding. In this paper, we propose a simple variant, HyperMixer, which forms the token mixing MLP dynamically using hypernetworks. Empirically, we demonstrate that our model performs better than alternative MLP-based models, and on par with Transformers. In contrast to Transformers, HyperMixer achieves these results at substantially lower costs in terms of processing time, training data, and hyperparameter tuning.
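
To make the contrast in the abstract concrete, the following is a minimal, dependency-free sketch of the two token-mixing styles. It is an illustration under stated assumptions, not the authors' implementation: the hypernetworks are reduced to single linear maps (`h1`, `h2`), position information is omitted, GELU is assumed as the activation, and all function names and shapes are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows, checking shapes."""
    if len(a[0]) != len(b):
        raise ValueError(f"shape mismatch: {len(a[0])} columns vs {len(b)} rows")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def gelu(m):
    """Elementwise GELU (tanh approximation); the activation is an assumption."""
    c = math.sqrt(2.0 / math.pi)
    return [[0.5 * v * (1.0 + math.tanh(c * (v + 0.044715 * v ** 3))) for v in row]
            for row in m]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def static_token_mixing(x, w1, w2):
    """MLPMixer-style mixing: fixed weights act along the token axis.

    x: N x d tokens, w1: hidden x N, w2: N x hidden.
    The same MLP is applied to every feature column independently,
    and the weights are tied to one fixed sequence length N.
    """
    return matmul(w2, gelu(matmul(w1, x)))


def hyper_token_mixing(x, h1, h2):
    """HyperMixer-style mixing: the mixing weights are generated from the tokens.

    x: N x d tokens, h1 and h2: d x hidden (stand-ins for the hypernetworks).
    Each token produces its own row of w1 and w2, so N can vary per input.
    """
    w1 = matmul(x, h1)  # N x hidden, depends on the input
    w2 = matmul(x, h2)  # N x hidden, depends on the input
    return matmul(w2, gelu(matmul(transpose(w1), x)))  # N x d


rng = random.Random(0)
d, hidden = 3, 5
h1 = rand_matrix(d, hidden, rng)
h2 = rand_matrix(d, hidden, rng)

for n in (4, 6):
    tokens = rand_matrix(n, d, rng, scale=1.0)
    mixed = hyper_token_mixing(tokens, h1, h2)
    print(n, "tokens in ->", len(mixed), "x", len(mixed[0]), "out")
```

In this sketch the static variant only accepts inputs whose length matches the shape of `w1`, while the hypernetwork variant accepts any number of tokens because its mixing weights are rebuilt from each input. The work done by `hyper_token_mixing` also grows linearly with the number of tokens, since no token-by-token matrix is ever formed.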
