Title: Low Rank Factorization for Compact Multi-Head Self-Attention

Nov 26, 2019

Sneha Mehta, Huzefa Rangwala, Naren Ramakrishnan

Abstract: Effective representation learning from text has been an active area of research in NLP and text mining. Attention mechanisms have been at the forefront of learning contextual sentence representations. Current state-of-the-art approaches to representation learning use single-head and multi-head attention mechanisms to learn context-aware representations. However, these approaches can be parameter-intensive, which creates bottlenecks in low-resource settings. In this work we present a novel multi-head attention mechanism that uses low-rank bilinear pooling to efficiently construct a structured sentence representation that attends to multiple aspects of a sentence. We show that the proposed model is more effective than single-head attention mechanisms and is also more parameter-efficient and faster to compute than existing multi-head approaches. We evaluate the proposed model on multiple datasets across two text classification tasks: (i) sentiment analysis and (ii) news classification.

* 9 pages, 5 figures
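
The abstract does not spell out the model, so the following is only a rough sketch of what low-rank bilinear pooling for multi-head attention can look like in general, not the authors' formulation. It assumes each head scores a token with a bilinear form against a context vector, and that every head's weight matrix factors as `U @ diag(P[h]) @ V.T` with shared factors. All names (`H`, `q`, `U`, `V`, `P`) and sizes are hypothetical and chosen for illustration.

```python
import numpy as np

def softmax(x, axis=0):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def low_rank_multihead_attention(H, q, U, V, P):
    """Illustrative low-rank bilinear attention (assumed form, not the paper's).

    H: (n, d) token hidden states      q: (d,) context/query vector
    U, V: (d, k) shared low-rank factors
    P: (m, k) one rank-k weighting vector per head
    """
    joint = (H @ U) * (q @ V)      # (n, k): Hadamard product in the rank-k space
    scores = joint @ P.T           # (n, m): one score per token per head
    A = softmax(scores, axis=0)    # normalize over tokens, separately per head
    M = A.T @ H                    # (m, d): one attended sentence vector per head
    return scores, A, M

rng = np.random.default_rng(0)
n, d, k, m = 6, 16, 4, 3           # tokens, hidden size, rank, heads (arbitrary)
H = rng.standard_normal((n, d))
q = rng.standard_normal(d)
U = rng.standard_normal((d, k))
V = rng.standard_normal((d, k))
P = rng.standard_normal((m, k))

scores, A, M = low_rank_multihead_attention(H, q, U, V, P)

# Parameter counts under these assumptions:
full_params = m * d * d              # one full d x d bilinear matrix per head
low_rank_params = 2 * d * k + m * k  # shared U, V plus a k-vector per head
```

With these toy sizes the factored form needs 140 weights where one full matrix per head would need 768, which is the kind of saving the abstract's "parameter-efficient" claim refers to in spirit; the actual numbers in the paper depend on its own architecture.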
