Title: Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition

Jul 02, 2021

Timo Lohrenz, Patrick Schwarz, Zhengyang Li, Tim Fingscheidt

Abstract: Recently, attention-based encoder-decoder (AED) models have shown high performance for end-to-end automatic speech recognition (ASR) across several tasks. To address overconfidence in such models, in this paper we introduce the concept of relaxed attention, which is a simple gradual injection of a uniform distribution into the encoder-decoder attention weights during training and is easily implemented with two lines of code. We investigate the effect of relaxed attention across different AED model architectures and two prominent ASR tasks, Wall Street Journal (WSJ) and LibriSpeech. We found that transformers trained with relaxed attention consistently outperform the standard baseline models during decoding with external language models. On WSJ, we set a new benchmark for transformer-based end-to-end speech recognition with a word error rate of 3.65%, outperforming the state of the art (4.20%) by 13.1% relative, while introducing only a single hyperparameter. Upon acceptance, models will be published on GitHub.
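The abstract describes relaxed attention only in words, so the following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes the "injection of a uniform distribution" is a convex combination of the softmax attention weights with a uniform distribution over the encoder positions, controlled by a single coefficient. The names `relaxed_attention`, `gamma`, and `training` are hypothetical and chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relaxed_attention(scores, gamma, training=True):
    """Blend softmax attention weights with a uniform distribution.

    Assumption (not stated in the abstract): the blend is
    (1 - gamma) * softmax(scores) + gamma * (1 / T),
    where T is the number of encoder positions and gamma is the
    single hyperparameter. The relaxation is applied only in training.
    """
    weights = softmax(scores)
    if not training:
        return weights
    uniform = 1.0 / len(weights)
    # The core relaxation: these are the "two lines" in spirit.
    return [(1.0 - gamma) * w + gamma * uniform for w in weights]


# Example: one decoder query attending over four encoder frames.
scores = [4.0, 1.0, 0.0, -1.0]
sharp = relaxed_attention(scores, gamma=0.0)
relaxed = relaxed_attention(scores, gamma=0.25)
print("peak weight without relaxation:", max(sharp))
print("peak weight with relaxation:   ", max(relaxed))
```

Under this assumption the relaxed weights still sum to one, and the largest weight shrinks toward 1/T as `gamma` grows, which is the smoothing effect the abstract associates with reducing overconfidence. At inference (`training=False`) the weights are left unchanged.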

* submitted to ASRU 2021
