
Title: Multi-task Learning with Cross Attention for Keyword Spotting

Jul 15, 2021

Takuya Higuchi, Anmol Gupta, Chandra Dhir


Abstract: Keyword spotting (KWS) is an important technique for speech applications, which enables users to activate devices by speaking a keyword phrase. Although a phoneme classifier can be used for KWS, exploiting a large amount of transcribed data for automatic speech recognition (ASR), there is a mismatch between the training criterion (phoneme recognition) and the target task (KWS). Recently, multi-task learning has been applied to KWS to exploit both ASR and KWS training data. In this approach, the output of an acoustic model is split into two branches for the two tasks, one for phoneme transcription trained with the ASR data and one for keyword classification trained with the KWS data. In this paper, we introduce a cross attention decoder in the multi-task learning framework. Unlike the conventional multi-task learning approach with a simple split of the output layer, the cross attention decoder summarizes information from a phonetic encoder by performing cross attention between the encoder outputs and a trainable query sequence to predict a confidence score for the KWS task. Experimental results on KWS tasks show that the proposed approach outperformed conventional multi-task learning with split branches and a bi-directional long short-term memory decoder by 12% on average.

* Submitted to ASRU 2021
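The cross attention step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes single-head scaled dot-product attention with no key/value projections, random values standing in for the trained encoder outputs and query sequence, and a hypothetical linear-plus-sigmoid head (`kws_confidence`) to turn the attended vectors into a score. All names and sizes are illustrative.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, encoder_outputs):
    """Each query vector attends over all encoder frames.

    queries:         Q vectors of dimension D (trainable in a real model)
    encoder_outputs: T vectors of dimension D (one per audio frame)
    Returns Q context vectors and the Q x T attention weights.
    """
    d = len(queries[0])
    attended, weights = [], []
    for q in queries:
        scores = [dot(q, h) / math.sqrt(d) for h in encoder_outputs]
        w = softmax(scores)
        # Weighted sum of encoder frames: a fixed-size summary of the utterance.
        ctx = [sum(w_t * h[i] for w_t, h in zip(w, encoder_outputs))
               for i in range(d)]
        attended.append(ctx)
        weights.append(w)
    return attended, weights


def kws_confidence(attended, out_weights, bias=0.0):
    """Hypothetical output head: flatten, linear layer, sigmoid."""
    flat = [v for ctx in attended for v in ctx]
    logit = dot(flat, out_weights) + bias
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
T, Q, D = 20, 4, 8  # frames, query length, feature size (assumed values)
encoder_outputs = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(T)]
queries = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(Q)]
out_weights = [rng.gauss(0, 0.1) for _ in range(Q * D)]

attended, weights = cross_attention(queries, encoder_outputs)
score = kws_confidence(attended, out_weights)
print(f"keyword confidence: {score:.3f}")
```

Because the query sequence has a fixed length, the decoder output has the same size regardless of how many encoder frames the utterance produces, which is what lets a single confidence score be predicted from variable-length audio.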
