Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:Streaming Bilingual End-to-End ASR model using Attention over Multiple Softmax

Jan 22, 2024

Aditya Patil, Vikas Joshi, Purvi Agrawal, Rupesh Mehta

Share this with someone who'll enjoy it:

Abstract:Even with several advancements in multilingual modeling, it is challenging to recognize multiple languages using a single neural model, without knowing the input language and most multilingual models assume the availability of the input language. In this work, we propose a novel bilingual end-to-end (E2E) modeling approach, where a single neural model can recognize both languages and also support switching between the languages, without any language input from the user. The proposed model has shared encoder and prediction networks, with language-specific joint networks that are combined via a self-attention mechanism. As the language-specific posteriors are combined, it produces a single posterior probability over all the output symbols, enabling a single beam search decoding and also allowing dynamic switching between the languages. The proposed approach outperforms the conventional bilingual baseline with 13.3%, 8.23% and 1.3% word error rate relative reduction on Hindi, English and code-mixed test sets, respectively.

* 2022 IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar, 2023, pp. 252-259 * Published in IEEE's Spoken Language Technology (SLT) 2022, 8 pages (6 + 2 for references), 5 figures

View paper on

Share this with someone who'll enjoy it:

Title:Streaming Bilingual End-to-End ASR model using Attention over Multiple Softmax

Paper and Code