Carmen García-Mateo

Weighted Cross-entropy for Low-Resource Languages in Multilingual Speech Recognition

Sep 25, 2024

Andrés Piñeiro-Martín, Carmen García-Mateo, Laura Docío-Fernández, María del Carmen López-Pérez, Georg Rehm

Abstract: This paper addresses the challenge of integrating low-resource languages into multilingual automatic speech recognition (ASR) systems. We introduce a novel application of weighted cross-entropy, typically used for unbalanced datasets, to facilitate the integration of low-resource languages into pre-trained multilingual ASR models within the context of continual multilingual learning. We fine-tune the Whisper multilingual ASR model on five high-resource languages and one low-resource language, employing language-weighted dynamic cross-entropy and data augmentation. The results show a remarkable 6.69% word error rate (WER) reduction for the low-resource language compared to the fine-tuned model without applying our approach, and a 48.86% WER reduction compared to the original Whisper model. In addition, our approach yields an average WER reduction of 3.29% across the six languages, showing no degradation for the high-resource languages.
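To make the idea of language-weighted cross-entropy concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give the weighting formula or how the weights change during training, so the fixed per-language weights, the `"high"`/`"low"` language labels, the function names, and the normalization by total weight are all assumptions chosen for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def language_weighted_cross_entropy(batch, lang_weights):
    """Cross-entropy where every token's loss is scaled by its language weight.

    batch: list of (token_logits, target_ids, lang) tuples, one per utterance.
    lang_weights: dict mapping a language label to a positive weight
                  (hypothetical values; languages not listed default to 1.0).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for token_logits, targets, lang in batch:
        w = lang_weights.get(lang, 1.0)
        for logits, target in zip(token_logits, targets):
            nll = -log_softmax(logits)[target]  # per-token negative log-likelihood
            weighted_sum += w * nll
            weight_total += w
    # Assumption: normalize by the total weight so the loss scale stays comparable.
    return weighted_sum / weight_total


# Toy batch: one well-predicted "high" utterance, one poorly predicted "low" one.
batch = [
    ([[2.0, 0.0, 0.0]], [0], "high"),
    ([[0.0, 0.0, 2.0]], [0], "low"),
]
uniform = language_weighted_cross_entropy(batch, {"high": 1.0, "low": 1.0})
upweighted = language_weighted_cross_entropy(batch, {"high": 1.0, "low": 3.0})
```

With the low-resource weight raised, errors on the `"low"` utterance contribute more to the loss, so `upweighted` is larger than `uniform` on this toy batch. A dynamic scheme, as named in the abstract, would presumably update `lang_weights` over the course of training rather than keep them fixed.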

* Proceedings of Interspeech 2024
* 5 pages, 1 figure. Presented at Interspeech 2024
