Title: Enhancing Low-Resource ASR through Versatile TTS: Bridging the Data Gap

Oct 22, 2024

Guanrou Yang, Fan Yu, Ziyang Ma, Zhihao Du, Zhifu Gao, Shiliang Zhang, Xie Chen

Abstract: While automatic speech recognition (ASR) systems have achieved remarkable performance with large-scale datasets, their efficacy remains inadequate in low-resource settings, encompassing dialects, accents, minority languages, and long-tail hotwords, domains with significant practical relevance. With the advent of versatile and powerful text-to-speech (TTS) models, capable of generating speech with human-level naturalness, expressiveness, and diverse speaker profiles, leveraging TTS for ASR data augmentation provides a cost-effective and practical approach to enhancing ASR performance. Comprehensive experiments on an unprecedentedly rich variety of low-resource datasets demonstrate consistent and substantial performance improvements, proving that the proposed method of enhancing low-resource ASR through a versatile TTS model is highly effective and has broad application prospects. Furthermore, we delve deeper into key characteristics of synthesized speech data that contribute to ASR improvement, examining factors such as text diversity, speaker diversity, and the volume of synthesized data, with text diversity being studied for the first time in this work. We hope our findings provide helpful guidance and reference for the practical application of TTS-based data augmentation and push the advancement of low-resource ASR one step further.
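The abstract names three properties of the synthesized data (text diversity, speaker diversity, and volume) but does not describe the pipeline itself. As an illustration only, here is a minimal sketch of how a synthesis plan exposing those three knobs might be assembled before calling a TTS model. Every name here (`SynthesisJob`, `plan_synthesis`, `mix_manifests`) is hypothetical and not taken from the paper, and the TTS call itself is deliberately left out.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthesisJob:
    """One utterance to synthesize: a transcript paired with a speaker."""
    text: str
    speaker_id: str


def plan_synthesis(texts, speaker_ids, num_utterances, seed=0):
    """Build a list of (text, speaker) jobs for a TTS system.

    Hypothetical helper, not the paper's method:
      - text diversity    -> number of distinct transcripts in `texts`
      - speaker diversity -> number of entries in `speaker_ids`
      - volume            -> `num_utterances`
    """
    if not texts or not speaker_ids:
        raise ValueError("need at least one text and one speaker")
    rng = random.Random(seed)
    # Deduplicate transcripts while keeping order, then shuffle.
    unique_texts = list(dict.fromkeys(texts))
    rng.shuffle(unique_texts)
    jobs = []
    for i in range(num_utterances):
        # Cycle through transcripts so each is used about equally often.
        text = unique_texts[i % len(unique_texts)]
        # Assign a random speaker to vary the voice across utterances.
        speaker = rng.choice(speaker_ids)
        jobs.append(SynthesisJob(text, speaker))
    return jobs


def mix_manifests(real, synthetic, seed=0):
    """Shuffle real and synthetic items together, tagging each by source."""
    mixed = [("real", r) for r in real] + [("synthetic", s) for s in synthetic]
    random.Random(seed).shuffle(mixed)
    return mixed


if __name__ == "__main__":
    jobs = plan_synthesis(
        texts=["hello world", "good morning", "hello world"],
        speaker_ids=["spk_a", "spk_b", "spk_c"],
        num_utterances=5,
    )
    training_set = mix_manifests(real=["real_utt_001"], synthetic=jobs)
    print(len(jobs), len(training_set))
```

Keeping the three factors as separate arguments makes it easy to vary one at a time, which is the kind of ablation the abstract describes; how the authors actually ran theirs is not stated here.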
