Title: Towards Zero-Shot Text-To-Speech for Arabic Dialects

Jun 25, 2024

Khai Duy Doan, Abdul Waheed, Muhammad Abdul-Mageed

Abstract: Zero-shot multi-speaker text-to-speech (ZS-TTS) systems have advanced for English, but they still lag behind for other languages because of insufficient resources. We address this gap for Arabic, a language of more than 450 million native speakers, by first adapting a sizeable existing dataset to suit the needs of speech synthesis. Additionally, we employ a set of Arabic dialect identification models to explore the impact of pre-defined dialect labels on improving the ZS-TTS model in a multi-dialect setting. Subsequently, we fine-tune XTTS, an open-source architecture (see https://docs.coqui.ai/en/latest/models/xtts.html, https://medium.com/machine-learns/xtts-v2-new-version-of-the-open-source-text-to-speech-model-af73914db81f, and https://medium.com/@erogol/xtts-v1-techincal-notes-eb83ff05bdc). We then evaluate our models on a dataset comprising 31 unseen speakers and on an in-house dialectal dataset. Our automated and human evaluation results show convincing performance, and the models remain capable of generating dialectal speech. Our study highlights significant potential for improvement in this emerging area of research in Arabic.
