Title: ManaTTS Persian: a recipe for creating TTS datasets for lower resource languages

Sep 11, 2024

Mahta Fetrat Qharabagh, Zahra Dehghanian, Hamid R. Rabiee

Abstract: In this study, we introduce ManaTTS, the most extensive publicly accessible single-speaker Persian corpus, and a comprehensive framework for collecting transcribed speech datasets for the Persian language. ManaTTS, released under the open CC-0 license, comprises approximately 86 hours of audio with a sampling rate of 44.1 kHz. Alongside ManaTTS, we also generated the VirgoolInformal dataset to evaluate Persian speech recognition models used for forced alignment, extending over 5 hours of audio. The datasets are supported by a fully transparent, MIT-licensed pipeline, a testament to innovation in the field. It includes unique tools for sentence tokenization, bounded audio segmentation, and a novel forced alignment method. This alignment technique is specifically designed for low-resource languages, addressing a crucial need in the field. With this dataset, we trained a Tacotron2-based TTS model, achieving a Mean Opinion Score (MOS) of 3.76, which is remarkably close to the MOS of 3.86 for the utterances generated by the same vocoder and natural spectrogram, and the MOS of 4.01 for the natural waveform, demonstrating the exceptional quality and effectiveness of the corpus.

* 33 pages, 12 figures
