Title: Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information

Aug 31, 2023

Jie Chen, Changhe Song, Deyi Tuo, Xixin Wu, Shiyin Kang, Zhiyong Wu, Helen Meng

Abstract: For text-to-speech (TTS) synthesis, prosodic structure prediction (PSP) plays an important role in producing natural and intelligible speech. Although inter-utterance linguistic information can influence how the target utterance is interpreted in speech, previous work on PSP has mainly used only the intra-utterance linguistic information of the current utterance. This work proposes to use inter-utterance linguistic information to improve the performance of PSP. Multi-level contextual information, which includes both inter-utterance and intra-utterance linguistic information, is extracted by a hierarchical encoder at the character, utterance and discourse levels of the input text. A multi-task learning (MTL) decoder then predicts prosodic boundaries from this multi-level contextual information. Objective evaluation results on two datasets show that our method achieves better F1 scores in predicting prosodic word (PW), prosodic phrase (PPH) and intonational phrase (IPH) boundaries, which demonstrates the effectiveness of using multi-level contextual information for PSP. Subjective preference tests also indicate that the naturalness of the synthesized speech is improved.

* Accepted by Interspeech 2022
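
The abstract reports F1 scores separately for the PW, PPH and IPH boundary levels. As a rough illustration of how such a per-level score can be computed, the sketch below treats each level as a binary labelling over characters (1 means a boundary follows the character). The label encoding, the function names `boundary_f1` and `per_level_f1`, and the toy data are all assumptions made for this example; the abstract does not describe the authors' exact evaluation protocol.

```python
from typing import Dict, Sequence

# Boundary levels named in the abstract.
LEVELS = ("PW", "PPH", "IPH")


def boundary_f1(gold: Sequence[int], pred: Sequence[int]) -> float:
    """F1 for one binary boundary sequence (1 = boundary after this character)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        # No correctly predicted boundary: F1 is 0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def per_level_f1(
    gold: Dict[str, Sequence[int]], pred: Dict[str, Sequence[int]]
) -> Dict[str, float]:
    """Compute F1 independently for each prosodic boundary level."""
    return {level: boundary_f1(gold[level], pred[level]) for level in LEVELS}


# Toy six-character utterance (hypothetical labels, not data from the paper).
gold = {
    "PW": [0, 1, 0, 1, 0, 1],
    "PPH": [0, 0, 0, 1, 0, 1],
    "IPH": [0, 0, 0, 0, 0, 1],
}
pred = {
    "PW": [0, 1, 0, 1, 1, 1],   # one spurious PW boundary
    "PPH": [0, 0, 0, 1, 0, 1],  # exact match
    "IPH": [0, 0, 0, 0, 0, 0],  # missed the only IPH boundary
}
scores = per_level_f1(gold, pred)
```

Scoring each level on its own mirrors the multi-task framing in the abstract, where PW, PPH and IPH are predicted as related but distinct tasks.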
