Title: MR-SVS: Singing Voice Synthesis with Multi-Reference Encoder

Jan 11, 2022

Shoutong Wang, Jinglin Liu, Yi Ren, Zhen Wang, Changliang Xu, Zhou Zhao

Abstract: Multi-speaker singing voice synthesis aims to generate singing voices for different speakers. To generalize to new speakers, previous zero-shot singing adaptation methods obtain the timbre of the target speaker as a fixed-size embedding from a single reference audio. However, they face several challenges: 1) the fixed-size speaker embedding is not powerful enough to capture the full details of the target timbre; 2) a single reference audio does not contain sufficient timbre information about the target speaker; 3) the pitch inconsistency between different speakers also degrades the generated voice. In this paper, we propose a new model called MR-SVS to tackle these problems. Specifically, we employ both a multi-reference encoder and a fixed-size encoder to encode the timbre of the target speaker from multiple reference audios. The multi-reference encoder can capture more details and variations of the target timbre. In addition, we propose a pitch shift method to address the pitch inconsistency problem. Experiments indicate that our method outperforms the baseline method in both naturalness and similarity.
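
The abstract does not give architectural details, so the following is only an illustrative sketch of the three ideas it names, not the authors' implementation. It assumes (hypothetically) that reference audio is already converted to frame-level feature vectors, that a fixed-size embedding can be approximated by mean pooling, that a multi-reference encoder can be approximated by scaled dot-product attention over the frames of several references, and that pitch inconsistency can be reduced by a whole-semitone shift that aligns median F0. All function names are made up for illustration.

```python
import math
from statistics import median


def fixed_size_embedding(frames):
    """Mean-pool the frame vectors of ONE reference into a single vector.

    Assumption: stands in for a fixed-size speaker embedding; every
    frame-level detail is averaged away.
    """
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]


def attend_over_references(query, references):
    """Scaled dot-product attention of one query over all frames of
    SEVERAL references.

    Assumption: a stand-in for a multi-reference encoder; the output can
    draw on any individual frame instead of a single averaged vector.
    Returns (context_vector, attention_weights).
    """
    frames = [f for ref in references for f in ref]  # flatten references
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, f)) / scale for f in frames]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frames[0])
    context = [sum(w * f[i] for w, f in zip(weights, frames)) for i in range(dim)]
    return context, weights


def semitone_shift(source_f0, target_f0):
    """Whole-semitone offset that moves the source median F0 (Hz) onto the
    target median F0. Unvoiced frames are marked with 0 and ignored.

    Assumption: a simple heuristic, not the paper's pitch shift method.
    """
    src = median(f for f in source_f0 if f > 0)
    tgt = median(f for f in target_f0 if f > 0)
    return round(12 * math.log2(tgt / src))


def apply_shift(f0, semitones):
    """Shift voiced F0 values by the given number of semitones."""
    factor = 2 ** (semitones / 12)
    return [f * factor if f > 0 else 0.0 for f in f0]


if __name__ == "__main__":
    # Two toy "reference audios" with 2-dimensional frame features.
    refs = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]]
    print(fixed_size_embedding(refs[0]))
    context, weights = attend_over_references([1.0, 0.0], refs)
    print(context, weights)
    shift = semitone_shift([0.0, 220.0, 220.0], [440.0, 440.0, 0.0])
    print(shift, apply_shift([0.0, 220.0], shift))
```

In this toy setup the mean-pooled vector is identical for any query, whereas the attention output changes with the query, which is one way to read the abstract's claim that a multi-reference encoder captures more details and variations of the target timbre.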
