Title: Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt

Mar 18, 2024

Yongqi Wang, Ruofan Hu, Rongjie Huang, Zhiqing Hong, Ruiqi Li, Wenrui Liu, Fuming You, Tao Jin, Zhou Zhao

Abstract: Recent singing-voice-synthesis (SVS) methods have achieved remarkable audio quality and naturalness, yet they lack the capability to control the style attributes of the synthesized singing explicitly. We propose Prompt-Singer, the first SVS method that enables attribute control over singer gender, vocal range, and volume with natural language. We adopt a model architecture based on a decoder-only transformer with a multi-scale hierarchy, and design a range-melody decoupled pitch representation that enables text-conditioned vocal range control while keeping melodic accuracy. Furthermore, we explore various experiment settings, including different types of text representations, text encoder fine-tuning, and introducing speech data to alleviate data scarcity, aiming to facilitate further research. Experiments show that our model achieves favorable control ability and audio quality. Audio samples are available at http://prompt-singer.github.io.
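
The abstract names a range-melody decoupled pitch representation but does not spell out how it is computed. As a rough illustration of the idea only, the sketch below assumes that vocal range can be summarized by one scalar per utterance (the mean F0 over voiced frames) and that melody is the per-frame ratio of F0 to that mean. Both the formulation and the function names `decouple_pitch` and `recombine` are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def decouple_pitch(f0_hz):
    """Split an F0 contour (Hz) into a range factor and a melody contour.

    Assumption for this sketch: unvoiced frames are marked with 0.0.
    """
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        raise ValueError("contour has no voiced frames")
    # Range factor: mean F0 over voiced frames, one scalar per utterance.
    range_factor = fmean(voiced)
    # Melody: each voiced frame relative to the mean; unvoiced frames stay 0.0.
    melody = [f / range_factor if f > 0 else 0.0 for f in f0_hz]
    return range_factor, melody


def recombine(range_factor, melody):
    """Rebuild an F0 contour from a (possibly changed) range factor and a melody."""
    return [range_factor * m for m in melody]


# Shift the same melody to a lower vocal range by halving the range factor.
f0 = [220.0, 0.0, 440.0, 330.0]
range_factor, melody = decouple_pitch(f0)
lowered = recombine(range_factor / 2, melody)
```

Under these assumptions, changing the range factor rescales every voiced frame by the same ratio, so the intervals between notes are unchanged while the overall register moves. That is the property the abstract describes as controlling vocal range while keeping melodic accuracy.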

* Accepted by NAACL 2024 (main conference)
