Title: Accompanied Singing Voice Synthesis with Fully Text-controlled Melody

Jul 02, 2024

Ruiqi Li, Zhiqing Hong, Yongqi Wang, Lichao Zhang, Rongjie Huang, Siqi Zheng, Zhou Zhao

Abstract: Text-to-song (TTSong) is a music generation task that synthesizes accompanied singing voices. Current TTSong methods, inherited from singing voice synthesis (SVS), require melody-related information, such as music scores or MIDI sequences, that can be impractical to provide. We present MelodyLM, the first TTSong model that generates high-quality song pieces with fully text-controlled melodies, achieving minimal user requirements and maximum control flexibility. MelodyLM explicitly models MIDI as the intermediate melody-related feature and sequentially generates vocal tracks in a language-model manner, conditioned on textual and vocal prompts. The accompaniment music is subsequently synthesized by a latent diffusion model with hybrid conditioning for temporal alignment. With minimal requirements, users only need to input lyrics and a reference voice to synthesize a song sample. For full control, users can additionally supply textual prompts or directly input MIDI. Experimental results indicate that MelodyLM achieves superior performance in terms of both objective and subjective metrics. Audio samples are available at https://melodylm666.github.io.
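
The staged data flow described in the abstract (lyrics and prompts to MIDI, MIDI to vocals, then accompaniment) can be sketched roughly as follows. This is only an illustration of the control flow, not the authors' implementation: every class and function name is hypothetical, the stage bodies are trivial stubs standing in for the language model and the latent diffusion model, and which signals the accompaniment stage is conditioned on is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# All names below are hypothetical placeholders for illustration;
# none of them come from the MelodyLM paper or any released code.

Note = Tuple[int, float]  # (MIDI pitch number, duration in seconds)


@dataclass
class SongRequest:
    lyrics: str                         # required in every mode
    reference_voice: str                # required: path to a vocal prompt
    text_prompt: Optional[str] = None   # optional: textual melody control
    midi: Optional[List[Note]] = None   # optional: user-supplied melody


def generate_midi(lyrics: str, text_prompt: Optional[str]) -> List[Note]:
    """Stub for the text-to-MIDI stage: emits one fixed note per word."""
    return [(60, 0.5) for _ in lyrics.split()]


def generate_vocal(midi: List[Note], lyrics: str, reference_voice: str) -> str:
    """Stub for the vocal stage: returns a description, not audio."""
    return f"vocal({len(midi)} notes, voice={reference_voice})"


def generate_accompaniment(vocal: str, midi: List[Note]) -> str:
    """Stub for the accompaniment stage (conditioning inputs are assumed)."""
    return f"accompaniment(aligned to {vocal})"


def synthesize(req: SongRequest) -> dict:
    # Full-control mode: a user-supplied MIDI sequence bypasses stage 1.
    if req.midi is not None:
        midi = req.midi
    else:
        midi = generate_midi(req.lyrics, req.text_prompt)
    vocal = generate_vocal(midi, req.lyrics, req.reference_voice)
    accompaniment = generate_accompaniment(vocal, midi)
    return {"midi": midi, "vocal": vocal, "accompaniment": accompaniment}
```

In this sketch, the minimal mode corresponds to `SongRequest(lyrics=..., reference_voice=...)`, while setting `text_prompt` or `midi` corresponds to the fuller control modes the abstract mentions.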

* Work in progress
