Title: ConSinger: Efficient High-Fidelity Singing Voice Generation with Minimal Steps

Oct 20, 2024

Yulin Song, Guorui Sang, Jing Yu, Chuangbai Xiao

Abstract: A singing voice synthesis (SVS) system is expected to generate high-fidelity singing voice from a given music score (lyrics, duration, and pitch). Diffusion models have recently performed well in this field, but they trade inference speed for sample quality, which limits their application scenarios. To obtain high-quality synthetic singing voice more efficiently, we propose ConSinger, a singing voice synthesis method based on the consistency model that achieves high-fidelity synthesis with minimal steps. The model is trained with a consistency constraint, and generation quality improves greatly at the cost of a small amount of inference speed. Our experiments show that ConSinger is highly competitive with the baseline model in both generation speed and quality. Audio samples are available at https://keylxiao.github.io/consinger.

* Singing voice synthesis, consistency models, diffusion models
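
The abstract does not spell out the consistency constraint, so the following is only a minimal sketch of the generic consistency-model formulation from the wider literature, not ConSinger's confirmed implementation. It assumes a skip-connection parameterization that forces the model to return its input unchanged at the smallest noise level, a loss that pulls together the predictions made from two adjacent noise levels of the same sample, and single-step sampling from pure noise. All names (`consistency_fn`, `consistency_loss`, `one_step_sample`) and the constants `SIGMA_DATA` and `EPS` are hypothetical, and plain Python lists stand in for mel-spectrogram frames.

```python
import math

SIGMA_DATA = 0.5  # assumed data standard deviation (illustrative value)
EPS = 0.002       # assumed smallest noise level (illustrative value)


def c_skip(t):
    # Weight on the noisy input; equals 1 at t == EPS.
    return SIGMA_DATA ** 2 / ((t - EPS) ** 2 + SIGMA_DATA ** 2)


def c_out(t):
    # Weight on the network output; equals 0 at t == EPS.
    return SIGMA_DATA * (t - EPS) / math.sqrt(SIGMA_DATA ** 2 + t ** 2)


def consistency_fn(x, t, network):
    # Maps a noisy sample at noise level t to an estimate of the clean sample.
    # The weights above enforce the boundary condition f(x, EPS) == x.
    out = network(x, t)
    return [c_skip(t) * xi + c_out(t) * oi for xi, oi in zip(x, out)]


def consistency_loss(x0, noise, t_hi, t_lo, student, teacher):
    # Perturb the same clean sample with the same noise at two adjacent levels,
    # then penalise the squared difference between the two clean estimates.
    x_hi = [a + t_hi * n for a, n in zip(x0, noise)]
    x_lo = [a + t_lo * n for a, n in zip(x0, noise)]
    pred_hi = consistency_fn(x_hi, t_hi, student)
    pred_lo = consistency_fn(x_lo, t_lo, teacher)
    return sum((p - q) ** 2 for p, q in zip(pred_hi, pred_lo)) / len(x0)


def one_step_sample(noise, t_max, network):
    # Single-step generation: scale unit noise to the largest level, map once.
    x_T = [t_max * n for n in noise]
    return consistency_fn(x_T, t_max, network)
```

In this sketch `network` is any callable taking `(x, t)` and returning a list of the same length as `x`. In a real SVS system it would presumably also be conditioned on the music score, which is omitted here.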
