Title: RobustSVC: HuBERT-based Melody Extractor and Adversarial Learning for Robust Singing Voice Conversion

Sep 10, 2024

Wei Chen, Xintao Zhao, Jun Chen, Binzhu Sha, Zhiwei Lin, Zhiyong Wu

Figures 1-4 (images not reproduced)

Abstract: Singing voice conversion (SVC) is sensitive to noise because pitch and energy are extracted with non-robust methods during inference. Since SVC depends on clean source audio, music source separation is a viable preprocessing step for noisy input such as singing with background music (BGM). However, current separation methods either fail to fully remove the noise or excessively suppress signal components, which degrades the naturalness and similarity of the processed audio. To tackle this, our study introduces RobustSVC, a novel any-to-one SVC framework that converts noisy vocals into clean vocals sung by the target singer. We replace the non-robust feature with a HuBERT-based melody extractor and use adversarial training with three discriminators to reduce information leakage in the self-supervised representations. Experimental results show that RobustSVC is noise-robust and achieves higher similarity and naturalness than baseline methods in both noisy and clean vocal conditions.

* Accepted by ISCSLP 2024
