Title: A Comparative Analysis Of Latent Regressor Losses For Singing Voice Conversion

Feb 27, 2023

Brendan O'Connor, Simon Dixon

Abstract: Previous research has shown that established techniques for spoken voice conversion (VC) do not perform as well when applied to singing voice conversion (SVC). We propose an alternative loss component in a loss function that is otherwise well-established among VC tasks, which has been shown to improve our model's SVC performance. We first trained a singer identity embedding (SIE) network on mel-spectrograms of singer recordings to produce singer-specific variance encodings using contrastive learning. We subsequently trained a well-known autoencoder framework (AutoVC) conditioned on these SIEs, and measured differences in SVC performance when using different latent regressor loss components. We found that using this loss w.r.t. SIEs leads to better performance than w.r.t. bottleneck embeddings, where converted audio is more natural and specific towards target singers. The inclusion of this loss component has the advantage of explicitly forcing the network to reconstruct with timbral similarity, and also negates the effect of poor disentanglement in AutoVC's bottleneck embeddings. We demonstrate peculiar diversity between computational and human evaluations on singer-converted audio clips, which highlights the necessity of both. We also propose a pitch-matching mechanism between source and target singers to ensure these evaluations are not influenced by differences in pitch register.
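The two latent regressor loss variants compared in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy encoders, the array shapes, and the choice of an L1 distance for the SIE variant are all assumptions made for illustration. The only idea taken from the abstract is that the regressor loss can be computed either on bottleneck embeddings or on singer identity embeddings (SIEs) of the converted audio.

```python
import numpy as np

def l1_distance(a, b):
    """Mean absolute difference between two embedding vectors."""
    return float(np.mean(np.abs(np.asarray(a) - np.asarray(b))))

def bottleneck_regressor_loss(content_encoder, source_mel, converted_mel):
    """Variant 1: compare bottleneck codes of the converted and source audio."""
    return l1_distance(content_encoder(converted_mel), content_encoder(source_mel))

def sie_regressor_loss(sie_network, converted_mel, target_sie):
    """Variant 2 (assumed form): compare the SIE of the converted audio
    with the SIE of the target singer."""
    return l1_distance(sie_network(converted_mel), target_sie)

# Hypothetical stand-ins for trained networks, used only to make the sketch run.
def toy_content_encoder(mel):
    # Downsample in time and average over mel bins: (frames, bins) -> (frames / 16,)
    return mel[::16].mean(axis=1)

def toy_sie_network(mel):
    # Average over time and keep 16 dimensions: (frames, bins) -> (16,)
    return mel.mean(axis=0)[:16]

rng = np.random.default_rng(0)
source_mel = rng.normal(size=(128, 80))      # (frames, mel bins), assumed shape
converted_mel = source_mel + 0.1 * rng.normal(size=(128, 80))
target_sie = toy_sie_network(source_mel)     # placeholder target identity

bottleneck_loss = bottleneck_regressor_loss(toy_content_encoder, source_mel, converted_mel)
sie_loss = sie_regressor_loss(toy_sie_network, converted_mel, target_sie)
```

In a real training loop either term would be added to the reconstruction loss with a weighting factor. The SIE variant penalises the converted audio directly for drifting away from the target singer's identity, which is the property the abstract credits for the improved timbral similarity.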

* Submitted to the Sound and Music Computing Conference 2023
