Title: Discrete Unit based Masking for Improving Disentanglement in Voice Conversion

Sep 17, 2024

Philip H. Lee, Ismail Rasim Ulgen, Berrak Sisman

Abstract: Voice conversion (VC) aims to modify the speaker's identity while preserving the linguistic content. VC methods commonly use an encoder-decoder architecture, in which disentangling the speaker's identity from linguistic information is crucial. However, the disentanglement approaches used in these methods are limited: the speaker features depend on the phonetic content of the utterance, which compromises disentanglement. This dependency is amplified in attention-based methods. To address this, we introduce a novel masking mechanism applied to the input before speaker encoding, which masks certain discrete speech units that correspond closely to phoneme classes. Our work aims to reduce the phonetic dependency of speaker features by restricting access to some phonetic information. Furthermore, since our approach operates at the input level, it is applicable to any encoder-decoder based VC framework. Our approach improves disentanglement and conversion performance across multiple VC methods and is particularly effective for the attention-based method, where it yields a 44% relative improvement in objective intelligibility.

* Accepted to IEEE SLT 2024
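
The input-level masking described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mask_units`, the zero fill value, the toy feature matrix, and the choice of which unit IDs to mask are all assumptions made for illustration. The sketch also assumes that each input frame has already been assigned a discrete unit ID by some external quantizer, and that the masked features are then passed to a speaker encoder.

```python
import numpy as np


def mask_units(features, unit_ids, masked_units, fill_value=0.0):
    """Mask frames whose discrete unit ID is in `masked_units`.

    features:     array of shape (T, D), one feature vector per frame
    unit_ids:     array of shape (T,), one discrete unit ID per frame
                  (assumed to come from an external quantizer)
    masked_units: iterable of unit IDs to hide from the speaker encoder
    fill_value:   value written into masked frames (an assumption;
                  the abstract does not say how frames are masked)
    """
    features = np.asarray(features, dtype=float)
    unit_ids = np.asarray(unit_ids)
    if features.shape[0] != unit_ids.shape[0]:
        raise ValueError("features and unit_ids must have the same number of frames")

    # Boolean mask over frames: True where the frame's unit should be hidden.
    frame_mask = np.isin(unit_ids, list(masked_units))

    # Work on a copy so the caller's features are left untouched.
    masked = features.copy()
    masked[frame_mask] = fill_value
    return masked, frame_mask


# Toy example: 6 frames of 2-dimensional features, hypothetical unit IDs.
features = np.arange(12, dtype=float).reshape(6, 2)
unit_ids = np.array([3, 3, 7, 1, 7, 2])

# Hide every frame labelled with the (hypothetical) unit 7.
masked, frame_mask = mask_units(features, unit_ids, masked_units={7})
```

Because the masking only changes the input to the speaker encoder, a sketch like this could sit in front of any encoder-decoder VC model without touching the model itself, which is the property the abstract highlights.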
