Title: Phoneme Hallucinator: One-shot Voice Conversion via Set Expansion

Aug 11, 2023

Siyuan Shan, Yang Li, Amartya Banerjee, Junier B. Oliva

Abstract: Voice conversion (VC) aims to alter a person's voice so that it sounds similar to the voice of another person while preserving linguistic content. Existing methods suffer from a dilemma between content intelligibility and speaker similarity; i.e., methods with higher intelligibility usually have lower speaker similarity, while methods with higher speaker similarity usually require plenty of target speaker voice data to achieve high intelligibility. In this work, we propose a novel method, Phoneme Hallucinator, that achieves the best of both worlds. Phoneme Hallucinator is a one-shot VC model; it adopts a novel model to hallucinate diversified and high-fidelity target speaker phonemes based on just a short target speaker voice sample (e.g., 3 seconds). The hallucinated phonemes are then exploited to perform neighbor-based voice conversion. Our model is a text-free, any-to-any VC model that requires no text annotations and supports conversion to any unseen speaker. Objective and subjective evaluations show that Phoneme Hallucinator outperforms existing VC methods in both intelligibility and speaker similarity.
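The abstract does not spell out how the neighbor-based conversion step works, so the following is only a minimal sketch of one common reading: each source frame's feature vector is replaced by the mean of its k nearest vectors in a pool of target speaker features (which, in this setting, would include the hallucinated ones). The function names, the use of cosine similarity, the plain averaging, and the default `k` are all assumptions made for illustration, not details taken from the paper. Feature extraction and waveform synthesis are omitted.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def knn_convert(source_frames, target_pool, k=4):
    """Hypothetical neighbor-based conversion.

    source_frames: per-frame feature vectors of the source utterance.
    target_pool:   feature vectors of the target speaker (assumed here to be
                   the real features plus any hallucinated ones).
    Returns one converted vector per source frame: the mean of the k pool
    vectors most similar to that frame (assumed matching rule).
    """
    if not target_pool:
        raise ValueError("target_pool must not be empty")
    k = min(k, len(target_pool))
    converted = []
    for frame in source_frames:
        # Rank the whole pool by similarity to this frame, keep the top k.
        ranked = sorted(
            target_pool,
            key=lambda t: cosine_similarity(frame, t),
            reverse=True,
        )
        nearest = ranked[:k]
        converted.append(
            [sum(n[d] for n in nearest) / k for d in range(len(frame))]
        )
    return converted
```

Under this reading, a larger and more varied target pool gives every source frame closer neighbors to draw from, which is consistent with the abstract's motivation for expanding a short target recording into a richer set.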

* under review
