Title: Generative Pre-training for Speech with Flow Matching

Oct 25, 2023

Alexander H. Liu, Matt Le, Apoorv Vyas, Bowen Shi, Andros Tjandra, Wei-Ning Hsu

Abstract: Generative models have gained increasing attention in recent years for their remarkable success in tasks that require estimating and sampling from a data distribution to generate high-fidelity synthetic data. In speech, text-to-speech synthesis and neural vocoders are good examples where generative models have shined. While generative models have been applied to different applications in speech, there exists no general-purpose generative model that models speech directly. In this work, we take a step in this direction by showing that a single pre-trained generative model can be adapted to different downstream tasks with strong performance. Specifically, we pre-trained a generative model, named SpeechFlow, on 60k hours of untranscribed speech with Flow Matching and masked conditions. Experimental results show that the pre-trained generative model can be fine-tuned with task-specific data to match or surpass existing expert models on speech enhancement, separation, and synthesis. Our work suggests that a foundational model for generation tasks in speech can be built with generative pre-training.

* Preprint, under review
