Title: Can Synthetic Audio From Generative Foundation Models Assist Audio Recognition and Speech Modeling?

Jun 13, 2024

Tiantian Feng, Dimitrios Dimitriadis, Shrikanth Narayanan

Abstract: Recent advances in foundation models have enabled audio-generative models that produce high-fidelity sounds associated with music, events, and human actions. Despite the success of modern audio-generative models, the conventional approach to assessing the quality of generated audio relies heavily on distance metrics such as Fréchet Audio Distance. In contrast, we aim to evaluate the quality of generated audio by examining how effective it is as training data. Specifically, we conduct studies to explore the use of synthetic audio for audio recognition. Moreover, we investigate whether synthetic audio can serve as a resource for data augmentation in speech-related modeling. Our comprehensive experiments demonstrate the potential of using synthetic audio for audio recognition and speech-related modeling. Our code is available at https://github.com/usc-sail/SynthAudio.
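For readers unfamiliar with the distance metrics the abstract mentions, the following is a minimal sketch of a Fréchet-style distance between two sets of embedding vectors. It is not the paper's code. It assumes each set is modeled as a Gaussian with a diagonal covariance, a simplification chosen so the example needs only the standard library. Fréchet Audio Distance as commonly computed uses the full covariance of embeddings taken from a pretrained audio model. The function name `diagonal_frechet_distance` and the toy vectors are hypothetical.

```python
import math
from statistics import fmean, pvariance


def diagonal_frechet_distance(real, generated):
    """Squared Frechet distance between two Gaussians fitted per dimension.

    Assumption: covariances are diagonal, so the matrix square-root term
    reduces to a per-dimension product of standard deviations.
    """
    dims = len(real[0])
    total = 0.0
    for d in range(dims):
        r = [row[d] for row in real]
        g = [row[d] for row in generated]
        mu_r, mu_g = fmean(r), fmean(g)
        sd_r = math.sqrt(pvariance(r))
        sd_g = math.sqrt(pvariance(g))
        # (mean gap)^2 + var_r + var_g - 2*sd_r*sd_g == (mean gap)^2 + (sd gap)^2
        total += (mu_r - mu_g) ** 2 + (sd_r - sd_g) ** 2
    return total


# Toy embeddings standing in for real and synthetic audio (illustrative values only).
real_embeddings = [[0.0, 1.0], [2.0, 3.0]]
synthetic_embeddings = [[1.0, 1.0], [3.0, 3.0]]
print(diagonal_frechet_distance(real_embeddings, synthetic_embeddings))
```

A lower value means the two embedding distributions are closer. The abstract's point is that such a score does not by itself show whether the generated audio is useful as training data.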

* Accepted to 2024 INTERSPEECH
