Title: PodcastMix: A dataset for separating music and speech in podcasts

Jul 15, 2022

Nicolás Schmidt, Jordi Pons, Marius Miron

Abstract: We introduce PodcastMix, a dataset formalizing the task of separating background music and foreground speech in podcasts. We aim to define a benchmark suitable for training and evaluating (deep learning) source separation models. To that end, we release a large and diverse training dataset based on programmatically generated podcasts. However, current (deep learning) models can run into generalization issues, especially when trained on synthetic data. To target potential generalization issues, we release an evaluation set based on real podcasts, for which we design objective and subjective tests. From our experiments with real podcasts, we find that current (deep learning) models may have generalization issues. Yet these models can perform competently, e.g., our best baseline separates speech with a mean opinion score of 3.84 (rating "overall separation quality" from 1 to 5). The dataset and baselines are accessible online.

* In Proceedings of INTERSPEECH 2022. Project webpage: http://www.jordipons.me/apps/podcastmix/
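To make the idea of "programmatically generated podcasts" concrete, here is a minimal sketch of how a synthetic mixture could be built by overlaying background music on foreground speech. The abstract does not describe the actual generation procedure, so everything here is an assumption for illustration only: the helper name `mix_podcast`, the `music_gain_db` parameter, the looping of the music, and the peak rescaling are all hypothetical and are not taken from the PodcastMix code.

```python
import math
import random


def mix_podcast(speech, music, music_gain_db=-12.0, peak=1.0):
    """Overlay background music on foreground speech.

    Hypothetical helper, not the PodcastMix implementation.
    Returns (mixture, speech_reference, music_reference) as lists of floats.
    """
    if not speech:
        return [], [], []
    if not music:
        raise ValueError("music must not be empty")

    # Convert the music gain from decibels to a linear amplitude factor.
    gain = 10.0 ** (music_gain_db / 20.0)

    # Loop the music so that it covers the whole speech excerpt.
    looped = [music[i % len(music)] * gain for i in range(len(speech))]

    # The mixture is the sample-wise sum of both sources.
    mixture = [s + m for s, m in zip(speech, looped)]

    # If the mixture would clip, rescale all three signals together
    # so that mixture == speech + music still holds for the references.
    max_abs = max(abs(x) for x in mixture)
    scale = peak / max_abs if max_abs > peak else 1.0
    return (
        [x * scale for x in mixture],
        [s * scale for s in speech],
        [m * scale for m in looped],
    )


# Toy signals (assumptions): a 220 Hz tone stands in for speech,
# uniform noise stands in for music.
sr = 8000
random.seed(0)
speech = [0.8 * math.sin(2 * math.pi * 220 * n / sr) for n in range(sr)]
music = [random.uniform(-0.5, 0.5) for _ in range(sr // 2)]

mixture, speech_ref, music_ref = mix_podcast(speech, music)
```

Keeping the scaled references alongside the mixture matters for training and objective evaluation, since a separation model is typically scored against the exact sources that sum to its input.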
