Title: Filler Word Detection and Classification: A Dataset and Benchmark

Mar 28, 2022

Ge Zhu, Juan-Pablo Caceres, Justin Salamon

Abstract: Filler words such as "uh" or "um" are sounds or words people use to signal that they are pausing to think. Finding and removing filler words from recordings is a common and tedious task in media editing. Automatically detecting and classifying filler words could greatly aid in this task, but few studies have been published on this problem. A key reason is the absence of a dataset with annotated filler words for training and evaluation. In this work, we present a novel speech dataset, PodcastFillers, with 35K annotated filler words and 50K annotations of other sounds that commonly occur in podcasts, such as breaths, laughter, and word repetitions. We propose a pipeline that leverages voice activity detection (VAD) and automatic speech recognition (ASR) to detect filler candidates, and a classifier to distinguish between filler word types. We evaluate our proposed pipeline on PodcastFillers, compare it to several baselines, and present a detailed ablation study. In particular, we evaluate the importance of using ASR and how it compares to a transcription-free approach resembling keyword spotting. We show that our pipeline obtains state-of-the-art results, and that leveraging ASR strongly outperforms a keyword spotting approach. We make PodcastFillers publicly available, and hope our work serves as a benchmark for future research.
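The abstract does not spell out how VAD and ASR are combined, so the following is only an illustrative sketch under one plausible assumption: a filler candidate is a stretch of audio that the VAD marks as speech but that no ASR-transcribed word covers. The function name `filler_candidates`, the `min_dur` threshold, and the interval representation are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_sec, end_sec)


def filler_candidates(
    speech: List[Interval],
    words: List[Interval],
    min_dur: float = 0.1,  # hypothetical threshold, not from the paper
) -> List[Interval]:
    """Return the parts of VAD speech regions not covered by any ASR word.

    Assumption (not confirmed by the abstract): voiced audio without a
    transcribed word is a candidate filler to pass on to a classifier.
    """
    words = sorted(words)
    candidates: List[Interval] = []
    for s_start, s_end in sorted(speech):
        cursor = s_start  # left edge of the still-uncovered speech
        for w_start, w_end in words:
            if w_end <= cursor:
                continue  # word ends before the uncovered part begins
            if w_start >= s_end:
                break  # words are sorted, so none of the rest overlap
            if w_start > cursor:
                # Gap between the cursor and this word is uncovered speech.
                candidates.append((cursor, min(w_start, s_end)))
            cursor = max(cursor, w_end)
            if cursor >= s_end:
                break
        if cursor < s_end:
            candidates.append((cursor, s_end))
    # Drop gaps too short to plausibly contain a filler word.
    return [(a, b) for a, b in candidates if b - a >= min_dur]


# One speech region; ASR found words everywhere except 1.0-1.5 s and 4.0-5.0 s.
gaps = filler_candidates(
    speech=[(0.0, 5.0)],
    words=[(0.0, 1.0), (1.5, 3.0), (3.0, 4.0)],
)
```

In this sketch each returned interval would then be handed to a separate classifier that decides between filler types such as "uh" and "um"; that classifier is outside the scope of the example.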

* Submitted to Interspeech 2022
