Title: YOLO-Stutter: End-to-end Region-Wise Speech Dysfluency Detection

Sep 09, 2024

Xuanru Zhou, Anshul Kashyap, Steve Li, Ayati Sharma, Brittany Morin, David Baquirin, Jet Vonk, Zoe Ezzes, Zachary Miller, Maria Luisa Gorno Tempini (+2 more)

Figures 1-4 for YOLO-Stutter: End-to-end Region-Wise Speech Dysfluency Detection

Abstract: Dysfluent speech detection is the bottleneck for disordered speech analysis and spoken language learning. Current state-of-the-art models are governed by rule-based systems, which lack efficiency and robustness and are sensitive to template design. In this paper, we propose YOLO-Stutter: a first end-to-end method that detects dysfluencies in a time-accurate manner. YOLO-Stutter takes imperfect speech-text alignment as input, followed by a spatial feature aggregator and a temporal dependency extractor, to perform region-wise boundary and class predictions. We also introduce two dysfluency corpora, VCTK-Stutter and VCTK-TTS, that simulate natural spoken dysfluencies including repetition, block, missing, replacement, and prolongation. Our end-to-end method achieves state-of-the-art performance with a minimum number of trainable parameters on both simulated data and real aphasia speech. Code and datasets are open-sourced at https://github.com/rorizzz/YOLO-Stutter

* Interspeech 2024
