Lezhi Wang

Findings of the 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge

Sep 09, 2024

Hongfei Xue, Rong Gong, Mingchen Shao, Xin Xu, Lezhi Wang, Lei Xie, Hui Bu, Jiaming Zhou, Yong Qin, Jun Du(+3 more)

Abstract: The StutteringSpeech Challenge focuses on advancing speech technologies for people who stutter, specifically targeting Stuttering Event Detection (SED) and Automatic Speech Recognition (ASR) in Mandarin. The challenge comprises three tracks: (1) SED, which aims to develop systems for the detection of stuttering events; (2) ASR, which focuses on creating robust systems for recognizing stuttered speech; and (3) a Research track for innovative approaches utilizing the provided dataset. We utilize an open-source Mandarin stuttering dataset, AS-70, which has been split into new training and test sets for the challenge. This paper presents the dataset, details the challenge tracks, and analyzes the performance of the top systems, highlighting improvements in detection accuracy and reductions in recognition error rates. Our findings underscore the potential of specialized models and augmentation strategies in developing stuttered speech technologies.

* 8 pages, 2 figures, accepted by SLT 2024

AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection

Jun 11, 2024

Rong Gong, Hongfei Xue, Lezhi Wang, Xin Xu, Qisheng Li, Lei Xie, Hui Bu, Shaomei Wu, Jiaming Zhou, Yong Qin(+4 more)

Abstract: The rapid advancements in speech technologies over the past two decades have led to human-level performance in tasks like automatic speech recognition (ASR) for fluent speech. However, the efficacy of these models diminishes when applied to atypical speech, such as stuttering. This paper introduces AS-70, the first publicly available Mandarin stuttered speech dataset, which stands out as the largest dataset in its category. Encompassing conversational and voice command reading speech, AS-70 includes verbatim manual transcription, rendering it suitable for various speech-related tasks. Furthermore, baseline systems are established, and experimental results are presented for ASR and stuttering event detection (SED) tasks. By incorporating this dataset into model fine-tuning, significant improvements in state-of-the-art ASR models, e.g., Whisper and HuBERT, are observed, enhancing their inclusivity in addressing stuttered speech.

* Accepted by Interspeech 2024
