Qisheng Li

AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection

Jun 11, 2024

Rong Gong, Hongfei Xue, Lezhi Wang, Xin Xu, Qisheng Li, Lei Xie, Hui Bu, Shaomei Wu, Jiaming Zhou, Yong Qin(+4 more)

Abstract: The rapid advancements in speech technologies over the past two decades have led to human-level performance in tasks like automatic speech recognition (ASR) for fluent speech. However, the efficacy of these models diminishes when applied to atypical speech, such as stuttering. This paper introduces AS-70, the first publicly available Mandarin stuttered speech dataset, which is also the largest dataset in its category. Encompassing conversational and voice command reading speech, AS-70 includes verbatim manual transcription, making it suitable for various speech-related tasks. Furthermore, baseline systems are established, and experimental results are presented for ASR and stuttering event detection (SED) tasks. Incorporating this dataset into model fine-tuning yields significant improvements in state-of-the-art ASR models, e.g., Whisper and HuBERT, making them more inclusive of stuttered speech.

* Accepted by Interspeech 2024

Assessing ASR Model Quality on Disordered Speech using BERTScore

Sep 21, 2022

Jimmy Tobin, Qisheng Li, Subhashini Venugopalan, Katie Seaver, Richard Cave, Katrin Tomanek

Abstract: Word Error Rate (WER) is the primary metric used to assess automatic speech recognition (ASR) model quality. It has been shown that ASR models tend to have much higher WER on speakers with speech impairments than on typical English speakers. It is hard to determine whether models can be useful at such high error rates. This study investigates the use of BERTScore, an evaluation metric for text generation, to provide a more informative measure of ASR model quality and usefulness. Both BERTScore and WER were compared to prediction errors manually annotated by Speech Language Pathologists for error type and assessment. BERTScore was found to be more correlated with human assessment of error type and assessment. BERTScore was specifically more robust to orthographic changes (contraction and normalization errors) where meaning was preserved. Furthermore, BERTScore was a better fit of error assessment than WER, as measured using an ordinal logistic regression and the Akaike Information Criterion (AIC). Overall, our findings suggest that BERTScore can complement WER when assessing ASR model performance from a practical perspective, especially for accessibility applications where models are useful even at lower accuracy than for typical speech.

* Accepted to Interspeech 2022 Workshop on Speech for Social Good
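
For readers unfamiliar with WER, the following is a minimal sketch of how the metric is commonly computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; this is not the evaluation code used in the paper, and it applies no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokens are split on whitespace, with no
    normalization of case, punctuation, or contractions.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# A contraction keeps the meaning but still costs two word edits
# (one substitution, one deletion) out of four reference words.
print(word_error_rate("i do not know", "i don't know"))  # 0.5
```

The contraction case shows the kind of meaning-preserving orthographic change the abstract describes: WER penalizes it heavily even though a listener would consider the transcript correct.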

The Effect of Moderation on Online Mental Health Conversations

Jun 10, 2020

David Wadden, Tal August, Qisheng Li, Tim Althoff

Abstract: Many people struggling with mental health issues are unable to access adequate care due to high costs and a shortage of mental health professionals, leading to a global mental health crisis. Online mental health communities can help mitigate this crisis by offering a scalable, easily accessible alternative to in-person sessions with therapists or support groups. However, people seeking emotional or psychological support online may be especially vulnerable to the kinds of antisocial behavior that sometimes occur in online discussions. Moderation can improve online discourse quality, but we lack an understanding of its effects on online mental health conversations. In this work, we leveraged a natural experiment, occurring across 200,000 messages from 7,000 conversations hosted on a mental health mobile application, to evaluate the effects of moderation on online mental health discussions. We found that participation in group mental health discussions led to improvements in psychological perspective, and that these improvements were larger in moderated conversations. The presence of a moderator increased user engagement, encouraged users to discuss negative emotions more candidly, and dramatically reduced bad behavior among chat participants. Moderation also encouraged stronger linguistic coordination, which is indicative of trust building. In addition, moderators who remained active in conversations were especially successful in keeping conversations on topic. Our findings suggest that moderation can serve as a valuable tool to improve the efficacy and safety of online mental health conversations. Based on these findings, we discuss implications and trade-offs involved in designing effective online spaces for mental health support.

* 13 pages, 12 figures, 3 tables
