Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:Enhancing Dysarthric Speech Recognition for Unseen Speakers via Prototype-Based Adaptation

Jul 26, 2024

Shiyao Wang, Shiwan Zhao, Jiaming Zhou, Aobo Kong, Yong Qin

Figure 1 for Enhancing Dysarthric Speech Recognition for Unseen Speakers via Prototype-Based Adaptation

Figure 2 for Enhancing Dysarthric Speech Recognition for Unseen Speakers via Prototype-Based Adaptation

Figure 3 for Enhancing Dysarthric Speech Recognition for Unseen Speakers via Prototype-Based Adaptation

Figure 4 for Enhancing Dysarthric Speech Recognition for Unseen Speakers via Prototype-Based Adaptation

Share this with someone who'll enjoy it:

Abstract:Dysarthric speech recognition (DSR) presents a formidable challenge due to inherent inter-speaker variability, leading to severe performance degradation when applying DSR models to new dysarthric speakers. Traditional speaker adaptation methodologies typically involve fine-tuning models for each speaker, but this strategy is cost-prohibitive and inconvenient for disabled users, requiring substantial data collection. To address this issue, we introduce a prototype-based approach that markedly improves DSR performance for unseen dysarthric speakers without additional fine-tuning. Our method employs a feature extractor trained with HuBERT to produce per-word prototypes that encapsulate the characteristics of previously unseen speakers. These prototypes serve as the basis for classification. Additionally, we incorporate supervised contrastive learning to refine feature extraction. By enhancing representation quality, we further improve DSR performance, enabling effective personalized DSR. We release our code at https://github.com/NKU-HLT/PB-DSR.

* INTERSPEECH 2024 * accepted by Interspeech 2024

View paper on

Share this with someone who'll enjoy it:

Title:Enhancing Dysarthric Speech Recognition for Unseen Speakers via Prototype-Based Adaptation

Paper and Code