Yong Qin

ChildMandarin: A Comprehensive Mandarin Speech Dataset for Young Children Aged 3-5

Sep 27, 2024

M2R-Whisper: Multi-stage and Multi-scale Retrieval Augmentation for Enhancing Whisper

Sep 18, 2024

Findings of the 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge

Sep 09, 2024

PB-LRDWWS System for the SLT 2024 Low-Resource Dysarthria Wake-Up Word Spotting Challenge

Sep 07, 2024

Uncertainty-Aware Mean Opinion Score Prediction

Aug 23, 2024

Iterative Prototype Refinement for Ambiguous Speech Emotion Recognition

Aug 01, 2024

Enhancing Dysarthric Speech Recognition for Unseen Speakers via Prototype-Based Adaptation

Jul 26, 2024

Enhancing Emotion Recognition in Incomplete Data: A Novel Cross-Modal Alignment, Reconstruction, and Refinement Framework

Jul 12, 2024

Self-Prompt Tuning: Enable Autonomous Role-Playing in LLMs

Jul 12, 2024

LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation

Jun 12, 2024