Picture for Mingyu Cui

Mingyu Cui

Structured Speaker-Deficiency Adaptation of Foundation Models for Dysarthric and Elderly Speech Recognition

Add code
Dec 25, 2024
Viaarxiv icon

A Comparative Study of Discrete Speech Tokens for Semantic-Related Tasks with Large Language Models

Add code
Nov 13, 2024
Viaarxiv icon

Improving Grapheme-to-Phoneme Conversion through In-Context Knowledge Retrieval with Large Language Models

Add code
Nov 12, 2024
Viaarxiv icon

Exploring SSL Discrete Tokens for Multilingual ASR

Add code
Sep 13, 2024
Figure 1 for Exploring SSL Discrete Tokens for Multilingual ASR
Figure 2 for Exploring SSL Discrete Tokens for Multilingual ASR
Figure 3 for Exploring SSL Discrete Tokens for Multilingual ASR
Figure 4 for Exploring SSL Discrete Tokens for Multilingual ASR
Viaarxiv icon

Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR

Add code
Sep 13, 2024
Figure 1 for Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR
Figure 2 for Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR
Figure 3 for Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR
Figure 4 for Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR
Viaarxiv icon

GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement

Add code
Jun 17, 2024
Viaarxiv icon

One-pass Multiple Conformer and Foundation Speech Systems Compression and Quantization Using An All-in-one Neural Model

Add code
Jun 14, 2024
Figure 1 for One-pass Multiple Conformer and Foundation Speech Systems Compression and Quantization Using An All-in-one Neural Model
Figure 2 for One-pass Multiple Conformer and Foundation Speech Systems Compression and Quantization Using An All-in-one Neural Model
Figure 3 for One-pass Multiple Conformer and Foundation Speech Systems Compression and Quantization Using An All-in-one Neural Model
Figure 4 for One-pass Multiple Conformer and Foundation Speech Systems Compression and Quantization Using An All-in-one Neural Model
Viaarxiv icon

Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask

Add code
Jun 14, 2024
Figure 1 for Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask
Figure 2 for Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask
Figure 3 for Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask
Figure 4 for Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask
Viaarxiv icon

Cross-Speaker Encoding Network for Multi-Talker Speech Recognition

Add code
Jan 08, 2024
Viaarxiv icon

Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition

Add code
Jul 06, 2023
Figure 1 for Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition
Figure 2 for Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition
Figure 3 for Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition
Figure 4 for Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition
Viaarxiv icon