Mengzhe Geng

Phone-purity Guided Discrete Tokens for Dysarthric Speech Recognition

Jan 08, 2025

Effective and Efficient Mixed Precision Quantization of Speech Foundation Models

Jan 07, 2025

Structured Speaker-Deficiency Adaptation of Foundation Models for Dysarthric and Elderly Speech Recognition

Dec 25, 2024

Homogeneous Speaker Features for On-the-Fly Dysarthric and Elderly Speaker Adaptation

Jul 08, 2024

Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition

Jun 14, 2024

One-pass Multiple Conformer and Foundation Speech Systems Compression and Quantization Using An All-in-one Neural Model

Jun 14, 2024

Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask

Jun 14, 2024

An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios

Jun 13, 2024

Enhancing Pre-trained ASR System Fine-tuning for Dysarthric Speech Recognition using Adversarial Data Augmentation

Jan 01, 2024

A Survey of Reasoning with Foundation Models

Dec 26, 2023