Chongjia Ni

MinMo: A Multimodal Large Language Model for Seamless Voice Interaction

Jan 10, 2025
Via arXiv

Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions

Sep 25, 2024
Via arXiv

Phonetic Enhanced Language Modeling for Text-to-Speech Synthesis

Jun 04, 2024
Via arXiv

MossFormer2: Combining Transformer and RNN-Free Recurrent Network for Enhanced Time-Domain Monaural Speech Separation

Dec 19, 2023
Via arXiv

SPGM: Prioritizing Local Features for enhanced speech separation performance

Sep 22, 2023
Via arXiv

Are Soft Prompts Good Zero-shot Learners for Speech Recognition?

Sep 18, 2023
Via arXiv

ACA-Net: Towards Lightweight Speaker Verification using Asymmetric Cross Attention

May 20, 2023
Via arXiv

Contrastive Speech Mixup for Low-resource Keyword Spotting

May 02, 2023
Via arXiv

deHuBERT: Disentangling Noise in a Self-supervised Model for Robust Speech Recognition

Feb 28, 2023
Via arXiv

Cloud-based Automatic Speech Recognition Systems for Southeast Asian Languages

Oct 07, 2022
Via arXiv