Qiushi Zhu

DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech Synthesis

Oct 17, 2024
Via arXiv

Listen Again and Choose the Right Answer: A New Paradigm for Automatic Speech Recognition with Large Language Models

May 16, 2024
Via arXiv

Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation

Jan 07, 2024
Via arXiv

Rep2wav: Noise Robust text-to-speech Using self-supervised representations

Sep 04, 2023
Via arXiv

Noise-aware Speech Enhancement using Diffusion Probabilistic Model

Jul 16, 2023
Via arXiv

Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition

Jun 18, 2023
Via arXiv

Eeg2vec: Self-Supervised Electroencephalographic Representation Learning

May 23, 2023
Via arXiv

Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition

May 16, 2023
Via arXiv

Wav2code: Restore Clean Speech Representations via Codebook Lookup for Noise-Robust ASR

Apr 23, 2023
Via arXiv

Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition

Feb 22, 2023
Via arXiv