Yichong Leng

Qwen2-Audio Technical Report

Jul 15, 2024
Via arXiv

Sentence-Level or Token-Level? A Comprehensive Study on Knowledge Distillation

Apr 23, 2024
Via arXiv

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

Mar 05, 2024
Via arXiv

AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension

Feb 12, 2024
Via arXiv

PromptTTS 2: Describing and Generating Voices with Text Prompt

Sep 05, 2023
Via arXiv

Extract and Attend: Improving Entity Translation in Neural Machine Translation

Jun 04, 2023
Via arXiv

NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers

May 04, 2023
Via arXiv

ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech

Dec 30, 2022
Via arXiv

SoftCorrect: Error Correction with Soft Detection for Automatic Speech Recognition

Dec 02, 2022
Via arXiv

Mask the Correct Tokens: An Embarrassingly Simple Approach for Error Correction

Nov 23, 2022
Via arXiv