Yinghao Aaron Li

StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion

Sep 16, 2024
Via arXiv

Style-Talker: Finetuning Audio Language Model and Style-Based Text-to-Speech Model for Fast Spoken Dialogue Generation

Aug 13, 2024
Via arXiv

Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis

Jul 13, 2024
Via arXiv

Listen, Chat, and Edit: Text-Guided Soundscape Modification for Enhanced Auditory Experience

Feb 06, 2024
Via arXiv

Contextual Feature Extraction Hierarchies Converge in Large Language Models and the Brain

Jan 31, 2024
Via arXiv

Exploring Self-Supervised Contrastive Learning of Spatial Sound Event Representation

Sep 27, 2023
Via arXiv

HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform

Sep 18, 2023
Via arXiv

SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs

Jul 18, 2023
Via arXiv

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Jun 13, 2023
Via arXiv

DeCoR: Defy Knowledge Forgetting by Predicting Earlier Audio Codes

May 29, 2023
Via arXiv