Ya Li

Mel-Refine: A Plug-and-Play Approach to Refine Mel-Spectrogram in Audio Generation

Dec 11, 2024
Via arXiv

Enhancing Modal Fusion by Alignment and Label Matching for Multimodal Emotion Recognition

Aug 18, 2024
Via arXiv

SPA-SVC: Self-supervised Pitch Augmentation for Singing Voice Conversion

Jun 09, 2024
Via arXiv

Improving Audio Codec-based Zero-Shot Text-to-Speech Synthesis with Multi-Modal Context and Large Language Model

Jun 06, 2024
Via arXiv

Retrieval Augmented Generation in Prompt-based Text-to-Speech Synthesis with Context-Aware Contrastive Language-Audio Pretraining

Jun 06, 2024
Via arXiv

Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation

Jan 02, 2024
Via arXiv

CRB Minimization for RIS-aided mmWave Integrated Sensing and Communications

Jan 02, 2024
Via arXiv

Frame-level emotional state alignment method for speech emotion recognition

Dec 27, 2023
Via arXiv

Hypergraph Enhanced Knowledge Tree Prompt Learning for Next-Basket Recommendation

Dec 26, 2023
Via arXiv

CONCSS: Contrastive-based Context Comprehension for Dialogue-appropriate Prosody in Conversational Speech Synthesis

Dec 16, 2023
Via arXiv