Manthan Thakker

Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech

Jul 17, 2024

E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS

Jun 26, 2024

An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS

Jun 09, 2024

Total-Duration-Aware Duration Modeling for Text-to-Speech Systems

Jun 06, 2024

Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like

Feb 12, 2024

SpeechX: Neural Codec Language Model as a Versatile Speech Transformer

Aug 14, 2023

Fast Real-time Personalized Speech Enhancement: End-to-End Enhancement Network (E3Net) and Knowledge Distillation

Apr 02, 2022

ICASSP 2022 Deep Noise Suppression Challenge

Feb 27, 2022