Ashishkumar Gudmalwar

EmoReg: Directional Latent Vector Modeling for Emotional Intensity Regularization in Diffusion-based Voice Conversion

Dec 29, 2024
Via arXiv

DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing

Jun 13, 2024
Via arXiv

VECL-TTS: Voice identity and Emotional style controllable Cross-Lingual Text-to-Speech

Jun 12, 2024
Via arXiv