Nirmesh Shah

DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing

Jun 13, 2024
Via arXiv

VECL-TTS: Voice identity and Emotional style controllable Cross-Lingual Text-to-Speech

Jun 12, 2024
Via arXiv

Nonparallel Emotional Voice Conversion For Unseen Speaker-Emotion Pairs Using Dual Domain Adversarial Network & Virtual Domain Pairing

Feb 21, 2023
Via arXiv

M2FNet: Multi-modal Fusion Network for Emotion Recognition in Conversation

Jun 05, 2022
Via arXiv