Pankaj Wasnik

Faster Machine Translation Ensembling with Reinforcement Learning and Competitive Correction

Jan 25, 2025
Via arXiv

Open-Set Object Detection By Aligning Known Class Representations

Dec 30, 2024
Via arXiv

EmoReg: Directional Latent Vector Modeling for Emotional Intensity Regularization in Diffusion-based Voice Conversion

Dec 29, 2024
Via arXiv

Enhancing Entertainment Translation for Indian Languages using Adaptive Context, Style and LLMs

Dec 29, 2024
Via arXiv

Cross-Modal Fusion and Attention Mechanism for Weakly Supervised Video Anomaly Detection

Dec 29, 2024
Via arXiv

Enhancing Whisper's Accuracy and Speed for Indian Languages through Prompt-Tuning and Tokenization

Dec 27, 2024
Via arXiv

Beyond Few-shot Object Detection: A Detailed Survey

Aug 26, 2024
Via arXiv

DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing

Jun 13, 2024
Via arXiv

VECL-TTS: Voice identity and Emotional style controllable Cross-Lingual Text-to-Speech

Jun 12, 2024
Via arXiv

Efficient infusion of self-supervised representations in Automatic Speech Recognition

Apr 19, 2024
Via arXiv