Anurag Kumar

High-Quality Sound Separation Across Diverse Categories via Visually-Guided Generative Modeling

Sep 26, 2025
Via arXiv

Interspeech 2025 URGENT Speech Enhancement Challenge

May 29, 2025
Via arXiv

Learning to Highlight Audio by Watching Movies

May 17, 2025
Via arXiv

Hearing Anywhere in Any Environment

Apr 14, 2025
Via arXiv

Quickest change detection for UAV-based sensing

Apr 10, 2025
Via arXiv

Efficient Audiovisual Speech Processing via MUTUD: Multimodal Training and Unimodal Deployment

Jan 30, 2025
Via arXiv

SEAL: Speaker Error Correction using Acoustic-conditioned Large Language Models

Jan 14, 2025
Via arXiv

Bridging Context Gaps: Enhancing Comprehension in Long-Form Social Conversations Through Contextualized Excerpts

Dec 28, 2024
Via arXiv

Scaling Concept With Text-Guided Diffusion Models

Oct 31, 2024
Via arXiv

Using RLHF to align speech enhancement approaches to mean-opinion quality scores

Oct 17, 2024
Via arXiv