Sreyan Ghosh

Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence

Apr 27, 2026

Video-Robin: Autoregressive Diffusion Planning for Intent-Grounded Video-to-Music Generation

Apr 19, 2026

Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music

Apr 13, 2026

Do Audio-Visual Large Language Models Really See and Hear?

Apr 03, 2026

MMOU: A Massive Multi-Task Omni Understanding and Reasoning Benchmark for Long and Complex Real-World Videos

Mar 14, 2026

Speech-Hands: A Self-Reflection Voice Agentic Approach to Speech Recognition and Audio Reasoning with Omni Perception

Jan 14, 2026

Music Flamingo: Scaling Music Understanding in Audio Language Models

Nov 13, 2025

SPUR: A Plug-and-Play Framework for Integrating Spatial Audio Understanding and Reasoning into Large Audio-Language Models

Nov 13, 2025

Multi-Domain Audio Question Answering Toward Acoustic Content Reasoning in The DCASE 2025 Challenge

May 12, 2025

ProSE: Diffusion Priors for Speech Enhancement

Mar 09, 2025