Sanjoy Chowdhury

Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time

Jul 01, 2024
Via arXiv

MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models

Jun 07, 2024
Via arXiv

Can LLMs Generate Human-Like Wayfinding Instructions? Towards Platform-Agnostic Embodied Instruction Synthesis

Mar 31, 2024
Via arXiv

APoLLo: Unified Adapter and Prompt Learning for Vision Language Models

Dec 04, 2023
Via arXiv

AdVerb: Visually Guided Audio Dereverberation

Aug 23, 2023
Via arXiv

ASPIRE: Language-Guided Augmentation for Robust Image Classification

Aug 19, 2023
Via arXiv

Measured Albedo in the Wild: Filling the Gap in Intrinsics Evaluation

Jun 29, 2023
Via arXiv