Sayan Nag

SafaRi: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation

Jul 02, 2024
Via arXiv

Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time

Jul 01, 2024
Via arXiv

MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models

Jun 07, 2024
Via arXiv

Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model

Dec 19, 2023
Via arXiv

APoLLo: Unified Adapter and Prompt Learning for Vision Language Models

Dec 04, 2023
Via arXiv

EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone

Jul 11, 2023
Via arXiv

BeAts: Bengali Speech Acts Recognition using Multimodal Attention Fusion

Jun 05, 2023
Via arXiv

An Evaluation of Non-Contrastive Self-Supervised Learning for Federated Medical Image Analysis

Mar 09, 2023
Via arXiv

Exploring Self-Supervised Representation Learning For Low-Resource Medical Image Analysis

Mar 03, 2023
Via arXiv

IDEAL: Improved DEnse locAL Contrastive Learning for Semi-Supervised Medical Image Segmentation

Oct 26, 2022
Via arXiv