Sanath Narayan

From Unimodal to Multimodal: Scaling up Projectors to Align Modalities

Sep 28, 2024

Falcon2-11B Technical Report

Jul 20, 2024

Open-Vocabulary Temporal Action Localization using Multimodal Guidance

Jun 21, 2024

Efficient 3D-Aware Facial Image Editing via Attribute-Specific Prompt Learning

Jun 06, 2024

Multi-modal Generation via Cross-Modal In-Context Learning

May 28, 2024

ViSpeR: Multilingual Audio-Visual Speech Recognition

May 27, 2024

Do Vision and Language Encoders Represent the World Similarly?

Jan 10, 2024

Do VSR Models Generalize Beyond LRS3?

Nov 23, 2023

Lip2Vec: Efficient and Robust Visual Speech Recognition via Latent-to-Latent Visual to Audio Representation Mapping

Aug 11, 2023

Remote Sensing Change Detection With Transformers Trained from Scratch

Apr 13, 2023