Aishwarya Agrawal

Assessing and Learning Alignment of Unimodal Vision and Language Models

Dec 05, 2024

VisMin: Visual Minimal-Change Understanding

Jul 23, 2024

Benchmarking Vision Language Models for Cultural Understanding

Jul 15, 2024

Decompose and Compare Consistency: Measuring VLMs' Answer Reliability via Task-Decomposition Consistency Comparison

Jul 10, 2024

An Introduction to Vision-Language Modeling

May 27, 2024

Improving Text-to-Image Consistency via Automatic Prompt Optimization

Mar 26, 2024

MoqaGPT : Zero-Shot Multi-modal Open-domain Question Answering with Large Language Model

Oct 20, 2023

Improving Automatic VQA Evaluation Using Large Language Models

Oct 04, 2023

Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding

Jul 02, 2023

Investigating Prompting Techniques for Zero- and Few-Shot Visual Question Answering

Jun 16, 2023