Aishwarya Agrawal

VisMin: Visual Minimal-Change Understanding

Jul 23, 2024
Via arXiv

Benchmarking Vision Language Models for Cultural Understanding

Jul 15, 2024
Via arXiv

Decompose and Compare Consistency: Measuring VLMs' Answer Reliability via Task-Decomposition Consistency Comparison

Jul 10, 2024
Via arXiv

An Introduction to Vision-Language Modeling

May 27, 2024
Via arXiv

Improving Text-to-Image Consistency via Automatic Prompt Optimization

Mar 26, 2024
Via arXiv

MoqaGPT: Zero-Shot Multi-modal Open-domain Question Answering with Large Language Model

Oct 20, 2023
Via arXiv

Improving Automatic VQA Evaluation Using Large Language Models

Oct 04, 2023
Via arXiv

Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding

Jul 02, 2023
Via arXiv

Investigating Prompting Techniques for Zero- and Few-Shot Visual Question Answering

Jun 16, 2023
Via arXiv

An Examination of the Robustness of Reference-Free Image Captioning Evaluation Metrics

May 24, 2023
Via arXiv