Rabiul Awal

BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks

Dec 05, 2024

Introducing Milabench: Benchmarking Accelerators for AI

Nov 18, 2024

VisMin: Visual Minimal-Change Understanding

Jul 23, 2024

Benchmarking Vision Language Models for Cultural Understanding

Jul 15, 2024

Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding

Jul 02, 2023

Investigating Prompting Techniques for Zero- and Few-Shot Visual Question Answering

Jun 16, 2023