Erhan Bas

Let's Go Shopping -- Web-Scale Image-Text Dataset for Visual Concept Understanding

Jan 09, 2024

On the Performance of Multimodal Language Models

Oct 04, 2023

Detecting and Preventing Hallucinations in Large Vision Language Models

Aug 18, 2023

Masked Vision and Language Modeling for Multi-modal Representation Learning

Aug 03, 2022

X-DETR: A Versatile Architecture for Instance-wise Vision-Language Tasks

Apr 12, 2022