Picture for Yifan Mai

Yifan Mai

Image2Struct: Benchmarking Structure Extraction for Vision-Language Models

Add code
Oct 29, 2024
Figure 1 for Image2Struct: Benchmarking Structure Extraction for Vision-Language Models
Figure 2 for Image2Struct: Benchmarking Structure Extraction for Vision-Language Models
Figure 3 for Image2Struct: Benchmarking Structure Extraction for Vision-Language Models
Figure 4 for Image2Struct: Benchmarking Structure Extraction for Vision-Language Models
Viaarxiv icon

Language model developers should report train-test overlap

Add code
Oct 10, 2024
Viaarxiv icon

VHELM: A Holistic Evaluation of Vision Language Models

Add code
Oct 09, 2024
Figure 1 for VHELM: A Holistic Evaluation of Vision Language Models
Figure 2 for VHELM: A Holistic Evaluation of Vision Language Models
Figure 3 for VHELM: A Holistic Evaluation of Vision Language Models
Figure 4 for VHELM: A Holistic Evaluation of Vision Language Models
Viaarxiv icon

Introducing v0.5 of the AI Safety Benchmark from MLCommons

Add code
Apr 18, 2024
Figure 1 for Introducing v0.5 of the AI Safety Benchmark from MLCommons
Figure 2 for Introducing v0.5 of the AI Safety Benchmark from MLCommons
Figure 3 for Introducing v0.5 of the AI Safety Benchmark from MLCommons
Figure 4 for Introducing v0.5 of the AI Safety Benchmark from MLCommons
Viaarxiv icon

Holistic Evaluation of Text-To-Image Models

Add code
Nov 07, 2023
Viaarxiv icon

Holistic Evaluation of Language Models

Add code
Nov 16, 2022
Figure 1 for Holistic Evaluation of Language Models
Figure 2 for Holistic Evaluation of Language Models
Figure 3 for Holistic Evaluation of Language Models
Figure 4 for Holistic Evaluation of Language Models
Viaarxiv icon