Fanqing Meng

Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation

Oct 07, 2024
Via arXiv

MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models

Aug 05, 2024
Via arXiv

PhyBench: A Physical Commonsense Benchmark for Evaluating Text-to-Image Models

Jun 17, 2024
Via arXiv

GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices

Jun 12, 2024
Via arXiv

MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI

Apr 24, 2024
Via arXiv

ChartAssisstant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning

Jan 10, 2024
Via arXiv

Foundation Model is Efficient Multimodal Multitask Model Selector

Aug 11, 2023
Via arXiv

Tiny LVLM-eHub: Early Multimodal Experiments with Bard

Aug 07, 2023
Via arXiv

LVLM-eHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models

Jun 15, 2023
Via arXiv