Picture for Zhoufutu Wen

Zhoufutu Wen

IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs

Add code
Apr 21, 2025
Viaarxiv icon

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Add code
Feb 20, 2025
Viaarxiv icon

SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models

Add code
Feb 18, 2025
Viaarxiv icon

CryptoX : Compositional Reasoning Evaluation of Large Language Models

Add code
Feb 08, 2025
Viaarxiv icon

Enhancing Dynamic Image Advertising with Vision-Language Pre-training

Add code
Jun 25, 2023
Viaarxiv icon