Zhoufutu Wen

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Feb 20, 2025
Via arXiv

SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models

Feb 18, 2025
Via arXiv

CryptoX: Compositional Reasoning Evaluation of Large Language Models

Feb 08, 2025
Via arXiv

Enhancing Dynamic Image Advertising with Vision-Language Pre-training

Jun 25, 2023
Via arXiv