Picture for Boya Wu

Boya Wu

Emu3: Next-Token Prediction is All You Need

Add code
Sep 27, 2024
Viaarxiv icon

MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding

Add code
Jun 06, 2024
Figure 1 for MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding
Figure 2 for MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding
Figure 3 for MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding
Figure 4 for MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding
Viaarxiv icon

Efficient Multimodal Learning from Data-centric Perspective

Add code
Feb 18, 2024
Viaarxiv icon

SVIT: Scaling up Visual Instruction Tuning

Add code
Jul 09, 2023
Viaarxiv icon