Haodong Duan

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Dec 12, 2024
Via arXiv

MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs

Nov 22, 2024
Via arXiv

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

Oct 23, 2024
Via arXiv

CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution

Oct 21, 2024
Via arXiv

ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs

Oct 16, 2024
Via arXiv

GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI

Aug 06, 2024
Via arXiv

VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models

Jul 16, 2024
Via arXiv

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Jul 03, 2024
Via arXiv

MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning

Jun 25, 2024
Via arXiv

MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding

Jun 20, 2024
Via arXiv