Shukang Yin

T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs

Dec 02, 2024

MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs

Nov 22, 2024

Woodpecker: Hallucination Correction for Multimodal Large Language Models

Oct 24, 2023

A Survey on Multimodal Large Language Models

Jun 23, 2023

AU-aware graph convolutional network for Macro- and Micro-expression spotting

Mar 16, 2023