Sirui Zhao

T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs

Dec 02, 2024
Via arXiv

MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs

Nov 22, 2024
Via arXiv

Exploring User Retrieval Integration towards Large Language Models for Cross-Domain Sequential Recommendation

Jun 05, 2024
Via arXiv

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

May 31, 2024
Via arXiv

Dataset Regeneration for Sequential Recommendation

May 28, 2024
Via arXiv

Learning Partially Aligned Item Representation for Cross-Domain Sequential Recommendation

May 21, 2024
Via arXiv

A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise

Dec 20, 2023
Via arXiv

APGL4SR: A Generic Framework with Adaptive and Personalized Global Collaborative Information in Sequential Recommendation

Nov 06, 2023
Via arXiv

Woodpecker: Hallucination Correction for Multimodal Large Language Models

Oct 24, 2023
Via arXiv

A Solution to CVPR'2023 AQTC Challenge: Video Alignment for Multi-Step Inference

Jun 26, 2023
Via arXiv