Picture for Yuhan Dai

Yuhan Dai

T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs

Add code
Dec 02, 2024
Figure 1 for T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs
Figure 2 for T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs
Figure 3 for T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs
Figure 4 for T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs
Viaarxiv icon

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

Add code
May 31, 2024
Figure 1 for Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Figure 2 for Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Figure 3 for Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Figure 4 for Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Viaarxiv icon

Simple and Scalable Nearest Neighbor Machine Translation

Add code
Feb 23, 2023
Viaarxiv icon