Shicheng Li

PunchBench: Benchmarking MLLMs in Multimodal Punchline Comprehension

Dec 16, 2024
Via arXiv

TempCompass: Do Video LLMs Really Understand Videos?

Mar 01, 2024

TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

Dec 04, 2023

VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models

Nov 29, 2023

RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge

Nov 14, 2023

FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation

Nov 08, 2023

TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding

Oct 29, 2023

M$^3$IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning

Jun 08, 2023

Optimizing Energy Efficiency in Metro Systems Under Uncertainty Disturbances Using Reinforcement Learning

Apr 27, 2023

CAPT: Contrastive Pre-Training for Learning Denoised Sequence Representations

Oct 30, 2020