Picture for Shicheng Li

Shicheng Li

Lumina-Video: Efficient and Flexible Video Generation with Multi-scale Next-DiT

Add code
Feb 10, 2025
Viaarxiv icon

PunchBench: Benchmarking MLLMs in Multimodal Punchline Comprehension

Add code
Dec 16, 2024
Figure 1 for PunchBench: Benchmarking MLLMs in Multimodal Punchline Comprehension
Figure 2 for PunchBench: Benchmarking MLLMs in Multimodal Punchline Comprehension
Figure 3 for PunchBench: Benchmarking MLLMs in Multimodal Punchline Comprehension
Figure 4 for PunchBench: Benchmarking MLLMs in Multimodal Punchline Comprehension
Viaarxiv icon

TempCompass: Do Video LLMs Really Understand Videos?

Add code
Mar 01, 2024
Figure 1 for TempCompass: Do Video LLMs Really Understand Videos?
Figure 2 for TempCompass: Do Video LLMs Really Understand Videos?
Figure 3 for TempCompass: Do Video LLMs Really Understand Videos?
Figure 4 for TempCompass: Do Video LLMs Really Understand Videos?
Viaarxiv icon

TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

Add code
Dec 04, 2023
Figure 1 for TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
Figure 2 for TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
Figure 3 for TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
Figure 4 for TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
Viaarxiv icon

VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models

Add code
Nov 29, 2023
Figure 1 for VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models
Figure 2 for VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models
Figure 3 for VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models
Figure 4 for VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models
Viaarxiv icon

RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge

Add code
Nov 14, 2023
Figure 1 for RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge
Figure 2 for RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge
Figure 3 for RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge
Figure 4 for RECALL: A Benchmark for LLMs Robustness against External Counterfactual Knowledge
Viaarxiv icon

FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation

Add code
Nov 08, 2023
Figure 1 for FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation
Figure 2 for FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation
Figure 3 for FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation
Figure 4 for FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation
Viaarxiv icon

TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding

Add code
Oct 29, 2023
Viaarxiv icon

M$^3$IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning

Add code
Jun 08, 2023
Viaarxiv icon

Optimizing Energy Efficiency in Metro Systems Under Uncertainty Disturbances Using Reinforcement Learning

Add code
Apr 27, 2023
Viaarxiv icon