Xinhao Li

Online Video Understanding: A Comprehensive Benchmark and Memory-Augmented Method

Dec 31, 2024

VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling

Dec 31, 2024

Fine-grained Video-Text Retrieval: A New Benchmark and Method

Dec 31, 2024

Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment

Dec 26, 2024

Towards Open-Vocabulary Video Semantic Segmentation

Dec 12, 2024

TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning

Oct 25, 2024

VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model

Jul 09, 2024

Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Jul 05, 2024

Modelling the 5G Energy Consumption using Real-world Data: Energy Fingerprint is All You Need

Jun 13, 2024

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding

Mar 22, 2024