Limin Wang

Motion-Aware Generative Frame Interpolation

Jan 07, 2025
Via arXiv

VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling

Dec 31, 2024
Via arXiv

Fine-grained Video-Text Retrieval: A New Benchmark and Method

Dec 31, 2024
Via arXiv

Online Video Understanding: A Comprehensive Benchmark and Memory-Augmented Method

Dec 31, 2024
Via arXiv

Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model

Dec 30, 2024
Via arXiv

A Large-Scale Study on Video Action Dataset Condensation

Dec 30, 2024
Via arXiv

Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment

Dec 26, 2024
Via arXiv

LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis

Dec 19, 2024
Via arXiv

CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding

Dec 16, 2024
Via arXiv

Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel

Dec 11, 2024
Via arXiv