Limin Wang

VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning

Apr 10, 2025
Via arXiv

DDT: Decoupled Diffusion Transformer

Apr 09, 2025
Via arXiv

MiLA: Multi-view Intensive-fidelity Long-term Video Generation World Model for Autonomous Driving

Mar 20, 2025
Via arXiv

Make Your Training Flexible: Towards Deployment-Efficient Video Models

Mar 18, 2025
Via arXiv

History-Aware Transformation of ReID Features for Multiple Object Tracking

Mar 16, 2025
Via arXiv

VideoMAP: Toward Scalable Mamba-based Video Autoregressive Pretraining

Mar 16, 2025
Via arXiv

An Egocentric Vision-Language Model based Portable Real-time Smart Assistant

Mar 06, 2025
Via arXiv

Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning

Mar 02, 2025
Via arXiv

Learning Human Skill Generators at Key-Step Levels

Feb 12, 2025
Via arXiv

InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling

Jan 21, 2025
Via arXiv