Lu Qi

Decouple and Track: Benchmarking and Improving Video Diffusion Transformers for Motion Transfer

Mar 21, 2025

Unified Dense Prediction of Video Diffusion

Mar 12, 2025

Controllable 3D Outdoor Scene Generation via Scene Graphs

Mar 10, 2025

RobuRCDet: Enhancing Robustness of Radar-Camera Fusion in Bird's Eye View for 3D Object Detection

Feb 18, 2025

UMC: Unified Resilient Controller for Legged Robots with Joint Malfunctions

Feb 05, 2025

VideoAuteur: Towards Long Narrative Video Generation

Jan 10, 2025

Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs

Jan 08, 2025

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Jan 07, 2025

VinT-6D: A Large-Scale Object-in-hand Dataset from Vision, Touch and Proprioception

Jan 06, 2025

SyncVIS: Synchronized Video Instance Segmentation

Dec 01, 2024