Lu Qi

VideoAuteur: Towards Long Narrative Video Generation

Jan 10, 2025
Via arXiv

Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs

Jan 08, 2025
Via arXiv

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Jan 07, 2025
Via arXiv

VinT-6D: A Large-Scale Object-in-hand Dataset from Vision, Touch and Proprioception

Jan 06, 2025
Via arXiv

SyncVIS: Synchronized Video Instance Segmentation

Dec 01, 2024
Via arXiv

RelationBooth: Towards Relation-Aware Customized Object Generation

Oct 30, 2024
Via arXiv

Layout-your-3D: Controllable and Precise 3D Generation with 2D Blueprint

Oct 20, 2024
Via arXiv

PredFormer: Transformers Are Effective Spatial-Temporal Predictive Learners

Oct 07, 2024
Via arXiv

Re-boosting Self-Collaboration Parallel Prompt GAN for Unsupervised Image Restoration

Aug 17, 2024
Via arXiv

LLAVADI: What Matters For Multimodal Large Language Models Distillation

Jul 28, 2024
Via arXiv