Huchuan Lu

IDEA: Inverted Text with Cooperative Deformable Aggregation for Multi-modal Object Re-Identification

Mar 13, 2025
Via arXiv

CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation

Feb 12, 2025
Via arXiv

KARST: Multi-Kernel Kronecker Adaptation with Re-Scaling Transmission for Visual Classification

Feb 10, 2025
Via arXiv

EVEv2: Improved Baselines for Encoder-Free Vision-Language Models

Feb 10, 2025
Via arXiv

The Devil is in Temporal Token: High Quality Video Reasoning Segmentation

Jan 15, 2025
Via arXiv

3UR-LLM: An End-to-End Multimodal Large Language Model for 3D Scene Understanding

Jan 14, 2025
Via arXiv

AVS-Mamba: Exploring Temporal and Multi-modal Mamba for Audio-Visual Segmentation

Jan 14, 2025
Via arXiv

Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation

Jan 14, 2025
Via arXiv

ReNeg: Learning Negative Embedding with Reward Guidance

Dec 27, 2024
Via arXiv

SUTrack: Towards Simple and Unified Single Object Tracking

Dec 26, 2024
Via arXiv