Huchuan Lu

Learning Universal Features for Generalizable Image Forgery Localization

Apr 10, 2025

EffOWT: Transfer Visual Language Models to Open-World Tracking Efficiently and Effectively

Apr 09, 2025

DefMamba: Deformable Visual State Space Model

Apr 08, 2025

EvMic: Event-based Non-contact sound recovery from effective spatial-temporal modeling

Apr 03, 2025

LATex: Leveraging Attribute-based Text Knowledge for Aerial-Ground Person Re-Identification

Mar 31, 2025

Towards Physically Plausible Video Generation via VLM Planning

Mar 30, 2025

Mono2Stereo: A Benchmark and Empirical Study for Stereo Conversion

Mar 28, 2025

IDEA: Inverted Text with Cooperative Deformable Aggregation for Multi-modal Object Re-Identification

Mar 13, 2025

CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation

Feb 12, 2025

EVEv2: Improved Baselines for Encoder-Free Vision-Language Models

Feb 10, 2025