Wenjun Huang

Tell Me What to Track: Infusing Robust Language Guidance for Enhanced Referring Multi-Object Tracking

Dec 17, 2024

MAL: Cluster-Masked and Multi-Task Pretraining for Enhanced xLSTM Vision Performance

Dec 14, 2024

Multi-cam Multi-map Visual Inertial Localization: System, Validation and Dataset

Dec 05, 2024

Expanding Event Modality Applications through a Robust CLIP-Based Encoder

Dec 04, 2024

EMMA: Empowering Multi-modal Mamba with Structural and Hierarchical Alignment

Oct 08, 2024

VLTP: Vision-Language Guided Token Pruning for Task-Oriented Segmentation

Sep 13, 2024

3D-LSPTM: An Automatic Framework with 3D-Large-Scale Pretrained Model for Laryngeal Cancer Detection Using Laryngoscopic Videos

Sep 02, 2024

Recoverable Anonymization for Pose Estimation: A Privacy-Enhancing Approach

Sep 01, 2024

ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2

Jul 29, 2024

EcoSense: Energy-Efficient Intelligent Sensing for In-Shore Ship Detection through Edge-Cloud Collaboration

Mar 26, 2024