Haobo Yuan

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Jan 07, 2025

LLAVADI: What Matters For Multimodal Large Language Models Distillation

Jul 28, 2024

OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding

Jun 27, 2024

Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model

Jun 27, 2024

Point Cloud Mamba: Point Cloud Learning via State Space Model

Mar 01, 2024

OMG-Seg: Is One Model Good Enough For All Segmentation?

Jan 18, 2024

RAP-SAM: Towards Real-Time All-Purpose Segment Anything

Jan 18, 2024

Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively

Jan 05, 2024

Neural Collapse Terminus: A Unified Solution for Class Incremental Learning and Its Variants

Aug 03, 2023

Towards Open Vocabulary Learning: A Survey

Jul 06, 2023