Haobo Yuan

PVUW 2025 Challenge Report: Advances in Pixel-level Understanding of Complex Videos in the Wild

Apr 15, 2025
Via arXiv

An Empirical Study of GPT-4o Image Generation Capabilities

Apr 08, 2025
Via arXiv

4th PVUW MeViS 3rd Place Report: Sa2VA

Apr 01, 2025
Via arXiv

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Jan 07, 2025
Via arXiv

LLAVADI: What Matters For Multimodal Large Language Models Distillation

Jul 28, 2024
Via arXiv

Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model

Jun 27, 2024
Via arXiv

OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding

Jun 27, 2024
Via arXiv

Point Cloud Mamba: Point Cloud Learning via State Space Model

Mar 01, 2024
Via arXiv

RAP-SAM: Towards Real-Time All-Purpose Segment Anything

Jan 18, 2024
Via arXiv

OMG-Seg: Is One Model Good Enough For All Segmentation?

Jan 18, 2024
Via arXiv