Shilin Xu

Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs

Jan 08, 2025

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Jan 07, 2025

RLRF4Rec: Reinforcement Learning from Recsys Feedback for Enhanced Recommendation Reranking

Oct 08, 2024

LLAVADI: What Matters For Multimodal Large Language Models Distillation

Jul 28, 2024

RAP-SAM: Towards Real-Time All-Purpose Segment Anything

Jan 18, 2024

An Open and Comprehensive Pipeline for Unified Object Grounding and Detection

Jan 05, 2024

DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection

Oct 02, 2023

Towards Open Vocabulary Learning: A Survey

Jul 06, 2023

PanopticPartFormer++: A Unified and Decoupled View for Panoptic Part Segmentation

Jan 03, 2023

Panoptic-PartFormer: Learning a Unified Model for Panoptic Part Segmentation

Apr 10, 2022