Xingjian He

MiniVLN: Efficient Vision-and-Language Navigation by Progressive Knowledge Distillation

Sep 27, 2024

LSVOS Challenge Report: Large-scale Complex and Long Video Object Segmentation

Sep 09, 2024

The Instance-centric Transformer for the RVOS Track of LSVOS Challenge: 3rd Place Solution

Aug 20, 2024

PVUW 2024 Challenge on Complex Video Understanding: Methods and Results

Jun 24, 2024

2nd Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation

Jun 20, 2024

Fuse & Calibrate: A bi-directional Vision-Language Guided Framework for Referring Image Segmentation

May 18, 2024

Calibration & Reconstruction: Deep Integrated Language for Referring Image Segmentation

Apr 12, 2024

SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models

Mar 20, 2024

Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with Human Intentions

Feb 17, 2024

Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation

Dec 13, 2023