Yuda Xiong

Detect Anything via Next Point Prediction

Oct 14, 2025
Via arXiv

Referring to Any Person

Mar 11, 2025
Via arXiv

ChatRex: Taming Multimodal LLM for Joint Perception and Understanding

Dec 02, 2024
Via arXiv

DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding

Nov 21, 2024
Via arXiv

Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection

May 16, 2024
Via arXiv