Wei-Shi Zheng

ObjEmbed: Towards Universal Multimodal Object Embeddings

Feb 03, 2026
Via arXiv

Refer-Agent: A Collaborative Multi-Agent System with Reasoning and Reflection for Referring Video Object Segmentation

Feb 03, 2026
Via arXiv

ReViP: Reducing False Completion in Vision-Language-Action Models with Vision-Proprioception Rebalance

Jan 23, 2026
Via arXiv

DCAC: Dynamic Class-Aware Cache Creates Stronger Out-of-Distribution Detectors

Jan 18, 2026
Via arXiv

Learning Whole-Body Human-Humanoid Interaction from Human-Human Demonstrations

Jan 14, 2026
Via arXiv

ProEdit: Inversion-based Editing From Prompts Done Right

Dec 26, 2025
Via arXiv

WeDetect: Fast Open-Vocabulary Object Detection as Retrieval

Dec 13, 2025
Via arXiv

IRG-MotionLLM: Interleaving Motion Generation, Assessment and Refinement for Text-to-Motion Generation

Dec 11, 2025
Via arXiv

ZeroDexGrasp: Zero-Shot Task-Oriented Dexterous Grasp Synthesis with Prompt-Based Multi-Stage Semantic Reasoning

Nov 17, 2025
Via arXiv

OmniDexGrasp: Generalizable Dexterous Grasping via Foundation Model and Force Feedback

Oct 27, 2025
Via arXiv