Wei-Shi Zheng

Hierarchical Vision-Language Learning for Medical Out-of-Distribution Detection

Aug 25, 2025

CoopDiff: Anticipating 3D Human-object Interactions via Contact-consistent Decoupled Diffusion

Aug 10, 2025

TypeTele: Releasing Dexterity in Teleoperation by Dexterous Manipulation Types

Jul 02, 2025

Chain of Methodologies: Scaling Test Time Computation without Training

Jun 08, 2025

Reinforcing Video Reasoning with Focused Thinking

May 30, 2025

Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation

May 19, 2025

ActionArt: Advancing Multimodal Large Models for Fine-Grained Human-Centric Video Understanding

Apr 25, 2025

PVUW 2025 Challenge Report: Advances in Pixel-level Understanding of Complex Videos in the Wild

Apr 15, 2025

Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks

Apr 02, 2025

Decoupled Distillation to Erase: A General Unlearning Method for Any Class-centric Tasks

Mar 31, 2025