Xulei Yang

Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs

Oct 02, 2025 (via arXiv)

Integrating SAM Supervision for 3D Weakly Supervised Point Cloud Segmentation

Aug 27, 2025 (via arXiv)

FOCUS: Frequency-Optimized Conditioning of DiffUSion Models for mitigating catastrophic forgetting during Test-Time Adaptation

Aug 20, 2025 (via arXiv)

AD-FM: Multimodal LLMs for Anomaly Detection via Multi-Stage Reasoning and Fine-Grained Reward Optimization

Aug 06, 2025 (via arXiv)

Exploring Active Learning for Label-Efficient Training of Semantic Neural Radiance Field

Jul 23, 2025 (via arXiv)

Zero-Shot 3D Visual Grounding from Vision-Language Models

May 28, 2025 (via arXiv)

OccLE: Label-Efficient 3D Semantic Occupancy Prediction

May 27, 2025 (via arXiv)

How Do Images Align and Complement LiDAR? Towards a Harmonized Multi-modal 3D Panoptic Segmentation

May 25, 2025 (via arXiv)

Robust Distribution Alignment for Industrial Anomaly Detection under Distribution Shift

Mar 19, 2025 (via arXiv)

Multi-View Industrial Anomaly Detection with Epipolar Constrained Cross-View Fusion

Mar 14, 2025 (via arXiv)