Xu Cao

SocialGesture: Delving into Multi-person Gesture Understanding

Apr 03, 2025

STAMICS: Splat, Track And Map with Integrated Consistency and Semantics for Dense RGB-D SLAM

Mar 27, 2025

A Survey of Embodied AI in Healthcare: Techniques, Applications, and Opportunities

Jan 13, 2025

Medical Video Generation for Disease Progression Simulation

Nov 18, 2024

AnyECG: Foundational Models for Electrocardiogram Analysis

Nov 17, 2024

On-Board Vision-Language Models for Personalized Autonomous Vehicle Motion Control: System Design and Real-World Validation

Nov 17, 2024

MpoxVLM: A Vision-Language Model for Diagnosing Skin Lesions from Mpox Virus Infection

Nov 16, 2024

TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets

Jun 30, 2024

MM-SpuBench: Towards Better Understanding of Spurious Biases in Multimodal LLMs

Jun 24, 2024

What is the Visual Cognition Gap between Humans and Multimodal LLMs?

Jun 14, 2024