Shiguang Shan

REVAL: A Comprehension Evaluation on Reliability and Values of Large Vision-Language Models

Mar 20, 2025
Via arXiv

HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding

Mar 17, 2025
Via arXiv

Decoupled Doubly Contrastive Learning for Cross Domain Facial Action Unit Detection

Mar 12, 2025
Via arXiv

Dynamically evolving segment anything model with continuous learning for medical image segmentation

Mar 08, 2025
Via arXiv

MATS: An Audio Language Model under Text-only Supervision

Feb 20, 2025
Via arXiv

G2PDiffusion: Genotype-to-Phenotype Prediction with Diffusion Models

Feb 07, 2025
Via arXiv

M$^3$oralBench: A MultiModal Moral Benchmark for LVLMs

Dec 30, 2024
Via arXiv

Multi-P$^2$A: A Multi-perspective Benchmark on Privacy Assessment for Large Vision-Language Models

Dec 27, 2024
Via arXiv

RefHCM: A Unified Model for Referring Perceptions in Human-Centric Scenarios

Dec 19, 2024
Via arXiv

Autoregressive Video Generation without Vector Quantization

Dec 18, 2024
Via arXiv