Hao Chen

Standards and Mobility Innovation Laboratory - Samsung Research America

DeskVision: Large Scale Desktop Region Captioning for Advanced GUI Agents

Mar 14, 2025
Via arXiv

Prototype-Guided Cross-Modal Knowledge Enhancement for Adaptive Survival Prediction

Mar 13, 2025
Via arXiv

HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model

Mar 13, 2025
Via arXiv

SciVerse: Unveiling the Knowledge Comprehension and Visual Reasoning of LMMs on Multi-modal Scientific Problems

Mar 13, 2025
Via arXiv

Pre-trained Models Succeed in Medical Imaging with Representation Similarity Degradation

Mar 11, 2025
Via arXiv

SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories

Mar 11, 2025
Via arXiv

ACE: Concept Editing in Diffusion Models without Performance Degradation

Mar 11, 2025
Via arXiv

Robust Latent Matters: Boosting Image Generation with Sampling Error

Mar 11, 2025
Via arXiv

Towards Large-scale Chemical Reaction Image Parsing via a Multimodal Large Language Model

Mar 11, 2025
Via arXiv

Weakly Supervised Convolutional Dictionary Learning with Shared and Discriminative Components for Classification

Mar 11, 2025
Via arXiv