Yuhao Chen

EvoPrune: Early-Stage Visual Token Pruning for Efficient MLLMs

Mar 04, 2026
Via arXiv

Implicit-Scale 3D Reconstruction for Multi-Food Volume Estimation from Monocular Images

Feb 13, 2026
Via arXiv

RADAR: Benchmarking Vision-Language-Action Generalization via Real-World Dynamics, Spatial-Physical Intelligence, and Autonomous Evaluation

Feb 11, 2026
Via arXiv

Training-Free Text-to-Image Compositional Food Generation via Prompt Grafting

Jan 25, 2026
Via arXiv

Bridging the Discrete-Continuous Gap: Unified Multimodal Generation via Coupled Manifold Discrete Absorbing Diffusion

Jan 07, 2026
Via arXiv

Stable Language Guidance for Vision-Language-Action Models

Jan 07, 2026
Via arXiv

Specific Multi-emitter Identification: Theoretical Limits and Low-complexity Design

Dec 22, 2025
Via arXiv

Avatar4D: Synthesizing Domain-Specific 4D Humans for Real-World Pose Estimation

Dec 18, 2025
Via arXiv

Composite Classifier-Free Guidance for Multi-Modal Conditioning in Wind Dynamics Super-Resolution

Dec 13, 2025
Via arXiv

Food Image Generation on Multi-Noun Categories

Dec 09, 2025
Via arXiv