Yiming Wang

On Large Multimodal Models as Open-World Image Classifiers

Mar 27, 2025
Via arXiv

Wan: Open and Advanced Large-Scale Video Generative Models

Mar 26, 2025
Via arXiv

Zero-Shot Human-Object Interaction Synthesis with Multimodal Priors

Mar 25, 2025
Via arXiv

Training-Free Personalization via Retrieval and Reasoning on Fingerprints

Mar 24, 2025
Via arXiv

SplatVoxel: History-Aware Novel View Streaming without Temporal Training

Mar 18, 2025
Via arXiv

Free-form language-based robotic reasoning and grasping

Mar 17, 2025
Via arXiv

AirExo-2: Scaling up Generalizable Robotic Imitation Learning with Low-Cost Exoskeletons

Mar 05, 2025
Via arXiv

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Mar 03, 2025
Via arXiv

Sampling-Efficient Test-Time Scaling: Self-Estimating the Best-of-N Sampling in Early Decoding

Mar 03, 2025
Via arXiv

MIND: Towards Immersive Psychological Healing with Multi-agent Inner Dialogue

Feb 27, 2025
Via arXiv