Ye Wang

Perry

EasyMimic: A Low-Cost Framework for Robot Imitation Learning from Human Videos

Feb 12, 2026

Rethinking Visual-Language-Action Model Scaling: Alignment, Mixture, and Regularization

Feb 10, 2026

Understanding Dynamic Compute Allocation in Recurrent Transformers

Feb 09, 2026

Composable Visual Tokenizers with Generator-Free Diagnostics of Learnability

Feb 03, 2026

Unlocking Large Audio-Language Models for Interactive Language Learning

Jan 21, 2026

Comparative Study of Large Language Models on Chinese Film Script Continuation: An Empirical Analysis Based on GPT-5.2 and Qwen-Max

Jan 21, 2026

Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization

Jan 19, 2026

Role-Playing Agents Driven by Large Language Models: Current Status, Challenges, and Future Trends

Jan 15, 2026

Spatial-Aware VLA Pretraining through Visual-Physical Alignment from Human Videos

Dec 15, 2025

FreeControl: Efficient, Training-Free Structural Control via One-Step Attention Extraction

Nov 07, 2025