
Zihao Wang

Michael Pokorny

JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse

Mar 20, 2025
Via arXiv

YuE: Scaling Open Foundation Models for Long-Form Music Generation

Mar 11, 2025
Via arXiv

LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization

Mar 11, 2025
Via arXiv

UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering

Feb 26, 2025
Via arXiv

Does Editing Provide Evidence for Localization?

Feb 19, 2025
Via arXiv

A Comprehensive Survey on Generative AI for Video-to-Music Generation

Feb 18, 2025
Via arXiv

Unlocking Scaling Law in Industrial Recommendation Systems with a Three-step Paradigm based Large User Model

Feb 12, 2025
Via arXiv

LIR-LIVO: A Lightweight, Robust LiDAR/Vision/Inertial Odometry with Illumination-Resilient Deep Features

Feb 12, 2025
Via arXiv

Encrypted Large Model Inference: The Equivariant Encryption Paradigm

Feb 03, 2025
Via arXiv

Top Ten Challenges Towards Agentic Neural Graph Databases

Jan 24, 2025
Via arXiv