Hang Xu

EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation

Mar 20, 2025
Via arXiv

ZO2: Scalable Zeroth-Order Fine-Tuning for Extremely Large Language Models with Limited GPU Memory

Mar 16, 2025
Via arXiv

Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k

Mar 12, 2025
Via arXiv

SemHiTok: A Unified Image Tokenizer via Semantic-Guided Hierarchical Codebook for Multimodal Understanding and Generation

Mar 09, 2025
Via arXiv

Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models?

Mar 08, 2025
Via arXiv

Towards Heisenberg Limit without Critical Slowing Down via Quantum Reinforcement Learning

Mar 04, 2025
Via arXiv

UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting

Feb 25, 2025
Via arXiv

TransMamba: Fast Universal Architecture Adaption from Transformers to Mamba

Feb 21, 2025
Via arXiv

VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation

Feb 12, 2025
Via arXiv

FreqPrior: Improving Video Diffusion Models with Frequency Filtering Gaussian Noise

Feb 05, 2025
Via arXiv