Ming Zhang

Muse: Towards Reproducible Long-Form Song Generation with Fine-Grained Style Control

Jan 08, 2026
Via arXiv

OxygenREC: An Instruction-Following Generative Framework for E-commerce Recommendation

Dec 31, 2025
Via arXiv

VSA: Visual-Structural Alignment for UI-to-Code

Dec 23, 2025
Via arXiv

Modular Layout Synthesis (MLS): Front-end Code via Structure Normalization and Constrained Generation

Dec 22, 2025
Via arXiv

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Nov 06, 2025
Via arXiv

A Survey on Efficient Large Language Model Training: From Data-centric Perspectives

Oct 29, 2025
Via arXiv

Automated Genomic Interpretation via Concept Bottleneck Models for Medical Robotics

Oct 02, 2025
Via arXiv

From Scores to Preferences: Redefining MOS Benchmarking for Speech Quality Reward Modeling

Oct 01, 2025
Via arXiv

MDAR: A Multi-scene Dynamic Audio Reasoning Benchmark

Sep 26, 2025
Via arXiv

Type-Compliant Adaptation Cascades: Adapting Programmatic LM Workflows to Data

Aug 25, 2025
Via arXiv