Di Zhang

Improving Video Generation with Human Feedback

Jan 23, 2025

GameFactory: Creating New Games with Generative Interactive Videos

Jan 14, 2025

ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning

Jan 08, 2025

Biology Instructions: A Dataset and Benchmark for Multi-Omics Sequence Understanding Capability of Large Language Models

Dec 26, 2024

Owl-1: Omni World Model for Consistent Long Video Generation

Dec 12, 2024

Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models

Dec 10, 2024

3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation

Dec 10, 2024

StyleMaster: Stylize Your Video with Artistic Generation and Translation

Dec 10, 2024

SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints

Dec 10, 2024

Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning

Dec 02, 2024