David Junhao Zhang

ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

Nov 07, 2024

MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures

Oct 17, 2024

Show-o: One Single Transformer to Unify Multimodal Understanding and Generation

Aug 22, 2024

DragAnything: Motion Control for Anything using Entity Representation

Mar 15, 2024

Towards A Better Metric for Text-to-Video Generation

Jan 15, 2024

Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions

Jan 03, 2024

VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence

Dec 05, 2023

MotionDirector: Motion Customization of Text-to-Video Diffusion Models

Oct 12, 2023

Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation

Sep 27, 2023

Dataset Condensation via Generative Model

Sep 14, 2023