Yingqing He

ModelGrow: Continual Text-to-Video Pre-training with Model Expansion and Language Understanding Enhancement

Dec 25, 2024
Via arXiv

Large Motion Video Autoencoding with Cross-modal Video VAE

Dec 23, 2024
Via arXiv

VideoDPO: Omni-Preference Alignment for Video Diffusion Generation

Dec 18, 2024
Via arXiv

HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts

Sep 04, 2024
Via arXiv

MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions

Jul 30, 2024
Via arXiv

FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models

Jun 24, 2024
Via arXiv

Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation

Jun 04, 2024
Via arXiv

LLMs Meet Multimodal Generation and Editing: A Survey

May 29, 2024
Via arXiv

Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts

Mar 13, 2024
Via arXiv

Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners

Feb 27, 2024
Via arXiv