Yingya Zhang

FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion

Dec 12, 2024

Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

Nov 28, 2024

PersonalVideo: High ID-Fidelity Video Customization without Dynamic and Semantic Degradation

Nov 26, 2024

EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation

Nov 13, 2024

DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control

Oct 17, 2024

EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models

Oct 10, 2024

FreeMask: Rethinking the Importance of Attention Masks for Zero-Shot Video Editing

Sep 30, 2024

S^3D-NeRF: Single-Shot Speech-Driven Neural Radiance Field for High Fidelity Talking Head Synthesis

Aug 18, 2024

UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation

Jun 03, 2024

A Recipe for Scaling up Text-to-Video Generation with Text-free Videos

Dec 25, 2023