Willi Menapace

Multi-subject Open-set Personalization in Video Generation

Jan 10, 2025
Via arXiv

AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation

Dec 19, 2024
Via arXiv

Mind the Time: Temporally-Controlled Multi-Event Video Generation

Dec 06, 2024
Via arXiv

4Real-Video: Learning Generalizable Photo-Realistic 4D Video Diffusion

Dec 05, 2024
Via arXiv

AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers

Dec 02, 2024
Via arXiv

AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation

Nov 07, 2024
Via arXiv

VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control

Jul 17, 2024
Via arXiv

VIMI: Grounding Video Generation through Multi-modal Instruction

Jul 08, 2024
Via arXiv

Taming Data and Transformers for Audio Generation

Jun 27, 2024
Via arXiv

Hierarchical Patch Diffusion Models for High-Resolution Video Generation

Jun 12, 2024
Via arXiv