Sergey Tulyakov

AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation

Nov 07, 2024

DELTA: Dense Efficient Long-range 3D Tracking for any video

Oct 31, 2024

Scalable Ranked Preference Optimization for Text-to-Image Generation

Oct 23, 2024

ControlMM: Controllable Masked Motion Generation

Oct 14, 2024

Pixel-Aligned Multi-View Generation with Depth Guided Decoder

Aug 26, 2024

VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control

Jul 17, 2024

Efficient Training with Denoised Neural Weights

Jul 16, 2024

VIMI: Grounding Video Generation through Multi-modal Instruction

Jul 08, 2024

Lightweight Predictive 3D Gaussian Splats

Jun 27, 2024

Taming Data and Transformers for Audio Generation

Jun 27, 2024