Xi Yin

Movie Gen: A Cast of Media Foundation Models

Oct 17, 2024
Via arXiv

Proactive Schemes: A Survey of Adversarial Attacks for Social Good

Sep 24, 2024
Via arXiv

AcademicGPT: Empowering Academic Research

Nov 21, 2023
Via arXiv

Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning

Nov 17, 2023
Via arXiv

Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems

Jun 26, 2023
Via arXiv

Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation

Apr 18, 2023
Via arXiv

MaLP: Manipulation Localization Using a Proactive Scheme

Apr 04, 2023
Via arXiv

SpaText: Spatio-Textual Representation for Controllable Image Generation

Nov 25, 2022
Via arXiv

Make-A-Video: Text-to-Video Generation without Text-Video Data

Sep 29, 2022
Via arXiv

MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration

Apr 28, 2022
Via arXiv