Mannat Singh

Jack

Movie Gen: A Cast of Media Foundation Models

Oct 17, 2024
Via arXiv

The Llama 3 Herd of Models

Jul 31, 2024
Via arXiv

Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning

Nov 17, 2023
Via arXiv

ImageBind: One Embedding Space To Bind Them All

May 09, 2023
Via arXiv

The effectiveness of MAE pre-pretraining for billion-scale pretraining

Mar 23, 2023
Via arXiv

OmniMAE: Single Model Masked Pretraining on Images and Videos

Jun 16, 2022
Via arXiv

Omnivore: A Single Model for Many Visual Modalities

Jan 20, 2022
Via arXiv

Revisiting Weakly Supervised Pre-Training of Visual Perception Models

Jan 20, 2022
Via arXiv

Early Convolutions Help Transformers See Better

Jul 12, 2021
Via arXiv

MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding

Apr 26, 2021
Via arXiv