Ishan Misra

Jack

Generating Multi-Image Synthetic Data for Text-to-Image Customization

Feb 03, 2025

Diffusion Autoencoders are Scalable Image Tokenizers

Jan 30, 2025

LLMs can see and hear without any training

Jan 30, 2025

CAT: Content-Adaptive Image Tokenization

Jan 06, 2025

Movie Gen: A Cast of Media Foundation Models

Oct 17, 2024

The Llama 3 Herd of Models

Jul 31, 2024

InstanceDiffusion: Instance-level Control for Image Generation

Feb 05, 2024

FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis

Dec 29, 2023

Generating Illustrated Instructions

Dec 07, 2023

On Bringing Robots Home

Nov 27, 2023