Jonathan Huang

Learning Complex Non-Rigid Image Edits from Multimodal Conditioning

Dec 13, 2024
Via arXiv

Principles of Visual Tokens for Efficient Video Understanding

Nov 20, 2024
Via arXiv

Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think

Oct 09, 2024
Via arXiv

Tree-D Fusion: Simulation-Ready Tree Dataset from Single Images with Diffusion Priors

Jul 14, 2024
Via arXiv

Learning Hierarchical Semantic Classification by Grounding on Consistent Image Segmentations

Jun 17, 2024
Via arXiv

VideoPoet: A Large Language Model for Zero-Shot Video Generation

Dec 21, 2023
Via arXiv

Text and Click inputs for unambiguous open vocabulary instance segmentation

Nov 24, 2023
Via arXiv

Optimizing ViViT Training: Time and Memory Reduction for Action Recognition

Jun 07, 2023
Via arXiv

DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model

Jun 02, 2023
Via arXiv

Learning to Detect Novel and Fine-Grained Acoustic Sequences Using Pretrained Audio Representations

May 03, 2023
Via arXiv