
Jakob Uszkoreit

Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations

Nov 29, 2021

How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers

Jun 18, 2021

MLP-Mixer: An all-MLP Architecture for Vision

May 17, 2021

Differentiable Patch Selection for Image Recognition

Apr 07, 2021

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Oct 22, 2020

Towards End-to-End In-Image Neural Machine Translation

Oct 20, 2020

Object-Centric Learning with Slot Attention

Jun 26, 2020

An Empirical Study of Generation Order for Machine Translation

Oct 29, 2019

Scaling Autoregressive Video Models

Jun 06, 2019

KERMIT: Generative Insertion-Based Modeling for Sequences

Jun 04, 2019