Alexander Ku

Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem

Oct 31, 2024
Via arXiv

DOCCI: Descriptions of Connected and Contrasting Images

Apr 30, 2024
Via arXiv

Prompt Expansion for Adaptive Text-to-Image Generation

Dec 27, 2023
Via arXiv

Gaussian Process Probes (GPP) for Uncertainty-Aware Probing

May 29, 2023
Via arXiv

A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning

Oct 06, 2022
Via arXiv

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

Jun 22, 2022
Via arXiv

Vector-quantized Image Modeling with Improved VQGAN

Oct 09, 2021
Via arXiv

PanGEA: The Panoramic Graph Environment Annotation Toolkit

Mar 23, 2021
Via arXiv

On the Evaluation of Vision-and-Language Navigation Instructions

Jan 26, 2021
Via arXiv

Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding

Oct 15, 2020
Via arXiv