Taichi Iki

InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions

Jan 24, 2024

Do BERTs Learn to Use Browser User Interface? Exploring Multi-Step Tasks with Unified Vision-and-Language BERTs

Mar 15, 2022

Effect of Vision-and-Language Extensions on Natural Language Understanding in Vision-and-Language Models

Apr 16, 2021