Kristina Toutanova

BgGPT 1.0: Extending English-centric LLMs to other languages

Dec 14, 2024
Via arXiv

Understanding the World's Museums through Vision-Language Reasoning

Dec 02, 2024
Via arXiv

ALTA: Compiler-Based Analysis of Transformers

Oct 23, 2024
Via arXiv

Taming CLIP for Fine-grained and Structured Visual Understanding of Museum Exhibits

Sep 03, 2024
Via arXiv

Efficient End-to-End Visual Document Understanding with Rationale Distillation

Nov 16, 2023
Via arXiv

From Pixels to UI Actions: Learning to Follow Instructions via Graphical User Interfaces

May 31, 2023
Via arXiv

Anchor Prediction: Automatic Refinement of Internet Links

May 24, 2023
Via arXiv

QUEST: A Retrieval Dataset of Entity-Seeking Queries with Implicit Set Operations

May 19, 2023
Via arXiv

Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities

Feb 24, 2023
Via arXiv

Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding

Oct 07, 2022
Via arXiv