Title: Kvasir-VQA: A Text-Image Pair GI Tract Dataset

Sep 02, 2024

Sushant Gautam, Andrea Storås, Cise Midoglu, Steven A. Hicks, Vajira Thambawita, Pål Halvorsen, Michael A. Riegler

Abstract: We introduce Kvasir-VQA, an extended dataset derived from the HyperKvasir and Kvasir-Instrument datasets, augmented with question-and-answer annotations to facilitate advanced machine learning tasks in Gastrointestinal (GI) diagnostics. This dataset comprises 6,500 annotated images spanning various GI tract conditions and surgical instruments, and it supports multiple question types including yes/no, choice, location, and numerical count. The dataset is intended for applications such as image captioning, Visual Question Answering (VQA), text-based generation of synthetic medical images, object detection, and classification. Our experiments demonstrate the dataset's effectiveness in training models for three selected tasks, showcasing significant applications in medical image analysis and diagnostics. We also present evaluation metrics for each task, highlighting the usability and versatility of our dataset. The dataset and supporting artifacts are available at https://datasets.simula.no/kvasir-vqa.

* to be published in VLM4Bio 2024, part of the ACM Multimedia (ACM MM) conference 2024
