Title: FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture

Jun 16, 2024

Wenyan Li, Xinyu Zhang, Jiaang Li, Qiwei Peng, Raphael Tang, Li Zhou, Weijia Zhang, Guimin Hu, Yifei Yuan, Anders Søgaard (+2 more)

Abstract: Food is a rich and varied dimension of cultural heritage, crucial to both individuals and social groups. To bridge the gap in the literature on the often-overlooked regional diversity in this domain, we introduce FoodieQA, a manually curated, fine-grained image-text dataset capturing the intricate features of food cultures across various regions in China. We evaluate vision-language models (VLMs) and large language models (LLMs) on newly collected, unseen food images and corresponding questions. FoodieQA comprises three multiple-choice question-answering tasks in which models must answer questions based on multiple images, a single image, and text-only descriptions, respectively. While LLMs excel at text-based question answering, surpassing human accuracy, open-source VLMs still fall short by 41% on multi-image and 21% on single-image VQA tasks, although closed-weights models perform closer to human levels (within 10%). Our findings highlight that understanding food and its cultural implications remains a challenging and under-explored direction.