Wenyan Li

FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture

Jun 16, 2024

Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation

Jun 12, 2024

Understanding Retrieval Robustness for Retrieval-Augmented Image Captioning

Jun 04, 2024

Exploring Visual Culture Awareness in GPT-4V: A Comprehensive Probing

Feb 15, 2024

Data Curation for Image Captioning with Text-to-Image Generative Models

May 05, 2023