Eunsu Kim

When Tom Eats Kimchi: Evaluating Cultural Bias of Multimodal Large Language Models in Cultural Mixture Contexts

Mar 21, 2025
Via arXiv

Diffusion Models Through a Global Lens: Are They Culturally Inclusive?

Feb 13, 2025
Via arXiv

LLM-AS-AN-INTERVIEWER: Beyond Static Testing Through Dynamic LLM Evaluation

Dec 10, 2024
Via arXiv

Uncovering Factor Level Preferences to Improve Human-Model Alignment

Oct 09, 2024
Via arXiv

BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages

Jun 14, 2024
Via arXiv

CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean

Mar 15, 2024
Via arXiv

Multi-FAct: Assessing Multilingual LLMs' Multi-Regional Knowledge using FActScore

Mar 01, 2024
Via arXiv

The Generative AI Paradox on Evaluation: What It Can Solve, It May Not Evaluate

Feb 09, 2024
Via arXiv