Alice Oh

KAIST

On the Effect of Uncertainty on Layer-wise Inference Dynamics

Jul 09, 2025
Via arXiv

Spotting Out-of-Character Behavior: Atomic-Level Evaluation of Persona Fidelity in Open-Ended Generation

Jun 24, 2025
Via arXiv

Flex-TravelPlanner: A Benchmark for Flexible Planning with Language Agents

Jun 05, 2025
Via arXiv

BLUCK: A Benchmark Dataset for Bengali Linguistic Understanding and Cultural Knowledge

May 27, 2025
Via arXiv

Trans-EnV: A Framework for Evaluating the Linguistic Robustness of LLMs Against English Varieties

May 27, 2025
Via arXiv

MUG-Eval: A Proxy Evaluation Framework for Multilingual Generation Capabilities in Any Language

May 20, 2025
Via arXiv

Team ACK at SemEval-2025 Task 2: Beyond Word-for-Word Machine Translation for English-Korean Pairs

Apr 29, 2025
Via arXiv

Exploring Persona-dependent LLM Alignment for the Moral Machine Experiment

Apr 15, 2025
Via arXiv

When Tom Eats Kimchi: Evaluating Cultural Bias of Multimodal Large Language Models in Cultural Mixture Contexts

Mar 21, 2025
Via arXiv

Social Bias Benchmark for Generation: A Comparison of Generation and QA-Based Evaluations

Mar 10, 2025
Via arXiv