Ishaan Watts

PARIKSHA: A Large-Scale Investigation of Human-LLM Evaluator Agreement on Multilingual and Multi-Cultural Data

Jun 21, 2024
Via arXiv

RTP-LX: Can LLMs Evaluate Toxicity in Multilingual Scenarios?

Apr 22, 2024
Via arXiv

MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks

Nov 13, 2023
Via arXiv