Which of These Best Describes Multiple Choice Evaluation with LLMs? A) Forced B) Flawed C) Fixable D) All of the Above

Add code
Feb 19, 2025

Share this with someone who'll enjoy it:

View paper onarxiv icon

Share this with someone who'll enjoy it: