Nishant Balepur

Which of These Best Describes Multiple Choice Evaluation with LLMs? A) Forced B) Flawed C) Fixable D) All of the Above

Feb 19, 2025
Via arXiv

Whose Boat Does it Float? Improving Personalization in Preference Tuning via Inferred User Personas

Jan 20, 2025
Via arXiv

Reverse Question Answering: Can an LLM Write a Question so Hard (or Bad) that it Can't Answer?

Oct 20, 2024
Via arXiv

Plausibly Problematic Questions in Multiple-Choice Benchmarks for Commonsense Reasoning

Oct 06, 2024
Via arXiv

Is Your Large Language Model Knowledgeable or a Choices-Only Cheater?

Jul 02, 2024
Via arXiv

A SMART Mnemonic Sounds like "Glue Tonic": Mixing LLMs with Student Feedback to Make Mnemonic Learning Stick

Jun 21, 2024
Via arXiv

The Prompt Report: A Systematic Survey of Prompting Techniques

Jun 06, 2024
Via arXiv

KARL: Knowledge-Aware Retrieval and Representations aid Retention and Learning in Students

Feb 19, 2024
Via arXiv

Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question?

Feb 19, 2024
Via arXiv

It's Not Easy Being Wrong: Evaluating Process of Elimination Reasoning in Large Language Models

Nov 13, 2023
Via arXiv