Or Honovich

Keep Guessing? When Considering Inference Scaling, Mind the Baselines

Oct 20, 2024

A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains

Feb 02, 2024

Surfacing Biases in Large Language Models using Contrastive Input Decoding

May 12, 2023

Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor

Dec 19, 2022

DisentQA: Disentangling Parametric and Contextual Knowledge with Counterfactual Question Answering

Nov 10, 2022

LMentry: A Language Model Benchmark of Elementary Language Tasks

Nov 03, 2022

Instruction Induction: From Few Examples to Natural Language Task Descriptions

May 22, 2022

TRUE: Re-evaluating Factual Consistency Evaluation

Apr 11, 2022

$Q^{2}$: Evaluating Factual Consistency in Knowledge-Grounded Dialogues via Question Generation and Question Answering

Apr 16, 2021