Shudong Liu

Learning from "Silly" Questions Improves Large Language Models, But Only Slightly

Nov 21, 2024
Via arXiv

Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist

Jul 11, 2024
Via arXiv

TempoSum: Evaluating the Temporal Generalization of Abstractive Summarization

May 03, 2023
Via arXiv