Shudong Liu

CultureVLM: Characterizing and Improving Cultural Understanding of Vision-Language Models for over 100 Countries

Jan 02, 2025
Via arXiv

Learning from "Silly" Questions Improves Large Language Models, But Only Slightly

Nov 21, 2024
Via arXiv

Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist

Jul 11, 2024
Via arXiv

TempoSum: Evaluating the Temporal Generalization of Abstractive Summarization

May 03, 2023
Via arXiv