Xuanxin Wu

Reasoning Model Is Superior LLM-Judge, Yet Suffers from Biases

Jan 07, 2026
Via arXiv

An In-depth Evaluation of GPT-4 in Sentence Simplification with Error-based Human Assessment

Mar 08, 2024
Via arXiv