Zhuoer Feng

AlignBench: Benchmarking Chinese Alignment of Large Language Models

Dec 05, 2023
Via arXiv

CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation

Nov 30, 2023
Via arXiv

LOT: A Benchmark for Evaluating Chinese Long Text Understanding and Generation

Aug 30, 2021
Via arXiv

OpenMEVA: A Benchmark for Evaluating Open-ended Story Generation Metrics

May 19, 2021
Via arXiv