Yuzhuo Bai

GLTW: Joint Improved Graph Transformer and LLM via Three-Word Language for Knowledge Graph Completion

Feb 17, 2025, via arXiv

Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering

Feb 11, 2025, via arXiv

Value Compass Leaderboard: A Platform for Fundamental and Validated Evaluation of LLMs Values

Jan 13, 2025, via arXiv

OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems

Feb 21, 2024, via arXiv

C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models

May 17, 2023, via arXiv

Manual Evaluation Matters: Reviewing Test Protocols of Distantly Supervised Relation Extraction

May 20, 2021, via arXiv