Zeyu Cui

CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Jan 03, 2025
Via arXiv

Qwen2.5 Technical Report

Dec 19, 2024
Via arXiv

ExecRepoBench: Multi-level Executable Code Completion Evaluation

Dec 16, 2024
Via arXiv

Evaluating and Aligning CodeLLMs on Human Preference

Dec 06, 2024
Via arXiv

Enhancing LLMs for Power System Simulations: A Feedback-driven Multi-agent Framework

Nov 21, 2024
Via arXiv

Qwen2.5-Coder Technical Report

Sep 18, 2024
Via arXiv

Towards a Unified View of Preference Learning for Large Language Models: A Survey

Sep 04, 2024
Via arXiv

Qwen2 Technical Report

Jul 16, 2024
Via arXiv

Enabling Large Language Models to Perform Power System Simulations with Previously Unseen Tools: A Case of Daline

Jun 26, 2024
Via arXiv

Qwen Technical Report

Sep 28, 2023
Via arXiv