Binyuan Hui

CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Jan 03, 2025

Qwen2.5 Technical Report

Dec 19, 2024

ExecRepoBench: Multi-level Executable Code Completion Evaluation

Dec 16, 2024

Evaluating and Aligning CodeLLMs on Human Preference

Dec 06, 2024

Qwen2.5-Coder Technical Report

Sep 18, 2024

Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement

Sep 18, 2024

OLMoE: Open Mixture-of-Experts Language Models

Sep 03, 2024

Synthesizing Text-to-SQL Data from Weak and Strong LLMs

Aug 06, 2024

OpenDevin: An Open Platform for AI Software Developers as Generalist Agents

Jul 23, 2024

Qwen2 Technical Report

Jul 16, 2024