Wenting Zhao

Scaling Agentic Verifier for Competitive Coding

Feb 04, 2026
Via arXiv

SWE-Universe: Scale Real-World Verifiable Environments to Millions

Feb 02, 2026
Via arXiv

SweRank+: Multilingual, Multi-Turn Code Ranking for Software Issue Localization

Dec 23, 2025
Via arXiv

Self-Abstraction from Grounded Experience for Plan-Guided Policy Refinement

Nov 08, 2025
Via arXiv

StepWiser: Stepwise Generative Judges for Wiser Reasoning

Aug 27, 2025
Via arXiv

Towards LLM Agents for Earth Observation

Apr 16, 2025
Via arXiv

Multi-Turn Code Generation Through Single-Step Rewards

Feb 27, 2025
Via arXiv

ProjectTest: A Project-level LLM Unit Test Generation Benchmark and Impact of Error Fixing Mechanisms

Feb 11, 2025
Via arXiv

ProjectTest: A Project-level Unit Test Generation Benchmark and Impact of Error Fixing Mechanisms

Feb 10, 2025
Via arXiv

Commit0: Library Generation from Scratch

Dec 02, 2024
Via arXiv