Jiarong Wu

Isolating Language-Coding from Problem-Solving: Benchmarking LLMs with PseudoEval

Feb 26, 2025
Via arXiv

Can AI Beat Undergraduates in Entry-level Java Assignments? Benchmarking Large Language Models on JavaBench

Jun 10, 2024
Via arXiv