Marthe Ballon

Probing the Trajectories of Reasoning Traces in Large Language Models

Jan 30, 2026, via arXiv

Benchmarks Saturate When The Model Gets Smarter Than The Judge

Jan 27, 2026, via arXiv

Estimating problem difficulty without ground truth using Large Language Model comparisons

Dec 16, 2025, via arXiv

The Relationship Between Reasoning and Performance in Large Language Models -- o3 (mini) Thinks Harder, Not Longer

Feb 21, 2025, via arXiv