Lazaros Gallos

Reasoning or Simply Next Token Prediction? A Benchmark for Stress-Testing Large Language Models

Jun 15, 2024
Via arXiv