Leshem Choshen

Robustness as an Emergent Property of Task Performance

Feb 03, 2026
Via arXiv

CommonLID: Re-evaluating State-of-the-Art Language Identification Performance on Web Data

Jan 25, 2026
Via arXiv

ErrorMap and ErrorAtlas: Charting the Failure Landscape of Large Language Models

Jan 22, 2026
Via arXiv

Will it Merge? On The Causes of Model Mergeability

Jan 10, 2026
Via arXiv

Beyond Binary Rewards: Training LMs to Reason About Their Uncertainty

Jul 22, 2025
Via arXiv

LLM Hypnosis: Exploiting User Feedback for Unauthorized Knowledge Injection to All Users

Jul 03, 2025
Via arXiv

Can Gradient Descent Simulate Prompting?

Jun 26, 2025
Via arXiv

TextArena

Apr 15, 2025
Via arXiv

Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora

Apr 10, 2025
Via arXiv

Pretraining Language Models for Diachronic Linguistic Change Discovery

Apr 09, 2025
Via arXiv