Graham Neubig

Carnegie Mellon University

Do LLMs Understand Your Translations? Evaluating Paragraph-level MT with Question Answering

Apr 10, 2025, via arXiv

Inducing Programmatic Skills for Agentic Tasks

Apr 09, 2025, via arXiv

SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills

Apr 09, 2025, via arXiv

M-Prometheus: A Suite of Open Multilingual LLM Judges

Apr 07, 2025, via arXiv

Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators

Mar 25, 2025, via arXiv

Overtrained Language Models Are Harder to Fine-Tune

Mar 24, 2025, via arXiv

Benchmarking Failures in Tool-Augmented Language Models

Mar 18, 2025, via arXiv

Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention

Mar 11, 2025, via arXiv

Not-Just-Scaling Laws: Towards a Better Understanding of the Downstream Impact of Language Model Design Decisions

Mar 05, 2025, via arXiv

ESPnet-SpeechLM: An Open Speech Language Model Toolkit

Feb 21, 2025, via arXiv