Akshay Krishnamurthy

Carnegie Mellon University

Learning to Reason with Curriculum I: Provable Benefits of Autocurriculum

Mar 18, 2026

Reject, Resample, Repeat: Understanding Parallel Reasoning in Language Model Inference

Mar 09, 2026

A Unifying View of Coverage in Linear Off-Policy Evaluation

Jan 26, 2026

Wait, Wait, Wait... Why Do Reasoning Models Loop?

Dec 15, 2025

The Role of Environment Access in Agnostic Reinforcement Learning

Apr 07, 2025

Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment

Mar 27, 2025

Computational-Statistical Tradeoffs at the Next-Token Prediction Barrier: Autoregressive and Imitation Learning under Misspecification

Feb 18, 2025

Self-Improvement in Language Models: The Sharpening Mechanism

Dec 02, 2024

Reinforcement Learning under Latent Dynamics: Toward Statistical and Algorithmic Modularity

Oct 23, 2024

Correcting the Mythos of KL-Regularization: Direct Alignment without Overparameterization via Chi-squared Preference Optimization

Jul 18, 2024