Akshay Krishnamurthy

Carnegie Mellon University

The Role of Environment Access in Agnostic Reinforcement Learning

Apr 07, 2025
Via arXiv

Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment

Mar 27, 2025
Via arXiv

Computational-Statistical Tradeoffs at the Next-Token Prediction Barrier: Autoregressive and Imitation Learning under Misspecification

Feb 18, 2025
Via arXiv

Self-Improvement in Language Models: The Sharpening Mechanism

Dec 02, 2024
Via arXiv

Reinforcement Learning under Latent Dynamics: Toward Statistical and Algorithmic Modularity

Oct 23, 2024
Via arXiv

Correcting the Mythos of KL-Regularization: Direct Alignment without Overparameterization via Chi-squared Preference Optimization

Jul 18, 2024
Via arXiv

Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics

Jun 17, 2024
Via arXiv

Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF

May 31, 2024
Via arXiv

Rich-Observation Reinforcement Learning with Continuous Latent Dynamics

May 29, 2024
Via arXiv

Can large language models explore in-context?

Mar 22, 2024
Via arXiv