Aditya Gopalan

Towards Reliable Alignment: Uncertainty-aware RLHF

Oct 31, 2024
Via arXiv

Testing the Feasibility of Linear Programs with Bandit Feedback

Jun 21, 2024
Via arXiv

When are Bandits Robust to Misspecification?

Oct 13, 2023
Via arXiv

A Unified Framework for Discovering Discrete Symmetries

Sep 06, 2023
Via arXiv

On the Minimax Regret for Linear Bandits in a wide variety of Action Spaces

Jan 09, 2023
Via arXiv

Exploration in Linear Bandits with Rich Action Sets and its Implications for Inference

Jul 23, 2022
Via arXiv

Actor-Critic based Improper Reinforcement Learning

Jul 19, 2022
Via arXiv

Approximate Q-learning and SARSA under the $ε$-greedy Policy: a Differential Inclusion Analysis

May 26, 2022
Via arXiv

Adaptive Estimation of Random Vectors with Bandit Feedback

Apr 01, 2022
Via arXiv

Bregman Deviations of Generic Exponential Families

Jan 18, 2022
Via arXiv