Rahul Jain

Distributionally Robust Direct Preference Optimization

Feb 04, 2025
Via arXiv

Best Policy Learning from Trajectory Preference Feedback

Jan 31, 2025
Via arXiv

Markov Balance Satisfaction Improves Performance in Strictly Batch Offline Imitation Learning

Aug 17, 2024
Via arXiv

Online Bandit Learning with Offline Preference Data

Jun 13, 2024
Via arXiv

e-COP : Episodic Constrained Optimization of Policies

Jun 13, 2024
Via arXiv

Pure Exploration for Constrained Best Mixed Arm Identification with a Fixed Budget

May 23, 2024
Via arXiv

Efficient Online Learning with Offline Datasets for Infinite Horizon MDPs: A Bayesian Approach

Oct 17, 2023
Via arXiv

Regret Analysis of the Posterior Sampling-based Learning Algorithm for Episodic POMDPs

Oct 16, 2023
Via arXiv

Conditional Kernel Imitation Learning for Continuous State Environments

Aug 24, 2023
Via arXiv

Optimal Control of Logically Constrained Partially Observable and Multi-Agent Markov Decision Processes

May 24, 2023
Via arXiv