Kishan Panaganti

Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation

Sep 18, 2025
Via arXiv

TARDIS STRIDE: A Spatio-Temporal Road Image Dataset for Exploration and Autonomy

Jun 12, 2025
Via arXiv

Efficient Policy Optimization in Robust Constrained MDPs with Iteration Complexity Guarantees

May 25, 2025
Via arXiv

KL-regularization Itself is Differentially Private in Bandits and RLHF

May 23, 2025
Via arXiv

Distributionally Robust Direct Preference Optimization

Feb 04, 2025
Via arXiv

Hybrid Transfer Reinforcement Learning: Provable Sample Efficiency from Shifted-Dynamics Data

Nov 06, 2024
Via arXiv

Distributionally Robust Constrained Reinforcement Learning under Strong Duality

Jun 22, 2024
Via arXiv

Tractable Equilibrium Computation in Markov Games through Risk Aversion

Jun 20, 2024
Via arXiv

Model-Free Robust $φ$-Divergence Reinforcement Learning Using Both Offline and Online Data

May 08, 2024
Via arXiv

Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage

Oct 27, 2023
Via arXiv