Paria Rashidinejad

Sail into the Headwind: Alignment via Robust Rewards and Dynamic Labels against Reward Hacking

Dec 12, 2024

Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning

Jan 30, 2023

Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian

Nov 01, 2022

MADE: Exploration via Maximizing Deviation from Explored Regions

Jun 18, 2021

Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism

Mar 22, 2021

SLIP: Learning to Predict in Unknown Dynamical Systems with Long-Term Memory

Oct 12, 2020