Marek Petrik

Q-learning for Quantile MDPs: A Decomposition, Performance, and Convergence Analysis

Oct 31, 2024
Via arXiv

Stationary Policies are Optimal in Risk-averse Total-reward MDPs with EVaR

Aug 30, 2024
Via arXiv

Solving Multi-Model MDPs by Coordinate Ascent and Dynamic Programming

Jul 08, 2024
Via arXiv

Percentile Criterion Optimization in Offline Reinforcement Learning

Apr 07, 2024
Via arXiv

Data Poisoning Attacks on Off-Policy Policy Evaluation Methods

Apr 06, 2024
Via arXiv

A Convex Relaxation Approach to Bayesian Regret Minimization in Offline Bandits

Jun 02, 2023
Via arXiv

On Dynamic Program Decompositions of Static Risk Measures

Apr 24, 2023
Via arXiv

Reducing Blackwell and Average Optimality to Discounted MDPs via the Blackwell Discount Factor

Jan 31, 2023
Via arXiv

On the Convergence of Policy Gradient in Robust MDPs

Dec 20, 2022
Via arXiv

On the convex formulations of robust Markov decision processes

Sep 21, 2022
Via arXiv