Title: Unified PAC-Bayesian Study of Pessimism for Offline Policy Learning with Regularized Importance Sampling

Jun 05, 2024

Imad Aouali, Victor-Emmanuel Brunel, David Rohde, Anna Korba

Abstract: Off-policy learning (OPL) often involves minimizing a risk estimator based on importance weighting (IW) to correct bias from the logging policy used to collect data. However, this method can produce an estimator with high variance. A common solution is to regularize the importance weights and learn the policy by minimizing an estimator with penalties derived from generalization bounds specific to the estimator. This approach, known as pessimism, has gained recent attention but lacks a unified framework for analysis. To address this gap, we introduce a comprehensive PAC-Bayesian framework to examine pessimism with regularized importance weighting. We derive a tractable PAC-Bayesian generalization bound that universally applies to common importance weight regularizations, enabling their comparison within a single framework. Our empirical results challenge common understanding, demonstrating the effectiveness of standard IW regularization techniques.

* Accepted at UAI 2024
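
To make the abstract's starting point concrete, here is a minimal sketch of an importance-weighted risk estimate computed from logged data, with weight clipping as one common form of regularization. The function name `iw_risk`, its parameters, the cost convention (lower is better), and the toy numbers are all assumptions chosen for illustration; they are not taken from the paper. The paper's PAC-Bayesian penalty terms are not reproduced here.

```python
def iw_risk(costs, target_probs, logging_probs, clip=None):
    """Importance-weighted risk estimate from logged bandit data.

    costs         -- observed cost of each logged action (lower is better)
    target_probs  -- probability the evaluated policy gives the logged action
    logging_probs -- probability the logging policy gave the logged action
    clip          -- optional upper bound on each weight (hypothetical
                     regularization parameter, illustration only)
    """
    total = 0.0
    for cost, p_target, p_logging in zip(costs, target_probs, logging_probs):
        weight = p_target / p_logging      # importance weight
        if clip is not None:
            weight = min(weight, clip)     # regularize by clipping
        total += weight * cost
    return total / len(costs)


# Toy logged data (made up for illustration).
costs = [-1.0, 0.0, -1.0, 0.0]
target_probs = [0.9, 0.1, 0.5, 0.5]
logging_probs = [0.1, 0.5, 0.5, 0.5]

unclipped = iw_risk(costs, target_probs, logging_probs)
clipped = iw_risk(costs, target_probs, logging_probs, clip=2.0)
```

In this toy data the first sample carries a weight of about 9, so it dominates the unclipped estimate. Clipping caps that weight at 2, which trades some bias for a smaller contribution from rarely logged actions, the kind of variance control the abstract describes.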
