
Philipp Wissmann

Is Q-learning an Ill-posed Problem?

Feb 21, 2025

Philipp Wissmann, Daniel Hein, Steffen Udluft, Thomas Runkler

Abstract: This paper investigates the instability of Q-learning in continuous environments, a challenge frequently encountered by practitioners. Traditionally, this instability is attributed to bootstrapping and regression model errors. Using a representative reinforcement learning benchmark, we systematically examine the effects of bootstrapping and model inaccuracies by incrementally eliminating these potential error sources. Our findings reveal that even in relatively simple benchmarks, the fundamental task of Q-learning - iteratively learning a Q-function from policy-specific target values - can be inherently ill-posed and prone to failure. These insights cast doubt on the reliability of Q-learning as a universal solution for reinforcement learning problems.

* Accepted at ESANN 2025


Why long model-based rollouts are no reason for bad Q-value estimates

Jul 16, 2024

Philipp Wissmann, Daniel Hein, Steffen Udluft, Volker Tresp

Abstract: This paper explores the use of model-based offline reinforcement learning with long model rollouts. While some literature criticizes this approach due to compounding errors, many practitioners have found success in real-world applications. The paper aims to demonstrate that long rollouts do not necessarily result in exponentially growing errors and can actually produce better Q-value estimates than model-free methods. These findings can potentially enhance reinforcement learning techniques.

* Accepted at ESANN 2024
