Vikranth R. Dwaracherla

Parameterized Indexed Value Function for Efficient Exploration in Reinforcement Learning

Dec 23, 2019

Tian Tan, Zhihan Xiong, Vikranth R. Dwaracherla

Abstract: It is well known that quantifying uncertainty in the action-value estimates is crucial for efficient exploration in reinforcement learning. Ensemble sampling offers a relatively computationally tractable way of doing this using randomized value functions. However, it still requires a huge amount of computational resources for complex problems. In this paper, we present an alternative, computationally efficient way to induce exploration using index sampling. We use an indexed value function to represent uncertainty in our action-value estimates. We first present an algorithm to learn a parameterized indexed value function through a distributional version of temporal difference in a tabular setting and prove its regret bound. Then, from a computational point of view, we propose a dual-network architecture, Parameterized Indexed Networks (PINs), comprising one mean network and one uncertainty network to learn the indexed value function. Finally, we show the efficacy of PINs through computational experiments.

* 17 pages, 4 figures, Proceedings of the 34th AAAI Conference on Artificial Intelligence
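
The idea of an indexed value function with a mean part and an uncertainty part can be illustrated with a minimal tabular sketch. The abstract does not state the exact parameterization, so the additive form `mean + z * sigma` with a standard normal index `z` is an assumption, and the class name `IndexedValueTable` and its methods are hypothetical names chosen for illustration. The learning rule from the paper is not shown.

```python
import random


class IndexedValueTable:
    """Tabular stand-in for an indexed value function (illustrative only).

    Assumption: Q_z(s, a) = mean[s][a] + z * sigma[s][a], where z is a
    scalar index sampled once and then held fixed while acting.
    """

    def __init__(self, num_states, num_actions, init_sigma=1.0):
        self.mean = [[0.0] * num_actions for _ in range(num_states)]
        self.sigma = [[init_sigma] * num_actions for _ in range(num_states)]

    def value(self, state, action, z):
        # Index-dependent action-value estimate.
        return self.mean[state][action] + z * self.sigma[state][action]

    def greedy_action(self, state, z):
        # Act greedily with respect to the value function selected by z.
        num_actions = len(self.mean[state])
        return max(range(num_actions), key=lambda a: self.value(state, a, z))


rng = random.Random(0)
table = IndexedValueTable(num_states=2, num_actions=3)
table.mean[0] = [1.0, 0.5, 0.0]
table.sigma[0] = [0.1, 2.0, 0.1]  # action 1 is the most uncertain

z = rng.gauss(0.0, 1.0)  # sample one index, e.g. at the start of an episode
action = table.greedy_action(0, z)
```

With a large positive index the uncertain action 1 looks attractive and gets explored, while with an index near zero the agent falls back on the action with the highest mean.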

Gradient Estimation with Simultaneous Perturbation and Compressive Sensing

Jul 26, 2016

Vivek S. Borkar, Vikranth R. Dwaracherla, Neeraja Sahasrabudhe

Abstract: This paper aims at achieving a "good" estimator for the gradient of a function on a high-dimensional space. Often such functions are not sensitive in all coordinates and the gradient of the function is almost sparse. We propose a method for gradient estimation that combines ideas from Spall's Simultaneous Perturbation Stochastic Approximation with compressive sensing. The aim is to obtain a "good" estimator without too many function evaluations. Applications to estimating the gradient outer product matrix as well as to standard optimization problems are illustrated via simulations.

* 24 pages, 13 figures
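
The simultaneous perturbation ingredient mentioned in the abstract can be sketched as follows. This is only the standard two-sided SPSA gradient estimate with random ±1 perturbations, averaged over several draws. It is not the paper's method: the compressive sensing recovery step is not shown, and the function names, the test function, and the parameter values are assumptions made for illustration.

```python
import random


def spsa_gradient(f, x, c=1e-3, num_samples=200, rng=None):
    """Average of two-sided simultaneous perturbation gradient estimates.

    Each sample costs two evaluations of f, regardless of the dimension.
    """
    rng = rng or random.Random(0)
    d = len(x)
    grad = [0.0] * d
    for _ in range(num_samples):
        # Perturb all coordinates at once with i.i.d. +/-1 entries.
        delta = [rng.choice((-1.0, 1.0)) for _ in range(d)]
        x_plus = [xi + c * di for xi, di in zip(x, delta)]
        x_minus = [xi - c * di for xi, di in zip(x, delta)]
        diff = (f(x_plus) - f(x_minus)) / (2.0 * c)
        for i in range(d):
            grad[i] += diff / delta[i]
    return [g / num_samples for g in grad]


def sparse_quadratic(x):
    # Hypothetical test function that depends on only two of five coordinates.
    return 3.0 * x[0] ** 2 + x[3]


x0 = [1.0, 0.5, -2.0, 0.0, 4.0]
estimate = spsa_gradient(sparse_quadratic, x0, num_samples=2000)
# The true gradient at x0 is [6, 0, 0, 1, 0]; the estimate is noisy around it.
```

Plain averaging needs many samples to suppress the cross-coordinate noise, which is the cost that exploiting sparsity of the gradient is meant to reduce.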
