Tan Zhu

An Efficient Algorithm for Deep Stochastic Contextual Bandits

Apr 22, 2021

Tan Zhu, Guannan Liang, Chunjiang Zhu, Haining Li, Jinbo Bi

Abstract: In stochastic contextual bandit (SCB) problems, an agent selects an action based on an observed context to maximize the cumulative reward over iterations. Several recent studies use a deep neural network (DNN) to predict the expected reward of an action, training the DNN with a stochastic gradient-based method. However, the convergence of these methods, namely whether they converge and to what, has been largely neglected. In this work, we formulate the SCB with a DNN reward function as a non-convex stochastic optimization problem, and design a stage-wise stochastic gradient descent algorithm to solve it and determine the action policy. We prove that, with high probability, the action sequence chosen by this algorithm converges to a greedy action policy with respect to a locally optimal reward function. Extensive experiments on multiple real-world datasets demonstrate the effectiveness and efficiency of the proposed algorithm.

* Accepted by AAAI 2021; appendix uploaded
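
To make the setting in the abstract concrete, here is a minimal sketch of a greedy contextual bandit trained with SGD in stages. It is not the paper's algorithm: a per-action linear reward model stands in for the DNN, the step size is simply halved after each stage (an assumed schedule), and all names (`predict`, `greedy_action`, `run_stagewise`, `toy_reward`) are hypothetical.

```python
def predict(weights, context):
    """Predicted reward: a linear model standing in for the DNN (assumption)."""
    return sum(w * c for w, c in zip(weights, context))


def greedy_action(model, context):
    """Pick the action with the highest predicted reward (ties go to the lowest index)."""
    return max(range(len(model)), key=lambda a: predict(model[a], context))


def sgd_update(model, context, action, reward, step):
    """One SGD step on the squared error of the chosen action's prediction."""
    err = predict(model[action], context) - reward
    model[action] = [w - step * err * c for w, c in zip(model[action], context)]


def run_stagewise(reward_fn, contexts, n_actions, stages, iters_per_stage, step0):
    """Run SGD in stages, halving the step size after each stage (assumed schedule)."""
    dim = len(contexts[0])
    model = [[0.0] * dim for _ in range(n_actions)]
    step, t = step0, 0
    for _ in range(stages):
        for _ in range(iters_per_stage):
            ctx = contexts[t % len(contexts)]  # cycle through contexts deterministically
            action = greedy_action(model, ctx)
            sgd_update(model, ctx, action, reward_fn(ctx, action), step)
            t += 1
        step /= 2.0
    return model


def toy_reward(context, action):
    """Toy environment: +1 if the action matches the active context feature, else -1."""
    return 1.0 if context[action] == 1.0 else -1.0


contexts = [[1.0, 0.0], [0.0, 1.0]]
model = run_stagewise(toy_reward, contexts, n_actions=2,
                      stages=3, iters_per_stage=20, step0=0.5)
policy = [greedy_action(model, ctx) for ctx in contexts]
```

In this toy environment the learned greedy policy matches each context to its rewarding action. A purely greedy rule never explores on its own, so which policy it settles on depends on the initialization and the rewards observed; in this toy run the wrong action is abandoned only because it earns a negative reward.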

Federated Nonconvex Sparse Learning

Dec 31, 2020

Qianqian Tong, Guannan Liang, Tan Zhu, Jinbo Bi

Abstract: Nonconvex sparse learning plays an essential role in many areas, such as signal processing and deep network compression. Iterative hard thresholding (IHT) methods are the state of the art for nonconvex sparse learning owing to their ability to recover the true support and their scalability to large datasets. Theoretical analysis of IHT currently assumes centralized IID data. In realistic large-scale settings, however, data are distributed, rarely IID, and private to local edge computing devices. It is thus necessary to examine the properties of IHT in federated settings, where updates run in parallel on local devices and communicate with a central server only occasionally, without sharing local data. In this paper, we propose two IHT methods: Federated Hard Thresholding (Fed-HT) and Federated Iterative Hard Thresholding (FedIter-HT). We prove that both algorithms enjoy a linear convergence rate and have strong guarantees to recover the optimal sparse estimator, similar to traditional IHT methods, but now with decentralized non-IID data. Empirical results demonstrate that Fed-HT and FedIter-HT outperform their competitor, a distributed IHT, in decreasing the objective value while requiring fewer communication rounds and less bandwidth.
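
For readers unfamiliar with the building block, the following is a minimal sketch of classic, centralized IHT on a least-squares objective. It does not implement Fed-HT or FedIter-HT; the function names, the fixed step size, and the toy data are assumptions made for illustration.

```python
def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and zero out the rest."""
    if k <= 0:
        return [0.0] * len(x)
    keep = set(sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def iht_least_squares(A, y, k, step, iters):
    """Minimize 0.5 * ||A x - y||^2 subject to x having at most k nonzeros."""
    n = len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = A x - y
        r = [sum(a * xj for a, xj in zip(row, x)) - yi for row, yi in zip(A, y)]
        # Gradient g = A^T r
        g = [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(n)]
        # Gradient step followed by projection onto k-sparse vectors
        x = hard_threshold([xj - step * gj for xj, gj in zip(x, g)], k)
    return x


# Toy problem: identity design matrix, so the best 2-sparse fit keeps the
# two largest-magnitude entries of y.
A = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
y = [3.0, 0.1, -2.0]
x_hat = iht_least_squares(A, y, k=2, step=1.0, iters=5)
```

Each iteration is a plain gradient step followed by a projection onto the set of k-sparse vectors. The abstract's federated setting differs in that local devices run such updates in parallel and only occasionally synchronize with a central server.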
