Shiau Hong Lim

Neural-Progressive Hedging: Enforcing Constraints in Reinforcement Learning with Stochastic Programming

Feb 27, 2022

Supriyo Ghosh, Laura Wynter, Shiau Hong Lim, Duc Thien Nguyen

Abstract:We propose a framework, called neural-progressive hedging (NP), that leverages stochastic programming during the online phase of executing a reinforcement learning (RL) policy. The goal is to ensure feasibility with respect to constraints and risk-based objectives such as conditional value-at-risk (CVaR) during the execution of the policy, using probabilistic models of the state transitions to guide policy adjustments. The framework is particularly amenable to the class of sequential resource allocation problems since feasibility with respect to typical resource constraints cannot be enforced in a scalable manner. The NP framework provides an alternative that adds modest overhead during the online phase. Experimental results demonstrate the efficacy of the NP framework on two continuous real-world tasks: (i) the portfolio optimization problem with liquidity constraints for financial planning, characterized by non-stationary state distributions; and (ii) the dynamic repositioning problem in bike sharing systems, that embodies the class of supply-demand matching problems. We show that the NP framework produces policies that are better than deep RL and other baseline approaches, adapting to non-stationarity, whilst satisfying structural constraints and accommodating risk measures in the resulting policies. Additional benefits of the NP framework are ease of implementation and better explainability of the policies.
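For readers unfamiliar with the risk measure named in the abstract, conditional value-at-risk at level alpha is commonly estimated as the average of the worst (1 - alpha) fraction of sampled losses. The sketch below shows only that generic tail-average estimator; the function name `empirical_cvar` and the sample data are illustrative assumptions, and none of this is taken from the NP framework itself.

```python
import math


def empirical_cvar(losses, alpha=0.95):
    """Tail-average estimate of CVaR: mean of the worst (1 - alpha) fraction of losses.

    Hypothetical helper for illustration; larger values mean worse outcomes.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    ordered = sorted(losses, reverse=True)  # worst losses first
    # Round before ceil so float noise such as 1.0000000000000009 does not add a sample.
    tail_size = max(1, math.ceil(round((1.0 - alpha) * len(ordered), 9)))
    tail = ordered[:tail_size]
    return sum(tail) / len(tail)


if __name__ == "__main__":
    sample_losses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # made-up data
    print(empirical_cvar(sample_losses, alpha=0.8))
```

With `alpha=0.0` the estimator reduces to the plain mean, which is one quick way to sanity-check it.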

Order Constraints in Optimal Transport

Oct 14, 2021

Fabian Lim, Laura Wynter, Shiau Hong Lim

Abstract:Optimal transport is a framework for comparing measures whereby a cost is incurred for transporting one measure to another. Recent works have aimed to improve optimal transport plans through the introduction of various forms of structure. We introduce novel order constraints into the optimal transport formulation to allow for the incorporation of structure. While there are now quadratically many constraints as before, we prove a $\delta$-approximate solution to the order-constrained optimal transport problem can be obtained in $\mathcal{O}(L^2\delta^{-2} \kappa(\delta(2cL_\infty (1+(mn)^{1/2}))^{-1}) \cdot mn\log mn)$ time. We derive computationally efficient lower bounds that allow for an explainable approach to adding structure to the optimal transport plan through order constraints. We demonstrate experimentally that order constraints improve explainability using the e-SNLI (Stanford Natural Language Inference) dataset that includes human-annotated rationales for each assignment.

* Preprint. 8 pages of main + 2 pages references, and 10 pages supplementary
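As background for the abstract above, a transport plan between two discrete measures is a non-negative matrix whose row sums match the source masses and whose column sums match the target masses, and its cost is the sum of cost times transported mass. The sketch below checks those standard conditions on a toy example. The helper names and the numbers are assumptions made for illustration, and the order constraints and the approximation algorithm from the paper are not implemented here.

```python
def transport_cost(cost, plan):
    """Total cost of a plan: sum over i, j of cost[i][j] * plan[i][j]."""
    return sum(
        c * p
        for cost_row, plan_row in zip(cost, plan)
        for c, p in zip(cost_row, plan_row)
    )


def is_feasible_plan(plan, source, target, tol=1e-9):
    """Check non-negativity and the two marginal constraints of a transport plan."""
    if len(plan) != len(source) or any(len(row) != len(target) for row in plan):
        return False
    nonnegative = all(p >= -tol for row in plan for p in row)
    rows_match = all(abs(sum(row) - a) <= tol for row, a in zip(plan, source))
    cols_match = all(abs(sum(col) - b) <= tol for col, b in zip(zip(*plan), target))
    return nonnegative and rows_match and cols_match


if __name__ == "__main__":
    source = [0.5, 0.5]           # toy source masses (made up)
    target = [0.5, 0.5]           # toy target masses (made up)
    cost = [[0.0, 1.0], [1.0, 0.0]]
    stay = [[0.5, 0.0], [0.0, 0.5]]   # keep each unit of mass in place
    swap = [[0.0, 0.5], [0.5, 0.0]]   # move all mass to the other point
    for plan in (stay, swap):
        print(is_feasible_plan(plan, source, target), transport_cost(cost, plan))
```

Both toy plans satisfy the marginal constraints, so the optimal transport problem amounts to picking the cheaper of the feasible plans; extra structure such as order constraints would further restrict which plans count as feasible.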

Efficient Reinforcement Learning in Resource Allocation Problems Through Permutation Invariant Multi-task Learning

Feb 18, 2021

Desmond Cai, Shiau Hong Lim, Laura Wynter

Abstract:One of the main challenges in real-world reinforcement learning is to learn successfully from limited training samples. We show that in certain settings, the available data can be dramatically increased through a form of multi-task learning, by exploiting an invariance property in the tasks. We provide a theoretical performance bound for the gain in sample efficiency under this setting. This motivates a new approach to multi-task learning, which involves the design of an appropriate neural network architecture and a prioritized task-sampling strategy. We demonstrate empirically the effectiveness of the proposed approach on two real-world sequential resource allocation tasks where this invariance property occurs: financial portfolio optimization and meta federated learning.

Fed+: A Family of Fusion Algorithms for Federated Learning

Sep 14, 2020

Pengqian Yu, Laura Wynter, Shiau Hong Lim

Abstract:We present a class of methods for federated learning, which we call Fed+, pronounced FedPlus. The class of methods encompasses and unifies a number of recent algorithms proposed for federated learning and permits easily defining many new algorithms. The principal advantage of this class of methods is to better accommodate the real-world characteristics found in federated learning training, such as the lack of IID data across the parties in the federation. We demonstrate the use and benefits of this class of algorithms on standard benchmark datasets and a challenging real-world problem where catastrophic failure has a serious impact, namely in financial portfolio management.

Variational Bayesian Inference for Crowdsourcing Predictions

Jun 02, 2020

Desmond Cai, Duc Thien Nguyen, Shiau Hong Lim, Laura Wynter

Abstract:Crowdsourcing has emerged as an effective means for performing a number of machine learning tasks such as annotation and labelling of images and other data sets. In most early settings of crowdsourcing, the task involved classification, that is assigning one of a discrete set of labels to each task. Recently, however, more complex tasks have been attempted including asking crowdsource workers to assign continuous labels, or predictions. In essence, this involves the use of crowdsourcing for function estimation. We are motivated by this problem to drive applications such as collaborative prediction, that is, harnessing the wisdom of the crowd to predict quantities more accurately. To do so, we propose a Bayesian approach aimed specifically at alleviating overfitting, a typical impediment to accurate prediction models in practice. In particular, we develop a variational Bayesian technique for two different worker noise models - one that assumes workers' noises are independent and the other that assumes workers' noises have a latent low-rank structure. Our evaluations on synthetic and real-world datasets demonstrate that these Bayesian approaches perform significantly better than existing non-Bayesian approaches and are thus potentially useful for this class of crowdsourcing problems.

* 7 pages

A Deep Ensemble Multi-Agent Reinforcement Learning Approach for Air Traffic Control

Apr 03, 2020

Supriyo Ghosh, Sean Laguna, Shiau Hong Lim, Laura Wynter, Hasan Poonawala

Abstract:Air traffic control is an example of a highly challenging operational problem that is readily amenable to human expertise augmentation via decision support technologies. In this paper, we propose a new intelligent decision making framework that leverages multi-agent reinforcement learning (MARL) to dynamically suggest adjustments of aircraft speeds in real-time. The goal of the system is to enhance the ability of an air traffic controller to provide effective guidance to aircraft to avoid air traffic congestion, near-miss situations, and to improve arrival timeliness. We develop a novel deep ensemble MARL method that can concisely capture the complexity of the air traffic control problem by learning to efficiently arbitrate between the decisions of a local kernel-based RL model and a wider-reaching deep MARL model. The proposed method is trained and evaluated on an open-source air traffic management simulator developed by Eurocontrol. Extensive empirical results on a real-world dataset including thousands of aircraft demonstrate the feasibility of using multi-agent RL for the problem of en-route air traffic control and show that our proposed deep ensemble MARL method significantly outperforms three state-of-the-art benchmark approaches.

Noisy Search with Comparative Feedback

Feb 14, 2012

Shiau Hong Lim, Peter Auer

Abstract:We present theoretical results in terms of lower and upper bounds on the query complexity of noisy search with comparative feedback. In this search model, the noise in the feedback depends on the distance between query points and the search target. Consequently, the error probability in the feedback is not fixed but varies for the queries posed by the search algorithm. Our results show that a target out of n items can be found in O(log n) queries. We also show the surprising result that for k possible answers per query, the speedup is not log k (as for k-ary search) but only log log k in some cases.
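The abstract contrasts its result with ordinary k-ary search, where allowing k answers per query cuts the number of queries by a factor of about log k. The sketch below illustrates only that noiseless baseline by counting queries with a simulated, error-free oracle. The function name `kary_search_queries` is a hypothetical helper, and the distance-dependent noise model and the algorithms analyzed in the paper are not reproduced.

```python
def kary_search_queries(n, target, k=2):
    """Locate `target` among indices 0..n-1 with noiseless k-ary queries.

    Returns (found_index, number_of_queries). Each query splits the current
    candidate interval into k nearly equal parts and learns which part holds
    the target (the oracle is simulated here by comparing with `target`).
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    if not 0 <= target < n:
        raise ValueError("target must lie in range(n)")
    lo, hi = 0, n  # candidates are lo, ..., hi - 1
    queries = 0
    while hi - lo > 1:
        size = hi - lo
        bounds = [lo + (size * i) // k for i in range(k + 1)]
        queries += 1
        for i in range(k):
            if bounds[i] <= target < bounds[i + 1]:
                lo, hi = bounds[i], bounds[i + 1]
                break
    return lo, queries


if __name__ == "__main__":
    for k in (2, 4, 32):
        print(k, kary_search_queries(1024, target=700, k=k))
```

For n = 1024, the query count drops from 10 with k = 2 to 5 with k = 4, which is the log k speedup of the noiseless case that the abstract uses as its point of comparison.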
