Rafal Wisniewski

Safe Reinforcement Learning for Constrained Markov Decision Processes with Stochastic Stopping Time

Mar 23, 2024

Abhijit Mazumdar, Rafal Wisniewski, Manuela L. Bujorianu

Abstract: In this paper, we present an online reinforcement learning algorithm for constrained Markov decision processes with a safety constraint. Although the topic has attracted attention from the scientific community, the problem of learning an optimal policy under a stochastic stopping time without violating safety constraints during the learning phase is yet to be addressed. To this end, we propose an algorithm based on linear programming that does not require a process model. We show that the learned policy is safe with high confidence. We also propose a method to compute a safe baseline policy, which is central to developing algorithms that do not violate the safety constraints. Finally, we provide simulation results to show the efficacy of the proposed algorithm. Further, we demonstrate that efficient exploration can be achieved by defining a subset of the state space called the proxy set.
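To make the linear-programming idea concrete, here is a minimal sketch of the standard occupancy-measure LP for a constrained MDP that stops on reaching an absorbing state. It is not the paper's algorithm: it assumes the transition model is known, whereas the abstract describes a model-free method. The two-state toy problem, the function name `solve_cmdp_lp`, and all probabilities are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def solve_cmdp_lp(P, p_unsafe, cost, mu, p_max):
    """Occupancy-measure LP for an absorbing constrained MDP (model assumed known).

    P[s, a, s2]    : probability of moving between transient states
    p_unsafe[s, a] : probability of being absorbed in the unsafe state
    cost[s, a]     : per-step cost
    mu[s]          : initial distribution over transient states
    p_max          : allowed probability of ever reaching the unsafe state
    """
    n_s, n_a, _ = P.shape
    # Flow conservation: sum_a x(s2, a) - sum_{s, a} P(s2 | s, a) x(s, a) = mu(s2)
    A_eq = np.zeros((n_s, n_s * n_a))
    for s in range(n_s):
        for a in range(n_a):
            col = s * n_a + a
            A_eq[s, col] += 1.0
            A_eq[:, col] -= P[s, a, :]
    return linprog(
        cost.ravel(),
        A_ub=p_unsafe.reshape(1, -1),  # safety: sum x(s, a) * p_unsafe(s, a) <= p_max
        b_ub=[p_max],
        A_eq=A_eq,
        b_eq=mu,
        bounds=(0, None),
        method="highs",
    )


# Hypothetical toy problem: 2 transient states, 2 actions (0 = fast, 1 = careful).
n_s, n_a = 2, 2
P = np.zeros((n_s, n_a, n_s))
P[0, 1, 1] = 1.0   # state 0, careful: move to state 1
P[1, 1, 1] = 0.5   # state 1, careful: stay with prob. 0.5, otherwise reach the goal
p_unsafe = np.array([[0.3, 0.0], [0.1, 0.0]])  # fast actions risk the unsafe state
cost = np.ones((n_s, n_a))                     # cost 1 per step = expected stopping time
mu = np.array([1.0, 0.0])
p_max = 0.1

res = solve_cmdp_lp(P, p_unsafe, cost, mu, p_max)
x = res.x.reshape(n_s, n_a)                    # expected visit counts per (state, action)
occ = x.sum(axis=1, keepdims=True)
# Randomized policy: normalize occupancy; fall back to uniform in unvisited states.
policy = np.where(occ > 1e-12, x / np.maximum(occ, 1e-12), 1.0 / n_a)
unsafe_prob = float((x * p_unsafe).sum())
expected_steps = float((x * cost).sum())
print(policy, unsafe_prob, expected_steps)
```

The decision variables are expected visit counts rather than policy probabilities, which is what keeps both the objective and the safety constraint linear.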


PAC-Bayes Generalisation Bounds for Dynamical Systems Including Stable RNNs

Dec 15, 2023

Deividas Eringis, John Leth, Zheng-Hua Tan, Rafal Wisniewski, Mihaly Petreczky

Abstract: In this paper, we derive a PAC-Bayes bound on the generalisation gap in a supervised time-series setting for a special class of discrete-time non-linear dynamical systems. This class includes stable recurrent neural networks (RNNs), and the motivation for this work was its application to RNNs. To achieve these results, we impose some stability constraints on the allowed models. Here, stability is understood in the sense of dynamical systems. For RNNs, these stability conditions can be expressed in terms of conditions on the weights. We assume the processes involved are essentially bounded and the loss functions are Lipschitz. The proposed bound on the generalisation gap depends on the mixing coefficient of the data distribution and the essential supremum of the data. Furthermore, the bound converges to zero as the dataset size increases. In this paper, we 1) formalize the learning problem, 2) derive a PAC-Bayesian error bound for such systems, 3) discuss various consequences of this error bound, and 4) show an illustrative example, with discussions on computing the proposed bound. Unlike other available bounds, the derived bound holds for non-i.i.d. data (time series) and does not grow with the number of steps of the RNN.

* Accepted to the AAAI 2024 conference
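As an illustration of how a stability condition can be phrased in terms of the weights, the sketch below uses one well-known sufficient condition: for a tanh RNN (tanh is 1-Lipschitz), a recurrent weight matrix with spectral norm below 1 makes the state update a contraction, so two trajectories driven by the same input converge. This is an assumed, generic condition; the paper's exact conditions may differ. The names `is_contractive` and `rnn_step` and all dimensions are hypothetical.

```python
import numpy as np


def is_contractive(W_rec: np.ndarray, lipschitz: float = 1.0) -> bool:
    """Sufficient condition: lipschitz * ||W_rec||_2 < 1 makes the state map a contraction."""
    return lipschitz * np.linalg.norm(W_rec, 2) < 1.0


def rnn_step(x, u, W_rec, W_in):
    """One step of a plain tanh RNN: x_next = tanh(W_rec x + W_in u)."""
    return np.tanh(W_rec @ x + W_in @ u)


rng = np.random.default_rng(0)
n, m, T = 4, 2, 50
W_rec = rng.standard_normal((n, n))
W_rec *= 0.9 / np.linalg.norm(W_rec, 2)  # rescale so the spectral norm is 0.9
W_in = rng.standard_normal((n, m))

# Two different initial states, driven by the same input sequence.
x_a = rng.standard_normal(n)
x_b = rng.standard_normal(n)
initial_gap = np.linalg.norm(x_a - x_b)
for _ in range(T):
    u = rng.standard_normal(m)
    x_a = rnn_step(x_a, u, W_rec, W_in)
    x_b = rnn_step(x_b, u, W_rec, W_in)
final_gap = np.linalg.norm(x_a - x_b)

print(is_contractive(W_rec), initial_gap, final_gap)
```

Because the gap shrinks by at least the spectral norm at every step, the influence of the initial state fades geometrically, which is the dynamical-systems sense of stability the abstract refers to.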


PAC-Bayesian-Like Error Bound for a Class of Linear Time-Invariant Stochastic State-Space Models

Dec 30, 2022

Deividas Eringis, John Leth, Zheng-Hua Tan, Rafal Wisniewski, Mihaly Petreczky

Abstract: In this paper, we derive a PAC-Bayesian-like error bound for a class of stochastic dynamical systems with inputs, namely linear time-invariant stochastic state-space models (stochastic LTI systems for short). This class of systems is widely used in control engineering and econometrics; in particular, it represents a special case of recurrent neural networks. In this paper, we 1) formalize the learning problem for stochastic LTI systems with inputs, 2) derive a PAC-Bayesian-like error bound for such systems, and 3) discuss various consequences of this error bound.


Privacy-Preserving Distributed Expectation Maximization for Gaussian Mixture Model using Subspace Perturbation

Sep 16, 2022

Qiongxiu Li, Jaron Skovsted Gundersen, Katrine Tjell, Rafal Wisniewski, Mads Græsbøll Christensen

Abstract: Privacy has become a major concern in machine learning. Indeed, federated learning is motivated by privacy concerns, as it transmits only intermediate updates rather than the private data itself. However, federated learning does not always guarantee privacy preservation, as the intermediate updates may also reveal sensitive information. In this paper, we give an explicit information-theoretical analysis of a federated expectation maximization algorithm for Gaussian mixture models and prove that the intermediate updates can cause severe privacy leakage. To address the privacy issue, we propose a fully decentralized privacy-preserving solution, which securely computes the updates in each maximization step. Additionally, we consider two different types of security attacks: the honest-but-curious and eavesdropping adversary models. Numerical validation shows that the proposed approach outperforms the existing approach in terms of both accuracy and privacy level.

* ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 4263-4267


Optimal Prediction of Unmeasured Output from Measurable Outputs In LTI Systems

Sep 06, 2021

Deividas Eringis, John Leth, Zheng-Hua Tan, Rafal Wisniewski, Mihaly Petreczky

Abstract: In this short article, we present the derivation of an optimal predictor for the case where one part of a system's output is not measured but can be predicted from the rest of the system's output, which is measured. To the authors' knowledge, similar derivations have been done before, but not in state-space representation.


PAC-Bayesian theory for stochastic LTI systems

Mar 25, 2021

Deividas Eringis, John Leth, Zheng-Hua Tan, Rafal Wisniewski, Alireza Fakhrizadeh Esfahani, Mihaly Petreczky

Abstract: In this paper, we derive a PAC-Bayesian error bound for autonomous stochastic LTI state-space models. The motivation for deriving such error bounds is that they will allow deriving similar error bounds for more general dynamical systems, including recurrent neural networks. In turn, PAC-Bayesian error bounds are known to be useful for analyzing machine learning algorithms and for deriving new ones.
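For orientation, the classical PAC-Bayesian bound for i.i.d. data and losses in $[0,1]$ (the McAllester form with Maurer's constant) is sketched below. It is not the bound derived in the paper, which targets dependent time-series data from LTI systems; the symbols $\pi$ (prior), $\rho$ (posterior), $L$ (expected loss), $\hat{L}_n$ (empirical loss on $n$ samples), and $\delta$ (confidence level) are notation assumed here.

```latex
% Classical i.i.d. PAC-Bayes bound (McAllester / Maurer form), shown for orientation only.
% With probability at least 1 - \delta over the draw of n i.i.d. samples,
% simultaneously for all posteriors \rho over the hypothesis class:
\[
  \mathbb{E}_{h \sim \rho}\bigl[L(h)\bigr]
  \;\le\;
  \mathbb{E}_{h \sim \rho}\bigl[\hat{L}_n(h)\bigr]
  + \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
\]
```

The bound trades empirical fit against the Kullback-Leibler divergence from the prior, and the penalty term vanishes as $n$ grows, which is the behaviour the time-series extensions aim to preserve.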
