Abstract: Belief propagation is a fundamental message-passing algorithm for numerous applications in machine learning. It is known that the belief propagation algorithm is exact on tree graphs. However, belief propagation is run on loopy graphs in most applications. So, understanding the behavior of belief propagation on loopy graphs has been a major topic for researchers in different areas. In this paper, we study the convergence behavior of the generalized belief propagation algorithm on graphs with motifs (triangles, loops, etc.). We show that, under a certain initialization, generalized belief propagation converges to the global optimum of the Bethe free energy for ferromagnetic Ising models on graphs with motifs.
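As a point of reference for the message-passing scheme this abstract refers to, below is a minimal sketch of standard pairwise sum-product belief propagation on a small ferromagnetic Ising model containing a triangle motif. The graph, coupling J, and field h are illustrative assumptions, and this is ordinary loopy BP rather than the generalized (region-based) BP analyzed in the paper.

```python
# Minimal sum-product belief propagation on a small loopy Ising model.
# NOTE: illustrative sketch only; J, h, and the edge list are made up, and this
# is pairwise BP, not the paper's generalized (region-graph) BP.
import numpy as np

STATES = np.array([-1.0, 1.0])            # spin values
J, h = 0.5, 0.1                           # ferromagnetic coupling (J > 0) and field
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]  # loopy graph with one triangle motif
n = 4

def neighbors(i):
    return [b if a == i else a for (a, b) in edges if i in (a, b)]

# messages[(i, j)][k] = m_{i->j}(x_j = STATES[k]); initialized uniformly
messages = {(i, j): np.ones(2) / 2 for (a, b) in edges for (i, j) in [(a, b), (b, a)]}

for _ in range(100):                      # synchronous message updates
    new = {}
    for (i, j) in messages:
        msg = np.zeros(2)
        for kj, xj in enumerate(STATES):
            for xi in STATES:
                incoming = np.prod([messages[(k, i)][int(xi > 0)]
                                    for k in neighbors(i) if k != j])
                msg[kj] += np.exp(h * xi) * np.exp(J * xi * xj) * incoming
        new[(i, j)] = msg / msg.sum()     # normalize for numerical stability
    messages = new

# node beliefs: b_i(x_i) proportional to exp(h * x_i) * product of incoming messages
for i in range(n):
    b = np.array([np.exp(h * x) * np.prod([messages[(k, i)][s] for k in neighbors(i)])
                  for s, x in enumerate(STATES)])
    print(f"belief at node {i}:", b / b.sum())
```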
Abstract: We consider the problem of interactive partially observable Markov decision processes (I-POMDPs), where the agents are located at the nodes of a communication network. Specifically, we assume a certain message type for all messages. Moreover, each agent makes individual decisions based on the interactive belief states, the information observed locally, and the messages received from its neighbors over the network. Within this setting, the collective goal of the agents is to maximize the globally averaged return over the network through exchanging information with their neighbors. We propose a decentralized belief propagation algorithm for the problem and prove its convergence. Finally, we show multiple applications of our framework. Our work appears to be the first study of a decentralized belief propagation algorithm for networked multi-agent I-POMDPs.
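To make the neighbor-to-neighbor exchange pattern described above concrete, here is a hedged sketch in which each agent keeps a local return estimate, updates it from its own observations, and averages it with its network neighbors via a standard consensus step. The network, reward function, and update rule are illustrative assumptions, not the paper's decentralized belief propagation algorithm.

```python
# Sketch of decentralized neighbor exchange with consensus averaging.
# The topology, rewards, and learning rule are hypothetical placeholders.
import random

adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}   # a line communication network
values = {i: 0.0 for i in adjacency}                 # each agent's local return estimate
alpha = 0.1

def local_reward(agent, action):
    # placeholder for the locally observed reward (hypothetical)
    return random.random() - 0.5 + 0.1 * action

for step in range(1000):
    # 1) local update from each agent's own observation
    for i in adjacency:
        action = random.choice([0, 1])
        values[i] += alpha * (local_reward(i, action) - values[i])
    # 2) message exchange: every message has the same type (a scalar estimate),
    #    and neighbors average to track the globally averaged return
    messages = {i: values[i] for i in adjacency}
    for i in adjacency:
        neigh = adjacency[i]
        values[i] = (messages[i] + sum(messages[j] for j in neigh)) / (1 + len(neigh))

print({i: round(v, 3) for i, v in values.items()})
```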
Abstract: In this paper, we consider a sequential stochastic Stackelberg game with two players, a leader and a follower. The follower has access to the state of the system while the leader does not. Assuming that the players act in their respective best interests, the follower's strategy is to play the best response to the leader's strategy. In such a scenario, the leader has the advantage of committing to a policy which maximizes its own returns given the knowledge that the follower is going to play the best response to its policy. Thus, both players converge to a pair of policies that form the Stackelberg equilibrium of the game. Recently,~[1] provided a sequential decomposition algorithm to compute the Stackelberg equilibrium for such games, which allows Markovian equilibrium policies to be computed in linear time as opposed to double exponential time, as before. In this paper, we extend this idea to an MDP whose dynamics are not known to the players and propose an RL algorithm based on Expected Sarsa that learns the Stackelberg equilibrium policy by simulating a model of the MDP. We use particle filters to estimate the belief update for a common agent which computes the optimal policy based on the information common to both players. We present a security game example to illustrate the policy learned by our algorithm.
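The two learning ingredients named above, a particle-filter belief update and an Expected Sarsa update on top of it, can be illustrated with the following hedged sketch. The dynamics, observation model, belief discretization, and epsilon-greedy policy are illustrative assumptions, not the paper's common-agent construction or security-game example.

```python
# Sketch: particle-filter belief tracking combined with an Expected Sarsa update.
# All models here (transition, observation, reward) are hypothetical placeholders.
import random
from collections import defaultdict

N_PARTICLES, GAMMA, ALPHA, EPS = 100, 0.95, 0.1, 0.1
ACTIONS = [0, 1]

def transition(s, a):            # hypothetical system dynamics
    return (s + a + random.choice([-1, 0, 1])) % 5

def observation(s):              # hypothetical noisy observation of the state
    return s if random.random() < 0.8 else random.randrange(5)

def pf_update(particles, a, obs):
    """Propagate particles through the dynamics and resample by observation likelihood."""
    prop = [transition(p, a) for p in particles]
    weights = [0.8 if p == obs else 0.05 for p in prop]
    total = sum(weights)
    return random.choices(prop, weights=[w / total for w in weights], k=N_PARTICLES)

def belief_key(particles):
    """Discretize the belief (here: the most frequent particle) to index the Q-table."""
    return max(set(particles), key=particles.count)

Q = defaultdict(float)
particles = [random.randrange(5) for _ in range(N_PARTICLES)]
state = random.randrange(5)

for t in range(2000):
    b = belief_key(particles)
    a = random.choice(ACTIONS) if random.random() < EPS else max(ACTIONS, key=lambda x: Q[(b, x)])
    state = transition(state, a)
    reward = 1.0 if state == 0 else 0.0
    obs = observation(state)
    particles = pf_update(particles, a, obs)
    b2 = belief_key(particles)
    # Expected Sarsa: bootstrap on the policy's expected action value at the next belief
    greedy = max(ACTIONS, key=lambda x: Q[(b2, x)])
    expected = sum((((1 - EPS) + EPS / len(ACTIONS)) if x == greedy else EPS / len(ACTIONS)) * Q[(b2, x)]
                   for x in ACTIONS)
    Q[(b, a)] += ALPHA * (reward + GAMMA * expected - Q[(b, a)])
```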