Mikhail Prokopenko

Bounded rationality for relaxing best response and mutual consistency: An information-theoretic model of partial self-reference

Jun 30, 2021

Benjamin Patrick Evans, Mikhail Prokopenko

Abstract: While game theory has been transformative for decision-making, the assumptions made can be overly restrictive in certain instances. In this work, we focus on some of the assumptions underlying rationality, such as mutual consistency and best-response, and consider ways to relax these assumptions using concepts from level-$k$ reasoning and quantal response equilibrium (QRE), respectively. Specifically, we provide an information-theoretic two-parameter model that can relax both mutual consistency and best-response, but can recover approximations of level-$k$, QRE, or typical Nash equilibrium behaviour in the limiting cases. The proposed approach is based on a recursive form of the variational free energy principle, representing self-referential games as (pseudo) sequential decisions. Bounds in player processing abilities are captured as information costs, where future chains of reasoning are discounted, implying a hierarchy of players where lower-level players have fewer processing resources.

* 35 pages, 15 figures
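
The two ideas the abstract combines, level-$k$ reasoning and quantal (softmax) response, can be illustrated with a generic textbook-style construction. The sketch below is not the authors' two-parameter model: the function names, the uniform level-0 player, and the single precision parameter `beta` are all assumptions made for illustration.

```python
import math

def softmax_response(payoffs, opponent_mix, beta):
    """Logit (quantal) response to a mixed opponent strategy.

    payoffs[i][j] is the payoff for own action i against opponent action j.
    beta is a hypothetical precision parameter: beta = 0 gives a uniform
    choice, large beta approaches a strict best response.
    """
    expected = [sum(p * u for p, u in zip(opponent_mix, row)) for row in payoffs]
    top = max(expected)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (e - top)) for e in expected]
    total = sum(weights)
    return [w / total for w in weights]

def level_k_strategy(payoffs_self, payoffs_other, k, beta):
    """Level-0 plays uniformly; level-k soft-responds to a level-(k-1) opponent."""
    n_actions = len(payoffs_self)
    if k == 0:
        return [1.0 / n_actions] * n_actions
    # The opponent reasons one level lower, with the payoff roles swapped.
    opponent = level_k_strategy(payoffs_other, payoffs_self, k - 1, beta)
    return softmax_response(payoffs_self, opponent, beta)

# Illustrative symmetric prisoner's dilemma (action 0 = cooperate, 1 = defect).
pd = [[3.0, 0.0], [5.0, 1.0]]
for level in range(3):
    print(level, level_k_strategy(pd, pd, level, beta=2.0))
```

In this toy version, `k` bounds the depth of reasoning about the other player (relaxing mutual consistency) and `beta` bounds how sharply a player responds (relaxing best response).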

A maximum entropy model of bounded rational decision-making with prior beliefs and market feedback

Feb 18, 2021

Benjamin Patrick Evans, Mikhail Prokopenko

Abstract: Bounded rationality is an important consideration stemming from the fact that agents often have limits on their processing abilities, making the assumption of perfect rationality inapplicable to many real tasks. We propose an information-theoretic approach to the inference of agent decisions under Smithian competition. The model explicitly captures the boundedness of agents (limited in their information-processing capacity) as the cost of information acquisition for expanding their prior beliefs. The expansion is measured as the Kullback-Leibler divergence between posterior decisions and prior beliefs. When information acquisition is free, the homo economicus agent is recovered, while in cases when information acquisition becomes costly, agents instead revert to their prior beliefs. The maximum entropy principle is used to infer least-biased decisions, based upon the notion of Smithian competition formalised within the Quantal Response Statistical Equilibrium framework. The incorporation of prior beliefs into such a framework allowed us to systematically explore the effects of prior beliefs on decision-making, in the presence of market feedback. We verified the proposed model using Australian housing market data, showing how the incorporation of prior knowledge alters the resulting agent decisions. Specifically, it allowed for the separation (and analysis) of past beliefs and utility maximisation behaviour of the agent.

* 32 pages
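
The trade-off the abstract describes, between utility and the Kullback-Leibler cost of moving away from prior beliefs, has a well-known closed form in its simplest setting: the choice distribution is proportional to the prior times an exponentiated utility. The sketch below shows only that generic form and is not the paper's Quantal Response Statistical Equilibrium model. The names `bounded_rational_choice` and `beta`, and the example numbers, are assumptions for illustration.

```python
import math

def bounded_rational_choice(utilities, prior, beta):
    """Return p(a) proportional to prior(a) * exp(beta * U(a)).

    beta is a hypothetical inverse information cost: beta = 0 (information
    is prohibitively costly) returns the prior, while large beta (information
    is nearly free) concentrates on the utility-maximising action.
    """
    u_max = max(utilities)  # shift utilities for numerical stability
    weights = [p * math.exp(beta * (u - u_max)) for u, p in zip(utilities, prior)]
    total = sum(weights)
    return [w / total for w in weights]

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats, skipping zero-mass terms."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Illustrative numbers only: three actions, a prior favouring the first.
utilities = [1.0, 2.0, 3.0]
prior = [0.5, 0.3, 0.2]
for beta in (0.0, 1.0, 10.0):
    decision = bounded_rational_choice(utilities, prior, beta)
    print(beta, decision, kl_divergence(decision, prior))
```

The printed divergence grows with `beta`, which is the sense in which departing from the prior costs information.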

Fractals2019: Combinatorial Optimisation with Dynamic Constraint Annealing

Oct 14, 2019

Mikhail Prokopenko, Peter Wang

Abstract: Fractals2019 started as a new experimental entry in the RoboCup Soccer 2D Simulation League, based on the Gliders2d code base, and advanced to become a RoboCup-2019 champion. We employ combinatorial optimisation methods, within the framework of Guided Self-Organisation, with the search guided by local constraints. We present examples of several tactical tasks based on the Gliders2d code (version v2), including the search for an optimal assignment of heterogeneous player types, as well as blocking behaviours, offside trap, and attacking formations. We propose a new method, Dynamic Constraint Annealing, for solving dynamic constraint satisfaction problems, and apply it to optimise the thermodynamic potential of collective behaviours, under dynamically induced constraints.

* 12 pages, 1 figure, RoboCup-2019, champion paper

An Information Criterion for Inferring Coupling in Distributed Dynamical Systems

Nov 11, 2016

Oliver M. Cliff, Mikhail Prokopenko, Robert Fitch

Abstract: The behaviour of many real-world phenomena can be modelled by nonlinear dynamical systems whereby a latent system state is observed through a filter. We are interested in interacting subsystems of this form, which we model by a set of coupled maps as a synchronous update graph dynamical system. Specifically, we study the structure learning problem for spatially distributed dynamical systems coupled via a directed acyclic graph. Unlike established structure learning procedures that find locally maximum posterior probabilities of a network structure containing latent variables, our work exploits the properties of dynamical systems to compute globally optimal approximations of these distributions. We arrive at this result by the use of time delay embedding theorems. Taking an information-theoretic perspective, we show that the log-likelihood has an intuitive interpretation in terms of information transfer.

* Front. Robot. AI 3(71), 2016

Inferring Coupling of Distributed Dynamical Systems via Transfer Entropy

Nov 02, 2016

Oliver M. Cliff, Mikhail Prokopenko, Robert Fitch

Abstract: In this work, we are interested in structure learning for a set of spatially distributed dynamical systems, where individual subsystems are coupled via latent variables and observed through a filter. We represent this model as a directed acyclic graph (DAG) that characterises the unidirectional coupling between subsystems. Standard approaches to structure learning are not applicable in this framework due to the hidden variables; however, we can exploit the properties of certain dynamical systems to formulate exact methods based on state space reconstruction. We approach the problem by using reconstruction theorems to analytically derive a tractable expression for the KL-divergence of a candidate DAG from the observed dataset. We show this measure can be decomposed as a function of two information-theoretic measures, transfer entropy and stochastic interaction. We then present two mathematically robust scoring functions based on transfer entropy and statistical independence tests. These results support the previously held conjecture that transfer entropy can be used to infer effective connectivity in complex networks.
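
For readers unfamiliar with transfer entropy, the quantity can be estimated from discrete time series by simple counting. The sketch below is a basic plug-in estimator with history length 1. It is not the scoring function derived in the paper, which works with state space reconstruction of continuous dynamics, and the function name and toy data are assumptions for illustration.

```python
from collections import Counter
import math

def transfer_entropy(source, target):
    """Plug-in estimate of transfer entropy source -> target, in bits.

    Uses a history length of 1 for both series (an assumption of this sketch):
    TE = sum p(y1, y0, x0) * log2( p(y1 | y0, x0) / p(y1 | y0) ).
    """
    if len(source) != len(target):
        raise ValueError("series must have equal length")
    n = len(target) - 1  # number of observed transitions
    triples, pairs_yx, pairs_yy, singles = Counter(), Counter(), Counter(), Counter()
    for t in range(n):
        y1, y0, x0 = target[t + 1], target[t], source[t]
        triples[(y1, y0, x0)] += 1
        pairs_yx[(y0, x0)] += 1
        pairs_yy[(y1, y0)] += 1
        singles[y0] += 1
    te = 0.0
    for (y1, y0, x0), count in triples.items():
        p_joint = count / n
        p_given_both = count / pairs_yx[(y0, x0)]
        p_given_self = pairs_yy[(y1, y0)] / singles[y0]
        te += p_joint * math.log2(p_given_both / p_given_self)
    return te

# Toy data: the target copies the source with a one-step delay.
source = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1]
target = [0] + source[:-1]
print(transfer_entropy(source, target))
```

Plug-in estimates are biased upwards on short series, so in practice they are usually paired with a significance test, in the spirit of the independence tests the abstract mentions.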

Simulation leagues: Enabling replicable and robust investigation of complex robotic systems

Dec 18, 2014

David M Budden, Peter Wang, Oliver Obst, Mikhail Prokopenko

Abstract: Physically-realistic simulated environments are powerful platforms for enabling measurable, replicable and statistically-robust investigation of complex robotic systems. Such environments are epitomised by the RoboCup simulation leagues, which have been successfully utilised to conduct massively-parallel experiments in topics including: optimisation of bipedal locomotion, self-localisation from noisy perception data and planning complex multi-agent strategies without direct agent-to-agent communication. Many of these systems are later transferred to physical robots, making the simulation leagues invaluable well beyond the scope of simulated soccer matches. In this study, we provide an overview of the RoboCup simulation leagues and describe their properties as they pertain to replicable and robust robotics research. To demonstrate their utility directly, we leverage the ability to run parallelised experiments to evaluate different competition formats (e.g. round robin) for the RoboCup 2D simulation league. Our results demonstrate that a previously-proposed hybrid format minimises fluctuations from 'true' (statistically-significant) team performance rankings within the time constraints of the RoboCup world finals. Our experimental analysis would be impossible with physical robots alone, and we encourage other researchers to explore the potential for enriching their experimental pipelines with simulated components, both to minimise experimental costs and enable others to replicate and expand upon their results in a hardware-independent manner.

* 9 pages, 4 figures. arXiv admin note: text overlap with arXiv:1403.4023

Information Transfer in Swarms with Leaders

Jun 30, 2014

Yu Sun, Louis F. Rossi, Chien-Chung Shen, Jennifer Miller, X. Rosalind Wang, Joseph T. Lizier, Mikhail Prokopenko, Upul Senanayake

Abstract: Swarm dynamics is the study of collections of agents that interact with one another without central control. In natural systems, insects, birds, fish and other large mammals function in larger units to increase the overall fitness of the individuals. Their behavior is coordinated through local interactions to enhance mate selection, predator detection, migratory route identification and so forth [Andersson and Wallander 2003; Buhl et al. 2006; Nagy et al. 2010; Partridge 1982; Sumpter et al. 2008]. In artificial systems, swarms of autonomous agents can augment human activities such as search and rescue, and environmental monitoring by covering large areas with multiple nodes [Alami et al. 2007; Caruso et al. 2008; Ogren et al. 2004; Paley et al. 2007; Sibley et al. 2002]. In this paper, we explore the interplay between swarm dynamics, covert leadership and theoretical information transfer. A leader is a member of the swarm that acts upon information in addition to what is provided by local interactions. Depending upon the leadership model, leaders can use their external information either all the time or in response to local conditions [Couzin et al. 2005; Sun et al. 2013]. A covert leader is a leader that is treated no differently than others in the swarm, so leaders and followers participate equally in whatever interaction model is used [Rossi et al. 2007]. In this study, we use theoretical information transfer as a means of analyzing swarm interactions to explore whether or not it is possible to distinguish between followers and leaders based on interactions within the swarm. We find that covert leaders can be distinguished from followers in a swarm because they receive less transfer entropy than followers.

Simulation leagues: Analysis of competition formats

Jun 26, 2014

David Budden, Peter Wang, Oliver Obst, Mikhail Prokopenko

Abstract: The selection of an appropriate competition format is critical for both the success and credibility of any competition, both real and simulated. In this paper, the automated parallelism offered by the RoboCupSoccer 2D simulation league is leveraged to conduct a 28,000-game round-robin between the top 8 teams from RoboCup 2012 and 2013. A proposed new competition format is found to reduce variation from the resultant statistically significant team performance rankings by 75% and 67%, when compared to the actual competition results from RoboCup 2012 and 2013, respectively. These results are statistically validated by generating 10,000 random tournaments for each of the three considered formats and comparing the respective distributions of ranking discrepancy.

* 12 pages, 2 figures, presented at RoboCup 2014 symposium, Brazil

Gliders2012: Development and Competition Results

Nov 21, 2012

Edward Moore, Oliver Obst, Mikhail Prokopenko, Peter Wang, Jason Held

Abstract: The RoboCup 2D Simulation League incorporates several challenging features, setting a benchmark for Artificial Intelligence (AI). In this paper we describe some of the ideas and tools around the development of our team, Gliders2012. In our description, we focus on the evaluation function as one of our central mechanisms for action selection. We also point to a new framework for watching log files in a web browser that we release for use and further development by the RoboCup community. Finally, we also summarize results of the group and final matches we played during RoboCup 2012, with Gliders2012 finishing 4th out of 19 teams.

* 10 pages

Complex Networks

Apr 29, 2011

Carlos Gershenson, Mikhail Prokopenko

Abstract: Introduction to the Special Issue on Complex Networks, Artificial Life journal.

* Artificial Life 17(4):259-261, 2011
* 7 pages, in press
