Abstract:While game theory has been transformative for decision-making, the assumptions made can be overly restrictive in certain instances. In this work, we focus on some of the assumptions underlying rationality such as mutual consistency and best-response, and consider ways to relax these assumptions using concepts from level-$k$ reasoning and quantal response equilibrium (QRE) respectively. Specifically, we provide an information-theoretic two-parameter model that can relax both mutual consistency and best-response, but can recover approximations of level-$k$, QRE, or typical Nash equilibrium behaviour in the limiting cases. The proposed approach is based on a recursive form of the variational free energy principle, representing self-referential games as (pseudo) sequential decisions. Bounds in player processing abilities are captured as information costs, where future chains of reasoning are discounted, implying a hierarchy of players where lower-level players have fewer processing resources.
Abstract:Bounded rationality is an important consideration stemming from the fact that agents often have limits on their processing abilities, making the assumption of perfect rationality inapplicable to many real tasks. We propose an information-theoretic approach to the inference of agent decisions under Smithian competition. The model explicitly captures the boundedness of agents (limited in their information-processing capacity) as the cost of information acquisition for expanding their prior beliefs. The expansion is measured as the Kullblack-Leibler divergence between posterior decisions and prior beliefs. When information acquisition is free, the \textit{homo economicus} agent is recovered, while in cases when information acquisition becomes costly, agents instead revert to their prior beliefs. The maximum entropy principle is used to infer least-biased decisions, based upon the notion of Smithian competition formalised within the Quantal Response Statistical Equilibrium framework. The incorporation of prior beliefs into such a framework allowed us to systematically explore the effects of prior beliefs on decision-making, in the presence of market feedback. We verified the proposed model using Australian housing market data, showing how the incorporation of prior knowledge alters the resulting agent decisions. Specifically, it allowed for the separation (and analysis) of past beliefs and utility maximisation behaviour of the agent.
Abstract:Fractals2019 started as a new experimental entry in the RoboCup Soccer 2D Simulation League, based on Gliders2d code base, and advanced to become a RoboCup-2019 champion. We employ combinatorial optimisation methods, within the framework of Guided Self-Organisation, with the search guided by local constraints. We present examples of several tactical tasks based on the Gliders2d code (version v2), including the search for an optimal assignment of heterogeneous player types, as well as blocking behaviours, offside trap, and attacking formations. We propose a new method, Dynamic Constraint Annealing, for solving dynamic constraint satisfaction problems, and apply it to optimise thermodynamic potential of collective behaviours, under dynamically induced constraints.
Abstract:The behaviour of many real-world phenomena can be modelled by nonlinear dynamical systems whereby a latent system state is observed through a filter. We are interested in interacting subsystems of this form, which we model by a set of coupled maps as a synchronous update graph dynamical systems. Specifically, we study the structure learning problem for spatially distributed dynamical systems coupled via a directed acyclic graph. Unlike established structure learning procedures that find locally maximum posterior probabilities of a network structure containing latent variables, our work exploits the properties of dynamical systems to compute globally optimal approximations of these distributions. We arrive at this result by the use of time delay embedding theorems. Taking an information-theoretic perspective, we show that the log-likelihood has an intuitive interpretation in terms of information transfer.
Abstract:In this work, we are interested in structure learning for a set of spatially distributed dynamical systems, where individual subsystems are coupled via latent variables and observed through a filter. We represent this model as a directed acyclic graph (DAG) that characterises the unidirectional coupling between subsystems. Standard approaches to structure learning are not applicable in this framework due to the hidden variables, however we can exploit the properties of certain dynamical systems to formulate exact methods based on state space reconstruction. We approach the problem by using reconstruction theorems to analytically derive a tractable expression for the KL-divergence of a candidate DAG from the observed dataset. We show this measure can be decomposed as a function of two information-theoretic measures, transfer entropy and stochastic interaction. We then present two mathematically robust scoring functions based on transfer entropy and statistical independence tests. These results support the previously held conjecture that transfer entropy can be used to infer effective connectivity in complex networks.
Abstract:Physically-realistic simulated environments are powerful platforms for enabling measurable, replicable and statistically-robust investigation of complex robotic systems. Such environments are epitomised by the RoboCup simulation leagues, which have been successfully utilised to conduct massively-parallel experiments in topics including: optimisation of bipedal locomotion, self-localisation from noisy perception data and planning complex multi-agent strategies without direct agent-to-agent communication. Many of these systems are later transferred to physical robots, making the simulation leagues invaluable well-beyond the scope of simulated soccer matches. In this study, we provide an overview of the RoboCup simulation leagues and describe their properties as they pertain to replicable and robust robotics research. To demonstrate their utility directly, we leverage the ability to run parallelised experiments to evaluate different competition formats (e.g. round robin) for the RoboCup 2D simulation league. Our results demonstrate that a previously-proposed hybrid format minimises fluctuations from 'true' (statistically-significant) team performance rankings within the time constraints of the RoboCup world finals. Our experimental analysis would be impossible with physical robots alone, and we encourage other researchers to explore the potential for enriching their experimental pipelines with simulated components, both to minimise experimental costsand enable others to replicate and expand upon their results in a hardware-independent manner.
Abstract:Swarm dynamics is the study of collections of agents that interact with one another without central control. In natural systems, insects, birds, fish and other large mammals function in larger units to increase the overall fitness of the individuals. Their behavior is coordinated through local interactions to enhance mate selection, predator detection, migratory route identification and so forth [Andersson and Wallander 2003; Buhl et al. 2006; Nagy et al. 2010; Partridge 1982; Sumpter et al. 2008]. In artificial systems, swarms of autonomous agents can augment human activities such as search and rescue, and environmental monitoring by covering large areas with multiple nodes [Alami et al. 2007; Caruso et al. 2008; Ogren et al. 2004; Paley et al. 2007; Sibley et al. 2002]. In this paper, we explore the interplay between swarm dynamics, covert leadership and theoretical information transfer. A leader is a member of the swarm that acts upon information in addition to what is provided by local interactions. Depending upon the leadership model, leaders can use their external information either all the time or in response to local conditions [Couzin et al. 2005; Sun et al. 2013]. A covert leader is a leader that is treated no differently than others in the swarm, so leaders and followers participate equally in whatever interaction model is used [Rossi et al. 2007]. In this study, we use theoretical information transfer as a means of analyzing swarm interactions to explore whether or not it is possible to distinguish between followers and leaders based on interactions within the swarm. We find that covert leaders can be distinguished from followers in a swarm because they receive less transfer entropy than followers.
Abstract:The selection of an appropriate competition format is critical for both the success and credibility of any competition, both real and simulated. In this paper, the automated parallelism offered by the RoboCupSoccer 2D simulation league is leveraged to conduct a 28,000 game round-robin between the top 8 teams from RoboCup 2012 and 2013. A proposed new competition format is found to reduce variation from the resultant statistically significant team performance rankings by 75% and 67%, when compared to the actual competition results from RoboCup 2012 and 2013 respectively. These results are statistically validated by generating 10,000 random tournaments for each of the three considered formats and comparing the respective distributions of ranking discrepancy.
Abstract:The RoboCup 2D Simulation League incorporates several challenging features, setting a benchmark for Artificial Intelligence (AI). In this paper we describe some of the ideas and tools around the development of our team, Gliders2012. In our description, we focus on the evaluation function as one of our central mechanisms for action selection. We also point to a new framework for watching log files in a web browser that we release for use and further development by the RoboCup community. Finally, we also summarize results of the group and final matches we played during RoboCup 2012, with Gliders2012 finishing 4th out of 19 teams.
Abstract:Introduction to the Special Issue on Complex Networks, Artificial Life journal.