Abstract:The hippocampus has been associated with both spatial cognition and episodic memory formation, but integrating these functions into a unified framework remains challenging. Here, we demonstrate that forming discrete memories of visual events in sparse autoencoder neurons can produce spatial tuning similar to hippocampal place cells. We then show that the resulting very high-dimensional code enables neurons to discretize and tile the underlying image space with minimal overlap. Additionally, we extend our results to the auditory domain, showing that neurons similarly tile the frequency space in an experience-dependent manner. Lastly, we show that reinforcement learning agents can effectively perform various visuo-spatial cognitive tasks using these sparse, very high-dimensional representations.
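The idea of a sparse autoencoder trained on position-dependent inputs can be illustrated with a minimal sketch. The abstract does not state how sparsity is enforced, so the top-k rule below is an assumption. The synthetic 1-D "track" inputs, the layer sizes, and the names `top_k_sparsify` and `train_sparse_autoencoder` are likewise hypothetical and are not taken from the paper.

```python
import numpy as np

def top_k_sparsify(h, k):
    """Keep the k largest activations in each row of h and zero the rest."""
    idx = np.argpartition(h, -k, axis=1)[:, -k:]
    mask = np.zeros(h.shape, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=1)
    return np.where(mask, h, 0.0)

def train_sparse_autoencoder(X, n_hidden=32, k=4, lr=0.5, epochs=300, seed=0):
    """Plain gradient descent on the mean squared reconstruction error."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W_enc = rng.normal(0.0, 0.1, (n_in, n_hidden))
    W_dec = rng.normal(0.0, 0.1, (n_hidden, n_in))
    losses = []
    for _ in range(epochs):
        act = np.maximum(X @ W_enc, 0.0)      # ReLU hidden activations
        code = top_k_sparsify(act, k)         # assumed sparsity rule: top-k
        err = code @ W_dec - X
        losses.append(float(np.mean(err ** 2)))
        g_recon = 2.0 * err / err.size        # d(loss)/d(reconstruction)
        g_code = g_recon @ W_dec.T
        g_pre = g_code * (code > 0)           # gradient flows only through active units
        W_dec -= lr * (code.T @ g_recon)
        W_enc -= lr * (X.T @ g_pre)
    return W_enc, W_dec, losses

# Hypothetical stand-in for visual input: each position on a 1-D track
# produces a smooth bump pattern over 16 input channels.
positions = np.linspace(0.0, 1.0, 200)
centers = np.linspace(0.0, 1.0, 16)
X = np.exp(-((positions[:, None] - centers[None, :]) ** 2) / (2 * 0.1 ** 2))

W_enc, W_dec, losses = train_sparse_autoencoder(X)
code = top_k_sparsify(np.maximum(X @ W_enc, 0.0), 4)
```

Each row of `code` has at most four non-zero entries. Plotting one column of `code` against `positions` would be one way to inspect whether a unit responds to a restricted part of the track. The sketch makes no claim about reproducing the reported place-cell-like tuning.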
Abstract:State-of-the-art deep reinforcement learning algorithms are sample inefficient due to the large number of episodes they require to achieve asymptotic performance. Episodic Reinforcement Learning (ERL) algorithms, inspired by the mammalian hippocampus, typically use extended memory systems to bootstrap learning from past events to overcome this sample-inefficiency problem. However, such memory augmentations are often used as mere buffers, from which isolated past experiences are drawn to learn from in an offline fashion (e.g., replay). Here, we demonstrate that including a bias in the acquired memory content derived from the order of episodic sampling improves both the sample and memory efficiency of an episodic control algorithm. We test our Sequential Episodic Control (SEC) model in a foraging task to show that storing and using integrated episodes as event sequences leads to faster learning with fewer memory requirements than a standard ERL benchmark, Model-Free Episodic Control, which buffers isolated events only. We also study the effect of memory constraints and forgetting on the sequential and non-sequential versions of the SEC algorithm. Furthermore, we discuss how a hippocampal-like fast memory system could bootstrap slow cortical and subcortical learning subserving habit formation in the mammalian brain.
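The contrast between buffering isolated events and storing whole episodes as ordered sequences can be sketched as follows. This is not the SEC algorithm itself. The class name `SequentialEpisodicMemory`, the exact-match state lookup, the discounting by steps remaining to reward, and the first-in-first-out forgetting are all assumptions made for illustration.

```python
from collections import deque

class SequentialEpisodicMemory:
    """Toy memory that keeps rewarded episodes as ordered (state, action) sequences."""

    def __init__(self, capacity=50, discount=0.9):
        # Bounded memory: once full, the oldest episode is forgotten first.
        self.episodes = deque(maxlen=capacity)
        self.discount = discount

    def store(self, steps, reward):
        """steps: ordered list of (state, action); reward: scalar received at the end."""
        if reward > 0:
            self.episodes.append((list(steps), reward))

    def action_values(self, state):
        """Score each remembered action by reward, discounted by its distance to the reward."""
        values = {}
        for steps, reward in self.episodes:
            last = len(steps) - 1
            for i, (s, a) in enumerate(steps):
                if s == state:
                    values[a] = values.get(a, 0.0) + reward * self.discount ** (last - i)
        return values

    def choose(self, state, default=None):
        values = self.action_values(state)
        return max(values, key=values.get) if values else default

memory = SequentialEpisodicMemory()
# A short rewarded path and a longer one that start from the same state.
memory.store([((0, 0), "right"), ((0, 1), "right"), ((0, 2), "up")], reward=1.0)
memory.store([((0, 0), "up"), ((1, 0), "up"), ((2, 0), "right"),
              ((2, 1), "right"), ((2, 2), "down")], reward=1.0)
best = memory.choose((0, 0))
```

Because each stored step keeps its position within the episode, the memory can prefer the action that led to reward in fewer steps. A buffer of isolated events does not carry that ordering information.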
Abstract:The sample-inefficiency problem in Artificial Intelligence refers to the inability of current Deep Reinforcement Learning models to optimize action policies within a small number of episodes. Recent studies have tried to overcome this limitation by adding memory systems and architectural biases to improve learning speed, such as in Episodic Reinforcement Learning. However, despite achieving incremental improvements, their performance is still not comparable to how humans learn behavioral policies. In this paper, we capitalize on the design principles of the Distributed Adaptive Control (DAC) theory of mind and brain to build a novel cognitive architecture (DAC-ML) that, by incorporating a hippocampus-inspired sequential memory system, can rapidly converge to effective action policies that maximize reward acquisition in a challenging foraging task.
Abstract:This paper introduces a cognitive architecture for a humanoid robot to engage in a proactive, mixed-initiative exploration and manipulation of its environment, where the initiative can originate from both the human and the robot. The framework, based on a biologically-grounded theory of the brain and mind, integrates a reactive interaction engine, a number of state-of-the-art perceptual and motor learning algorithms, as well as planning abilities and an autobiographical memory. The architecture as a whole drives the robot behavior to solve the symbol grounding problem, acquire language capabilities, execute goal-oriented behavior, and express a verbal narrative of its own experience in the world. We validate our approach in human-robot interaction experiments with the iCub humanoid robot, showing that the proposed cognitive architecture can be applied in real time within a realistic scenario and that it can be used with naive users.
Abstract:In this paper, we argue that the future of Artificial Intelligence research resides in two keywords: integration and embodiment. We support this claim by analyzing the recent advances of the field. Regarding integration, we note that the most impactful recent contributions have been made possible through the integration of recent Machine Learning methods (based in particular on Deep Learning and Recurrent Neural Networks) with more traditional ones (e.g. Monte-Carlo tree search, goal babbling exploration or addressable memory systems). Regarding embodiment, we note that the traditional benchmark tasks (e.g. visual classification or board games) are becoming obsolete as state-of-the-art learning algorithms approach or even surpass human performance in most of them, which has recently encouraged the development of first-person 3D game platforms embedding realistic physics. Building upon this analysis, we first propose an embodied cognitive architecture integrating heterogeneous sub-fields of Artificial Intelligence into a unified framework. We demonstrate the utility of our approach by showing how major contributions of the field can be expressed within the proposed framework. We then claim that benchmarking environments need to reproduce ecologically-valid conditions for bootstrapping the acquisition of increasingly complex cognitive skills through the concept of a cognitive arms race between embodied agents.
Abstract:What does the informational complexity of dynamical networked systems tell us about intrinsic mechanisms and functions of these complex systems? Recent complexity measures such as integrated information have sought to operationalize this problem taking a whole-versus-parts perspective, wherein one explicitly computes the amount of information generated by a network as a whole over and above that generated by the sum of its parts during state transitions. While several numerical schemes for estimating network integrated information exist, it is instructive to pursue an analytic approach that computes integrated information as a function of network weights. Our formulation of integrated information uses a Kullback-Leibler divergence between the multivariate distribution on the set of network states and the corresponding factorized distribution over its parts. Implementing stochastic Gaussian dynamics, we perform computations for several prototypical network topologies. Our findings show increased informational complexity near criticality, which remains consistent across network topologies. Spectral decomposition of the system's dynamics reveals how informational complexity is governed by eigenmodes of both the network's covariance and adjacency matrices. We find that as the dynamics of the system approach criticality, high integrated information is exclusively driven by the eigenmode corresponding to the leading eigenvalue of the covariance matrix, while sub-leading modes get suppressed. The implication of this result is that it might be favorable for complex dynamical networked systems such as the human brain or communication systems to operate near criticality so that efficient information integration might be achieved.
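A whole-versus-parts Kullback-Leibler divergence for Gaussian dynamics can be illustrated with a simplified sketch. It assumes linear dynamics x(t+1) = A x(t) + noise with isotropic Gaussian noise, and it treats every node as its own part. It computes the divergence between the stationary joint distribution and the product of its single-node marginals. This is a simpler stand-in than the transition-based integrated information described in the abstract. The function names and the fully connected test network are hypothetical.

```python
import numpy as np

def stationary_covariance(A, noise_var=1.0):
    """Solve Sigma = A Sigma A^T + noise_var * I for a stable linear system."""
    n = A.shape[0]
    if np.max(np.abs(np.linalg.eigvals(A))) >= 1.0:
        raise ValueError("spectral radius must be below 1 for a stationary state")
    Q = noise_var * np.eye(n)
    # Vectorised discrete Lyapunov equation: (I - A kron A) vec(Sigma) = vec(Q)
    vec_sigma = np.linalg.solve(np.eye(n * n) - np.kron(A, A), Q.reshape(-1))
    return vec_sigma.reshape(n, n)

def gaussian_kl_to_factorized(sigma):
    """KL( N(0, Sigma) || product of its one-dimensional marginals )."""
    _, logdet = np.linalg.slogdet(sigma)
    return 0.5 * (np.sum(np.log(np.diag(sigma))) - logdet)

# Assumed test network: fully connected, symmetric, leading eigenvalue 1,
# so the coupling g equals the spectral radius of A = g * W.
n = 4
W = (np.ones((n, n)) - np.eye(n)) / (n - 1)
couplings = [0.0, 0.5, 0.9, 0.99]
kl_values = [gaussian_kl_to_factorized(stationary_covariance(g * W))
             for g in couplings]
```

In this toy setting the divergence is zero for the uncoupled network and grows as g approaches 1, where the leading eigenvalue of the covariance matrix diverges. That is qualitatively in line with the increase near criticality described above, but the sketch is not a reproduction of the paper's analysis.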