Abstract: Despite growing interest in robot control that exploits the computation of biological neurons, context-dependent behavior by neuron-connected robots remains a challenge. Context-dependent behavior is defined here as behavior that does not result from a simple sensory-motor coupling but is instead based on an understanding of the task goal. This paper proposes design principles for training neuron-connected robots on task goals to achieve context-dependent behavior. First, we employ deep reinforcement learning (RL) to enable training that accounts for goal achievement. Second, we propose a neuron simulator formulated as a probability distribution based on recorded neural data, aiming to represent physiologically valid neural dynamics while avoiding complex modeling with high computational costs. Furthermore, we propose updating the simulators during training to bridge the gap between the simulated and real settings. The experiments showed that the robot gradually learned context-dependent behaviors in pole balancing and robot navigation tasks. Moreover, the learned policies remained valid for neural simulators based on novel neural data, and task performance increased when the simulators were updated during training. These results suggest the effectiveness of the proposed design principles for the context-dependent behavior of neuron-connected robots.
Abstract: Living organisms must actively maintain themselves in order to continue existing. Autopoiesis is a key concept in the study of living organisms, in which the boundary of the organism is not static but dynamically regulated by the system itself. To study the autonomous regulation of the self-boundary, we focus on neural homeodynamic responses to environmental changes using both biological and artificial neural networks. Previous studies showed that embodied cultured neural networks and spiking neural networks with spike-timing-dependent plasticity (STDP) learn an action when that action lets them avoid external stimulation. In this paper, based on our experiments using embodied cultured neurons, we find a second property that allows the network to avoid stimulation: if the agent cannot learn an action to avoid the external stimuli, it tends to decrease the stimulus-evoked spikes, as if to ignore the uncontrollable input. We also show that this behavior is reproduced by spiking neural networks with asymmetric STDP. We interpret these properties as autonomous regulation of self and non-self by the network, in which a controllable neuron is regarded as self and an uncontrollable neuron is regarded as non-self. Finally, we introduce neural autopoiesis by proposing the principle of stimulus avoidance.
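The abstract above does not specify the STDP rule, so the following is only a minimal sketch of one common reading of "asymmetric STDP": a pair-based exponential window in which a presynaptic spike preceding a postsynaptic spike potentiates the synapse and the reverse order depresses it. The function name `stdp_dw` and all parameter values (`a_plus`, `a_minus`, `tau_plus`, `tau_minus`) are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def stdp_dw(dt, a_plus=1.0, a_minus=1.0, tau_plus=20.0, tau_minus=20.0):
    """Weight change for a spike pair with dt = t_post - t_pre (ms).

    Hypothetical pair-based window (assumed, not taken from the paper):
      dt > 0 (pre before post): potentiation,  a_plus  * exp(-dt / tau_plus)
      dt < 0 (post before pre): depression,   -a_minus * exp( dt / tau_minus)
      dt == 0: no change
    """
    dt = np.asarray(dt, dtype=float)
    ltp = a_plus * np.exp(-dt / tau_plus)      # used only where dt > 0
    ltd = -a_minus * np.exp(dt / tau_minus)    # used only where dt < 0
    return np.where(dt > 0, ltp, np.where(dt < 0, ltd, 0.0))

# Evaluate the window on a small grid of spike-time differences (ms).
dts = np.array([-40.0, -10.0, 0.0, 10.0, 40.0])
dws = stdp_dw(dts)
```

Choosing `a_plus != a_minus` or `tau_plus != tau_minus` makes the potentiation and depression branches unequal in size as well as opposite in sign.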
Abstract: The emulation task of a nonlinear autoregressive moving average model, i.e., the NARMA10 task, has been widely used as a benchmark task for recurrent neural networks, especially in reservoir computing. However, the type and quantity of computational capabilities required to emulate the NARMA10 model remain unclear, and, to date, the NARMA10 task has been utilized blindly. Therefore, in this study, we investigated the properties of the NARMA10 model from a dynamical systems perspective. We revealed its bifurcation structure and basin of attraction, as well as the system's Lyapunov spectra. Furthermore, we analyzed the computational capabilities required to emulate the NARMA10 model by decomposing it into multiple combinations of orthogonal nonlinear polynomials using Legendre polynomials, and we directly evaluated its information processing capacity together with its dependence on several system parameters. The results demonstrate that the NARMA10 model contains an unstable region in the phase space that makes the system diverge depending on the selection of the input range and initial conditions. Furthermore, the information processing capacity of the model varies according to the input range. These properties prevent safe application of this model and fair comparisons among experiments, which is unfavorable for a benchmark task. We therefore propose a benchmark model that can clearly evaluate equivalent computational capacity using NARMA10. Compared to the original NARMA10 model, the proposed model is highly stable and robust against the input range settings.
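The abstract does not write out the model, so the sketch below uses the form of NARMA10 commonly found in the reservoir computing literature, y(t+1) = 0.3 y(t) + 0.05 y(t) Σ_{i=0}^{9} y(t−i) + 1.5 u(t−9) u(t) + 0.1, as an assumption. The function name `narma10` and the constant test inputs are illustrative choices. It is meant only to show how the input range can decide between a bounded and a diverging output, not to reproduce the paper's analysis or its proposed benchmark model.

```python
import numpy as np

def narma10(u, alpha=0.3, beta=0.05, gamma=1.5, delta=0.1):
    """Commonly used NARMA10 recursion (assumed form, zero initial state).

    y[t+1] = alpha*y[t] + beta*y[t]*sum(y[t-9..t]) + gamma*u[t-9]*u[t] + delta
    """
    u = np.asarray(u, dtype=float)
    y = np.zeros(len(u))
    for t in range(9, len(u) - 1):
        y[t + 1] = (alpha * y[t]
                    + beta * y[t] * np.sum(y[t - 9:t + 1])  # last 10 outputs
                    + gamma * u[t - 9] * u[t]
                    + delta)
    return y

# Constant zero input: the recursion settles to a fixed point.
y_zero = narma10(np.zeros(200))

# Constant input of 1.0: the quadratic feedback term has no fixed point to
# settle on, so the output grows without bound (overflow warnings silenced).
with np.errstate(over="ignore", invalid="ignore"):
    y_one = narma10(np.ones(200))
```

For constant zero input the fixed point solves 0.5 y² − 0.7 y + 0.1 = 0, giving y = 0.7 − √0.29. For constant input 1.0 the corresponding equation 0.5 y² − 0.7 y + 1.6 = 0 has no real root, which is why the second run diverges.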