Abstract:To handle unintended changes in the environment by agents, we propose an environment-centric active inference EC-AIF in which the Markov Blanket of active inference is defined starting from the environment. In normal active inference, the Markov Blanket is defined starting from the agent. That is, first the agent was defined as the entity that performs the "action" such as a robot or a person, then the environment was defined as other people or objects that are directly affected by the agent's "action," and the boundary between the agent and the environment was defined as the Markov Blanket. This agent-centric definition does not allow the agent to respond to unintended changes in the environment caused by factors outside of the defined environment. In the proposed EC-AIF, there is no entity corresponding to an agent. The environment includes all observable things, including people and things conventionally considered to be the environment, as well as entities that perform "actions" such as robots and people. Accordingly, all states, including robots and people, are included in inference targets, eliminating unintended changes in the environment. The EC-AIF was applied to a robot arm and validated with an object transport task by the robot arm. The results showed that the robot arm successfully transported objects while responding to changes in the target position of the object and to changes in the orientation of another robot arm.
Abstract:Self-report measures (e.g., Likert scales) are widely used to evaluate subjective health perceptions. Recently, the visual analog scale (VAS), a slider-based scale, has become popular owing to its ability to precisely and easily assess how people feel. These data can be influenced by the response style (RS), a user-dependent systematic tendency that occurs regardless of questionnaire instructions. Despite its importance, especially in between-individual analysis, little attention has been paid to handling the RS in the VAS (denoted as response profile (RP)), as it is mainly used for within-individual monitoring and is less affected by RP. However, VAS measurements often require repeated self-reports of the same questionnaire items, making it difficult to apply conventional methods on a Likert scale. In this study, we developed a novel RP characterization method for various types of repeatedly measured VAS data. This approach involves the modeling of RP as distributional parameters ${\theta}$ through a mixture of RS-like distributions, and addressing the issue of unbalanced data through bootstrap sampling for treating repeated measures. We assessed the effectiveness of the proposed method using simulated pseudo-data and an actual dataset from an empirical study. The assessment of parameter recovery showed that our method accurately estimated the RP parameter ${\theta}$, demonstrating its robustness. Moreover, applying our method to an actual VAS dataset revealed the presence of individual RP heterogeneity, even in repeated VAS measurements, similar to the findings of the Likert scale. Our proposed method enables RP heterogeneity-aware VAS data analysis, similar to Likert-scale data analysis.
Abstract:Wealth distribution is a complex and critical aspect of any society. Information exchange is considered to have played a role in shaping wealth distribution patterns, but the specific dynamic mechanism is still unclear. In this research, we used simulation-based methods to investigate the impact of different modes of information exchange on wealth distribution. We compared different combinations of information exchange strategies and moving strategies, analyzed their impact on wealth distribution using classic wealth distribution indicators such as the Gini coefficient. Our findings suggest that information exchange strategies have significant impact on wealth distribution and that promoting more equitable access to information and resources is crucial in building a just and equitable society for all.
Abstract:We present Mask Atari, a new benchmark to help solve partially observable Markov decision process (POMDP) problems with Deep Reinforcement Learning (DRL)-based approaches. To achieve a simulation environment for the POMDP problems, Mask Atari is constructed based on Atari 2600 games with controllable, moveable, and learnable masks as the observation area for the target agent, especially with the active information gathering (AIG) setting in POMDPs. Given that one does not yet exist, Mask Atari provides a challenging, efficient benchmark for evaluating the methods that focus on the above problem. Moreover, the mask operation is a trial for introducing the receptive field in the human vision system into a simulation environment for an agent, which means the evaluations are not biased from the sensing ability and purely focus on the cognitive performance of the methods when compared with the human baseline. We describe the challenges and features of our benchmark and evaluate several baselines with Mask Atari.
Abstract:We propose an embodied system based on the free energy principle (FEP) for sensorimotor visual perception. We evaluated it in a character-recognition task using the MNIST dataset. Although the FEP has successfully described a rule that living things obey mathematically and claims that a biological system continues to change its internal models and behaviors to minimize the difference in predicting sensory input, it is not enough to model sensorimotor visual perception. An embodiment of the system is the key to achieving sensorimotor visual perception. The proposed embodied system is configured by a body and memory. The body has an ocular motor system controlling the direction of eye gaze, which means that the eye can only observe a small focused area of the environment. The memory is not photographic, but is a generative model implemented with a variational autoencoder that contains prior knowledge about the environment, and that knowledge is classified. By limiting body and memory abilities and operating according to the FEP, the embodied system repeatedly takes action to obtain the next sensory input based on various potentials of future sensory inputs. In the evaluation, the inference of the environment was represented as an approximate posterior distribution of characters (0 - 9). As the number of repetitions increased, the attention area moved continuously, gradually reducing the uncertainty of characters. Finally, the probability of the correct character became the highest among the characters. Changing the initial attention position provides a different final distribution, suggesting that the proposed system has a confirmation bias.