Abstract: This paper presents the Sensorimotor Transformer (SMT), a vision model that, inspired by human saccadic eye movements, prioritizes high-saliency regions of the visual input to improve computational efficiency and reduce memory consumption. Unlike traditional models that process all image patches uniformly, SMT selects the most salient patches based on intrinsic two-dimensional (i2D) features, such as corners and occlusions, which are known to carry high information content and to align with human fixation patterns. Following this biological principle, SMT passes only the most informative patches to a vision transformer, so that memory usage scales with the sequence length of the selected patches and is substantially reduced. This approach is consistent with findings in visual neuroscience suggesting that the human visual system optimizes information gathering through selective, spatially dynamic focus. Experimental evaluations on ImageNet-1k demonstrate that SMT achieves competitive top-1 accuracy while significantly reducing memory consumption and computational complexity, particularly when only a limited number of patches is used. This work introduces a saccade-like selection mechanism into transformer-based vision models, offering an efficient alternative for image analysis and providing new insights into biologically motivated architectures for resource-constrained applications.
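The patch selection described in the abstract above can be illustrated with a minimal sketch. It is not the SMT implementation: as a stand-in for the i2D feature measure it assumes the determinant of a structure tensor summed over each patch, which is zero for flat regions and straight edges and positive where gradients in two directions coexist (for example at corners). The function names `patch_i2d_scores` and `select_top_k` and the toy image are hypothetical.

```python
def patch_i2d_scores(image, patch):
    """Score non-overlapping patches by the determinant of their summed
    structure tensor (an assumed proxy for i2D content, not SMT's measure)."""
    h, w = len(image), len(image[0])
    scores = []
    for top in range(0, h - patch + 1, patch):
        for left in range(0, w - patch + 1, patch):
            sxx = syy = sxy = 0.0
            for y in range(top, top + patch):
                for x in range(left, left + patch):
                    # Central differences, clamped at the image border.
                    gx = (image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]) / 2.0
                    gy = (image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]) / 2.0
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            scores.append(sxx * syy - sxy * sxy)
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring patches, in raster order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Toy 8x8 image: a bright region whose corner lies inside the top-left patch.
image = [[1.0 if (y >= 2 and x >= 2) else 0.0 for x in range(8)] for y in range(8)]
scores = patch_i2d_scores(image, patch=4)
selected = select_top_k(scores, k=1)
```

In this toy case only the patch containing the corner gets a positive score; the patches that contain a single straight edge or a uniform region score zero. Only the selected patch indices would then be passed on as the transformer's input sequence.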
Abstract: In everyday life, we perform tasks (e.g., cooking or cleaning) that involve a large variety of objects and goals. When confronted with an unexpected or unwanted outcome, we take corrective actions and try again until we achieve the desired result. The reasoning performed to identify a cause of the observed outcome and to select an appropriate corrective action is crucial for successful task execution. Central to this reasoning is the assumption that a factor is responsible for producing the observed outcome. In this paper, we investigate the use of probabilistic actual causation to determine whether a factor is the cause of an observed undesired outcome. Furthermore, we show how the actual causation probabilities can be used to find alternative actions that change the outcome. We apply the probabilistic actual causation analysis to a robot pouring task. When spillage occurs, the analysis indicates whether a task parameter is the cause and how it should be changed to avoid spillage. The analysis requires a causal graph of the task and the corresponding conditional probability distributions. To fulfill these requirements, we perform a complete causal modeling procedure (i.e., task analysis, definition of variables, determination of the causal graph structure, and estimation of conditional probability distributions) using data from a realistic simulation of the robot pouring task, covering a large combinatorial space of task parameters. Based on the results, we discuss the implications of the variables' representation and how the alternative actions suggested by the actual causation analysis compare to the alternative solutions proposed by a human observer. Finally, we demonstrate the practical use of probabilistic actual causation analysis for selecting alternative action parameters.
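The idea of using causation probabilities to pick a corrective action can be illustrated with a deliberately simplified sketch. It does not implement the paper's probabilistic actual causation analysis; it only performs a probability-raising comparison on a toy conditional probability table with one task parameter and one context variable. The variable names (`pour_height`, `fullness`) and all probability values are made up for illustration and are not results from the paper.

```python
# Hypothetical table: P(spill | pour_height, fullness). Values are illustrative only.
P_SPILL = {
    ("high", "full"): 0.8,
    ("high", "half"): 0.3,
    ("low", "full"): 0.4,
    ("low", "half"): 0.1,
}


def raises_spill_probability(actual, context, alternatives):
    """True if the actual parameter value makes spilling more likely than
    at least one alternative value in the same context."""
    p_actual = P_SPILL[(actual, context)]
    return any(
        p_actual > P_SPILL[(alt, context)]
        for alt in alternatives
        if alt != actual
    )


def best_alternative(actual, context, alternatives):
    """Return the alternative value with the lowest spill probability,
    or None if no alternative improves on the actual value."""
    p_actual = P_SPILL[(actual, context)]
    candidates = [a for a in alternatives if a != actual]
    if not candidates:
        return None
    best = min(candidates, key=lambda a: P_SPILL[(a, context)])
    return best if P_SPILL[(best, context)] < p_actual else None


# Spillage was observed after pouring from a high position into a full container.
is_candidate_cause = raises_spill_probability("high", "full", ["high", "low"])
suggestion = best_alternative("high", "full", ["high", "low"])
```

Here the high pouring position raises the spill probability relative to the low one in the observed context, so the sketch flags it as a candidate cause and suggests the low position instead.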
Abstract: A control strategy for expert systems is presented that is based on Shafer's belief theory and Dempster's rule of combination. In contrast to well-known strategies, it is not sequential and hypothesis-driven, but parallel and self-organizing, and is governed by the concept of information gain. The information gain, calculated as the maximal difference between the current evidence distribution in the knowledge base and the potential evidence, determines each consultation step. Hierarchically structured knowledge is an important form of representation, and experts even use several hierarchies in parallel to organize their knowledge. Hence, the control strategy is applied to a layered set of distinct hierarchies. Depending on the current data, the control strategy chooses one of these hierarchies for the next step in the reasoning process. If the current data are well matched to the structure of one hierarchy, this hierarchy remains selected for a longer part of the consultation. If no good match can be achieved, the strategy switches from the current hierarchy to a competing one, much like the phenomenon of restructuring in problem-solving tasks. So far, the control strategy is restricted to multi-hierarchical knowledge bases with disjoint hierarchies. It is implemented in the expert system IBIG (inference by information gain), which is currently being applied to acquired speech disorders (aphasia).
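For readers unfamiliar with Dempster's rule of combination mentioned above, the following minimal sketch combines two basic mass assignments over the same frame of discernment. It shows only the standard rule, not the IBIG control strategy or its information-gain computation. The function name `dempster_combine` and the hypothesis labels `a` and `b` are assumptions made for illustration.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset of hypotheses -> mass)
    with Dempster's rule: multiply masses of intersecting focal sets and
    renormalize by the non-conflicting mass."""
    combined = {}
    conflict = 0.0
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            intersection = b & c
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_b * mass_c
            else:
                # Mass assigned to the empty set counts as conflict.
                conflict += mass_b * mass_c
    if 1.0 - conflict < 1e-12:
        raise ValueError("sources are in total conflict; combination is undefined")
    return {a: mass / (1.0 - conflict) for a, mass in combined.items()}


A = frozenset({"a"})
B = frozenset({"b"})
AB = frozenset({"a", "b"})

# Two independent pieces of evidence over the frame {a, b}.
m1 = {A: 0.6, AB: 0.4}
m2 = {B: 0.5, AB: 0.5}
m12 = dempster_combine(m1, m2)
```

In this example the conflicting mass is 0.6 × 0.5 = 0.3, so the remaining products are divided by 0.7, giving masses of 3/7 for {a}, 2/7 for {b}, and 2/7 for {a, b}.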