Abstract:It is prevalent in contemporary AI and robotics to separately postulate a brain modeled by neural networks and employ it to learn intelligent and adaptive behavior. While this method has worked very well for many types of tasks, it is not the only type of intelligence that exists in nature. In this work, we study the ways in which intelligent behavior can be created without a separate and explicit brain for robot control, but rather solely as a result of the computation occurring within the physical body of a robot. Specifically, we show that adaptive and complex behavior can be created in voxel-based virtual soft robots by using simple reactive materials that actively change the shape of the robot, and thus its behavior, under different environmental cues. We demonstrate a proof of concept for the idea of closed-loop morphological computation, and show that, in our implementation, it enables behavior mimicking logic gates, which can be combined to build up more complex collective behaviors.
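To make the logic-gate idea concrete, here is a toy illustration (not the paper's voxel simulator) of how shape change driven by environmental cues can mimic an AND gate; the expansion rule, threshold, and motion criterion below are all illustrative assumptions.

```python
# Toy illustration (not the paper's simulator): a reactive material swells in
# response to two environmental cues, and the robot's gait only produces
# forward motion once the resulting shape crosses a threshold -- so the
# behavior acts like an AND gate over the cues. All numbers are assumptions.

def reactive_expansion(cue_a: float, cue_b: float) -> float:
    # each cue independently swells the material by a bounded amount
    return min(cue_a, 1.0) + min(cue_b, 1.0)

def moves_forward(cue_a: float, cue_b: float, threshold: float = 1.5) -> bool:
    # behavior follows from body shape rather than from an explicit brain
    return reactive_expansion(cue_a, cue_b) > threshold

assert moves_forward(1.0, 1.0)
assert not moves_forward(1.0, 0.0) and not moves_forward(0.0, 0.0)
```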
Abstract:Finding controllers that perform well across multiple morphologies is an important milestone for large-scale robotics, in line with recent advances via foundation models in other areas of machine learning. However, the challenges of learning a single controller for multiple morphologies make the `one robot one task' paradigm dominant in the field. To alleviate these challenges, we present a pipeline that: (1) leverages Quality Diversity algorithms like MAP-Elites to create a dataset of many single-task/single-morphology teacher controllers, then (2) distills those diverse controllers into a single multi-morphology controller that performs well across many different body plans by mimicking the sensory-action patterns of the teacher controllers via supervised learning. The distilled controller scales well with the number of teachers/morphologies and shows emergent properties. It generalizes to unseen morphologies in a zero-shot manner, providing robustness to morphological perturbations and instant damage recovery. Lastly, the distilled controller is also independent of the teacher controllers -- we can distill the teachers' knowledge into any controller model, making our approach synergistic with architectural improvements and existing training algorithms for teacher controllers.
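A minimal sketch (PyTorch) of the distillation step described above, assuming a logged dataset of (observation, action) pairs from the teacher controllers already exists; the network sizes, optimizer settings, and names such as `teacher_dataset` are illustrative, not the paper's.

```python
import torch
import torch.nn as nn

class StudentController(nn.Module):
    """One controller intended to be shared across many morphologies."""
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, act_dim),
        )
    def forward(self, obs):
        return self.net(obs)

def distill(teacher_dataset, obs_dim, act_dim, epochs=10):
    # teacher_dataset: iterable of (obs, action) tensor batches logged from
    # the diverse single-morphology teacher controllers (an assumption here)
    student = StudentController(obs_dim, act_dim)
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for obs, act in teacher_dataset:
            opt.zero_grad()
            loss = loss_fn(student(obs), act)  # imitate teacher actions
            loss.backward()
            opt.step()
    return student
```

Because the student only sees sensory-action pairs, the same procedure can target any controller model, which is what makes the distillation independent of the teacher controllers.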
Abstract:Evolving virtual creatures is a field with a rich history that has recently been attracting renewed attention, especially in the soft robotics domain. The compliance of soft materials endows soft robots with complex behavior, but it also makes their design process unintuitive and in need of automated design. Despite the great interest, evolved virtual soft robots still lack complexity, and co-optimization of morphology and control remains a challenging problem. Prior work identifies and investigates a major issue with the co-optimization process -- fragile co-adaptation of brain and body resulting in premature convergence of morphology. In this work, we expand the investigation of this phenomenon by comparing learnable controllers with proprioceptive observations against fixed controllers without any observations, where, in the latter case, only the morphology is optimized. Our experiments in two morphology spaces and two environments of varying complexity show concrete examples of high-performing regions of the morphology space that cannot be discovered during co-optimization of morphology and control, yet are easily found when optimizing morphologies alone. This work thus clearly demonstrates and characterizes the challenges of optimizing morphology during co-optimization. Based on these results, we propose a new body-centric framework for thinking about the co-optimization problem, which helps us understand the issue from a search perspective. We hope the insights we share in this work attract more attention to the problem and help enable efficient brain-body co-optimization.
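For concreteness, a minimal sketch of the two controller settings being compared: a fixed, observation-free (open-loop) controller, where fitness differences come from the morphology alone, versus a learnable closed-loop controller over proprioceptive observations; the sinusoidal form and network sizes are assumptions, not the paper's exact setup.

```python
import math
import torch
import torch.nn as nn

def fixed_open_loop(t: float, phase_offsets):
    # no observations: actuation depends only on time, so any fitness
    # difference between robots is attributable to the morphology alone
    return [math.sin(2 * math.pi * t + p) for p in phase_offsets]

class LearnableClosedLoop(nn.Module):
    # proprioceptive observations feed a small network whose weights are
    # co-optimized with the body plan (the fragile co-adaptation setting)
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 32), nn.Tanh(),
                                 nn.Linear(32, act_dim), nn.Tanh())
    def forward(self, obs):
        return self.net(obs)
```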
Abstract:This work identifies a simple pre-training mechanism that leads to representations exhibiting better continual and transfer learning. This mechanism -- the repeated resetting of weights in the last layer, which we nickname "zapping" -- was originally designed for a meta-continual-learning procedure, yet we show it is surprisingly applicable in many settings beyond both meta-learning and continual learning. In our experiments, we wish to transfer a pre-trained image classifier to a new set of classes, in a few shots. We show that our zapping procedure results in improved transfer accuracy and/or more rapid adaptation in both standard fine-tuning and continual learning settings, while being simple to implement and computationally efficient. In many cases, we achieve performance on par with state-of-the-art meta-learning without needing the expensive higher-order gradients, by using a combination of zapping and sequential learning. An intuitive explanation for the effectiveness of this zapping procedure is that representations trained with repeated zapping learn features that are capable of rapidly adapting to newly initialized classifiers. Such an approach may be considered a computationally cheaper form of, or alternative to, meta-learning rapidly adaptable features with higher-order gradients. This adds to recent work on the usefulness of resetting neural network parameters during training, and invites further investigation of this mechanism.
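A minimal sketch (PyTorch) of the zapping mechanism as described: the last-layer (classifier) weights are repeatedly re-initialized during pre-training; the reset interval, initializer, and optimizer settings are illustrative assumptions rather than the paper's exact schedule.

```python
import torch
import torch.nn as nn

def zap_(head: nn.Linear):
    """Reset the classifier head in place."""
    nn.init.kaiming_normal_(head.weight)
    nn.init.zeros_(head.bias)

def pretrain_with_zapping(backbone, head, loader, steps, zap_every=500):
    params = list(backbone.parameters()) + list(head.parameters())
    opt = torch.optim.SGD(params, lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    it = iter(loader)
    for step in range(steps):
        if step % zap_every == 0:
            zap_(head)  # features must repeatedly re-adapt to a fresh classifier
        try:
            x, y = next(it)
        except StopIteration:
            it = iter(loader)
            x, y = next(it)
        opt.zero_grad()
        loss = loss_fn(head(backbone(x)), y)
        loss.backward()
        opt.step()
    return backbone, head
```

The intuition in the abstract corresponds to the `zap_` call: because the head is repeatedly thrown away, the backbone is pushed toward features that work well with newly initialized classifiers.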
Abstract:Real-world problems often comprise many objectives and require solutions that carefully trade off between them. Current approaches to many-objective optimization often require challenging assumptions, like knowledge of the importance/difficulty of objectives in a weighted-sum single-objective paradigm, or enormous populations to overcome the curse of dimensionality in multi-objective Pareto optimization. Combining elements from Many-Objective Evolutionary Algorithms and Quality Diversity algorithms like MAP-Elites, we propose Many-objective Optimization via Voting for Elites (MOVE). MOVE maintains a map of elites that perform well on different subsets of the objective functions. On a 14-objective image-neuroevolution problem, we demonstrate that MOVE is viable with a population of as few as 50 elites and outperforms a naive single-objective baseline. We find that the algorithm's performance relies on solutions jumping across bins (for a parent to produce a child that is elite for a different subset of objectives). We suggest that this type of goal-switching is an implicit method for automatically identifying stepping stones, or a form of curriculum learning. We comment on the similarities and differences between MOVE and MAP-Elites, hoping to provide insight that aids the understanding of that approach -- and suggest future work that may inform this approach's use for many-objective problems in general.
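One plausible reading of the archive update implied by the abstract, sketched below: each bin corresponds to a subset of the objectives, and a candidate displaces a bin's incumbent elite if it wins a majority vote over that subset. Bin construction, the majority rule, and tie handling are assumptions for illustration only.

```python
def update_archive(archive, bins, candidate, objectives):
    """archive: dict bin_id -> (solution, scores).
    bins: dict bin_id -> list of objective indices (the bin's subset).
    objectives: list of callables, higher is better (an assumption)."""
    cand_scores = [f(candidate) for f in objectives]
    for bin_id, obj_subset in bins.items():
        incumbent = archive.get(bin_id)
        if incumbent is None:
            archive[bin_id] = (candidate, cand_scores)
            continue
        _, inc_scores = incumbent
        # count how many of this bin's objectives the candidate wins
        votes = sum(cand_scores[i] > inc_scores[i] for i in obj_subset)
        if votes > len(obj_subset) / 2:  # majority vote among the bin's objectives
            archive[bin_id] = (candidate, cand_scores)
    return archive
```

Under this reading, "jumping across bins" corresponds to a child winning the vote in a bin whose objective subset differs from the one its parent occupied.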
Abstract:Soft robotics is a rapidly growing area of robotics research that would benefit greatly from design automation, given the challenges of manually engineering complex, compliant, and generally non-intuitive robot body plans and behaviors. It has been suggested that a major hurdle currently limiting soft robot brain-body co-optimization is the fragile specialization between a robot's controller and the particular body plan it controls, resulting in premature convergence. Here we posit that modular controllers are more robust to changes to a robot's body plan. We demonstrate that soft robots with modular controllers suffer a smaller drop in locomotion performance after morphological mutations than those with comparable global controllers -- leading to fitter offspring. Moreover, we show that the increased transferability of modular controllers to similar body plans enables more effective brain-body co-optimization of soft robots, resulting in an increased rate of positive morphological mutations and higher overall performance of evolved robots. We hope that this work helps provide specific methods to improve soft robot design automation in this particular setting, while also providing evidence to support our understanding of the challenges of brain-body co-optimization more generally.
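A minimal sketch contrasting the two controller styles: a global controller whose input/output sizes are tied to one specific body plan, versus a modular controller that applies the same small network at every voxel using only local observations, so morphological mutations do not invalidate its weights. Dimensions and observation contents are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GlobalController(nn.Module):
    def __init__(self, n_voxels, obs_per_voxel):
        super().__init__()
        # input/output sizes are tied to one particular morphology
        self.net = nn.Sequential(
            nn.Linear(n_voxels * obs_per_voxel, 64), nn.Tanh(),
            nn.Linear(64, n_voxels),
        )
    def forward(self, full_obs):           # shape: (n_voxels * obs_per_voxel,)
        return self.net(full_obs)          # one actuation value per voxel

class ModularController(nn.Module):
    def __init__(self, obs_per_voxel):
        super().__init__()
        # the same small module is reused at every voxel, so adding, removing,
        # or rearranging voxels leaves the weights usable
        self.module = nn.Sequential(
            nn.Linear(obs_per_voxel, 16), nn.Tanh(),
            nn.Linear(16, 1),
        )
    def forward(self, per_voxel_obs):      # shape: (n_voxels, obs_per_voxel)
        return self.module(per_voxel_obs).squeeze(-1)
```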
Abstract:Existing machines are functionally specific tools that were made for easy prediction and control. Tomorrow's machines may be closer to biological systems in their mutability, resilience, and autonomy. But first they must be capable of learning, and retaining, new information without repeated exposure to it. Past efforts to engineer such systems have sought to build or regulate artificial neural networks using task-specific modules with constrained circumstances of application. This has not yet enabled continual learning over long sequences of previously unseen data without corrupting existing knowledge: a problem known as catastrophic forgetting. In this paper, we introduce a system that can learn sequentially over previously unseen datasets (ImageNet, CIFAR-100) with little forgetting over time. This is accomplished by regulating the activity of weights in a convolutional neural network on the basis of inputs using top-down modulation generated by a second feed-forward neural network. We find that our method learns continually under domain transfer with sparse bursts of activity in weights that are recycled across tasks, rather than by maintaining task-specific modules. Sparse synaptic bursting is found to balance enhanced and diminished activity in a way that facilitates adaptation to new inputs without corrupting previously acquired functions. This behavior emerges during a prior meta-learning phase in which regulated synapses are selectively disinhibited, or grown, from an initial state of uniform suppression.
Abstract:Continual lifelong learning requires an agent or model to learn many sequentially ordered tasks, building on previous knowledge without catastrophically forgetting it. Much work has gone towards preventing the default tendency of machine learning models to catastrophically forget, yet virtually all such work involves manually-designed solutions to the problem. We instead advocate meta-learning a solution to catastrophic forgetting, allowing AI to learn to continually learn. Inspired by neuromodulatory processes in the brain, we propose A Neuromodulated Meta-Learning Algorithm (ANML). It differentiates through a sequential learning process to meta-learn an activation-gating function that enables context-dependent selective activation within a deep neural network. Specifically, a neuromodulatory (NM) neural network gates the forward pass of another (otherwise normal) neural network called the prediction learning network (PLN). The NM network thus also indirectly controls selective plasticity (i.e., the backward pass) of the PLN. ANML enables continual learning without catastrophic forgetting at scale: it produces state-of-the-art continual learning performance, sequentially learning as many as 600 classes (over 9,000 SGD updates).
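A minimal sketch (PyTorch) of the gating idea: the NM network produces a multiplicative gate over the PLN's hidden activations, which both shapes the forward pass and, through the chain rule, indirectly controls which PLN weights receive gradient. Layer sizes and the single-gated-layer structure are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ANMLSketch(nn.Module):
    def __init__(self, in_dim, hidden, n_classes):
        super().__init__()
        self.pln_features = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.pln_head = nn.Linear(hidden, n_classes)
        self.nm = nn.Sequential(nn.Linear(in_dim, hidden), nn.Sigmoid())

    def forward(self, x):
        gate = self.nm(x)                  # context-dependent gate in [0, 1]
        h = self.pln_features(x) * gate    # gates the PLN forward pass...
        return self.pln_head(h)            # ...and, via backprop, scales the
                                           # gradients flowing into gated units
```

In the meta-learning phase, the NM network's parameters would be updated through the outer loop so that the learned gating minimizes forgetting across the inner sequential-learning trajectory.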
Abstract:Catastrophic forgetting continues to severely restrict the learnability of controllers suitable for multiple task environments. Efforts to combat catastrophic forgetting reported in the literature to date have focused on how control systems can be updated more rapidly, hastening their adjustment from good initial settings to new environments, or, more circumspectly, on suppressing their ability to overfit to any one environment. When using robots, the environment includes the robot's own body, its shape and material properties, and how its actuators and sensors are distributed along its mechanical structure. Here we demonstrate for the first time how one such design decision (sensor placement) can alter the landscape of the loss function itself, either expanding or shrinking the weight manifolds containing suitable controllers for each individual task, thus increasing or decreasing their probability of overlap across tasks, and thus reducing or inducing the potential for catastrophic forgetting.
Abstract:Organisms result from adaptive processes interacting across different time scales. One such interaction is that between development and evolution. Models have shown that development sweeps over several traits in a single agent, sometimes exposing promising static traits. Subsequent evolution can then canalize these rare traits. Thus, development can, under the right conditions, increase evolvability. Here, we report on a previously unknown phenomenon when embodied agents are allowed to develop and evolve: evolution discovers body plans robust to control changes; these body plans become genetically assimilated, yet the controllers for these agents are not. This allows evolution to continue climbing fitness gradients by tinkering with the developmental programs for controllers within these permissive body plans. This exposes a previously unknown detail about the Baldwin effect: instead of all useful traits becoming genetically assimilated, only traits that render the agent robust to changes in other traits become assimilated. We refer to this as differential canalization. This finding also has implications for the evolutionary design of artificial and embodied agents such as robots: robots robust to internal changes in their controllers may also be robust to external changes in their environment, such as transferal from simulation to reality or deployment in novel environments.