Abstract:In robotics, catastrophic interference continues to restrain policy training across environments. Efforts to combat catastrophic interference to date focus on novel neural architectures or training methods, with a recent emphasis on policies with good initial settings that facilitate training in new environments. However, none of these methods to date have taken into account how the physical architecture of the robot can obstruct or facilitate catastrophic interference, just as the choice of neural architecture can. In previous work we have shown how aspects of a robot's physical structure (specifically, sensor placement) can facilitate policy learning by increasing the fraction of optimal policies for a given physical structure. Here we show for the first time that this proxy measure of catastrophic interference correlates with sample efficiency across several search methods, proving that favorable loss landscapes can be induced by the correct choice of physical structure. We show that such structures can be found via co-optimization -- optimization of a robot's structure and control policy simultaneously -- yielding catastrophic interference resistant robot structures and policies, and that this is more efficient than control policy optimization alone. Finally, we show that such structures exhibit sensor homeostasis across environments and introduce this as the mechanism by which certain robots overcome catastrophic interference.
Abstract:Catastrophic forgetting continues to severely restrict the learnability of controllers suitable for multiple task environments. Efforts to combat catastrophic forgetting reported in the literature to date have focused on how control systems can be updated more rapidly, hastening their adjustment from good initial settings to new environments, or more circumspectly, suppressing their ability to overfit to any one environment. When using robots, the environment includes the robot's own body, its shape and material properties, and how its actuators and sensors are distributed along its mechanical structure. Here we demonstrate for the first time how one such design decision (sensor placement) can alter the landscape of the loss function itself, either expanding or shrinking the weight manifolds containing suitable controllers for each individual task, thus increasing or decreasing their probability of overlap across tasks, and thus reducing or inducing the potential for catastrophic forgetting.