Abstract:Inspired by animals that co-adapt their brain and body to interact with the environment, we present a tendon-driven and over-actuated (i.e., n joint, n+1 actuators) bipedal robot that (i) exploits its backdrivable mechanical properties to manage body-environment interactions without explicit control, and (ii) uses a simple 3-layer neural network to learn to walk after only 2 minutes of 'natural' motor babbling (i.e., an exploration strategy that is compatible with leg and task dynamics; akin to childsplay). This brain-body collaboration first learns to produce feet cyclical movements 'in air' and, without further tuning, can produce locomotion when the biped is lowered to be in slight contact with the ground. In contrast, training with 2 minutes of 'naive' motor babbling (i.e., an exploration strategy that ignores leg task dynamics), does not produce consistent cyclical movements 'in air', and produces erratic movements and no locomotion when in slight contact with the ground. When further lowering the biped and making the desired leg trajectories reach 1cm below ground (causing the desired-vs-obtained trajectories error to be unavoidable), cyclical movements based on either natural or naive babbling presented almost equally persistent trends, and locomotion emerged with naive babbling. Therefore, we show how continual learning of walking in unforeseen circumstances can be driven by continual physical adaptation rooted in the backdrivable properties of the plant and enhanced by exploration strategies that exploit plant dynamics. Our studies also demonstrate that the bio-inspired codesign and co-adaptations of limbs and control strategies can produce locomotion without explicit control of trajectory errors.
Abstract:Error feedback is known to improve performance by correcting control signals in response to perturbations. Here we show how adding simple error feedback can also accelerate and robustify autonomous learning in robots. We implemented two versions of the General-to-Particular (G2P) autonomous learning algorithm to produce multiple movement tasks using a tendon-driven leg with two joints and three tendons: one with and one without kinematic feedback. As expected, feedback improved performance in simulation and hardware. However, we see these improvements even in the presence of sensory delays of up to 100 ms and when experiencing substantial contact collisions. Importantly, feedback accelerates learning by enhancing G2P's continual refinement of the initial inverse map because every experience counts. This allows the system to perform well even after only 60 seconds of initial motor babbling.
Abstract:Robots will become ubiquitously useful only when they can use few attempts to teach themselves to perform different tasks, even with complex bodies and in dynamical environments. Vertebrates, in fact, successfully use trial-and-error to learn multiple tasks in spite of their intricate tendon-driven anatomies. Roboticists find such tendon-driven systems particularly hard to control because they are simultaneously nonlinear, under-determined (many tendon tensions combine to produce few net joint torques), and over-determined (few joint rotations define how many tendons need to be reeled-in/payed-out). We demonstrate---for the first time in simulation and in hardware---how a model-free approach allows few-shot autonomous learning to produce effective locomotion in a 3-tendon/2-joint tendon-driven leg. Initially, an artificial neural network fed by sparsely sampled data collected using motor babbling creates an inverse map from limb kinematics to motor activations, which is analogous to juvenile vertebrates playing during development. Thereafter, iterative reward-driven exploration of candidate motor activations simultaneously refines the inverse map and finds a functional locomotor limit-cycle autonomously. This biologically-inspired algorithm, which we call G2P (General to Particular), enables versatile adaptation of robots to changes in the target task, mechanics of their bodies, and environment. Moreover, this work empowers future studies of few-shot autonomous learning in biological systems, which is the foundation of their enviable functional versatility.