Abstract: Obtaining dynamics models is essential for robotics to achieve accurate model-based controllers and simulators for planning. These models are typically obtained from the manufacturer's model specification or with simple numerical methods such as linear regression. However, this approach does not guarantee physically plausible parameters and can only be applied to kinematic chains consisting of rigid bodies. In this article, we describe a differentiable simulator that can be used to identify the system parameters of real-world mechanical systems with complex friction models and with holonomic as well as non-holonomic constraints. To guarantee physically consistent parameters, we utilize virtual parameters and gradient-based optimization. The described Differentiable Newton-Euler Algorithm (DiffNEA) can be applied to a class of dynamical systems and guarantees physically plausible predictions. The extensive experimental evaluation shows that the proposed model learning approach learns accurate dynamics models of systems with complex friction and non-holonomic constraints. The identified DiffNEA models excel especially in the offline reinforcement learning experiments. For the challenging ball in a cup task, these models solve the task on the physical system using model-based offline reinforcement learning. The black-box baselines fail on this task both in simulation and on the physical system, despite using more data for learning the model.
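To illustrate what "virtual parameters and gradient-based optimization" can mean in practice, the following is a minimal sketch and not the DiffNEA implementation. It assumes a hypothetical one-degree-of-freedom system with torque tau = m * a + d * v, where the mass m and the viscous friction coefficient d must be positive. The unconstrained virtual parameters theta_m and theta_d are mapped through exp(), so every gradient step yields physically plausible values. The exponential map, the hand-derived gradients, the function name `identify`, and the synthetic data are all assumptions made for this example; the abstract states only that virtual parameters and gradient-based optimization are used.

```python
import math
import random


def identify(accels, vels, torques, steps=2000, lr=0.1):
    """Fit tau = m*a + d*v with m = exp(theta_m) and d = exp(theta_d).

    theta_m and theta_d are unconstrained virtual parameters (an assumed
    parametrization); the exponential keeps m and d strictly positive.
    """
    theta_m, theta_d = 0.0, 0.0
    n = len(torques)
    for _ in range(steps):
        m, d = math.exp(theta_m), math.exp(theta_d)
        grad_m = 0.0
        grad_d = 0.0
        for a, v, tau in zip(accels, vels, torques):
            err = m * a + d * v - tau
            # Chain rule of the mean squared error through m = exp(theta_m)
            # and d = exp(theta_d): d(m)/d(theta_m) = m, d(d)/d(theta_d) = d.
            grad_m += 2.0 * err * a * m / n
            grad_d += 2.0 * err * v * d / n
        theta_m -= lr * grad_m
        theta_d -= lr * grad_d
    return math.exp(theta_m), math.exp(theta_d)


# Synthetic, noise-free data from assumed ground-truth parameters.
random.seed(0)
true_m, true_d = 2.0, 0.3
accels = [random.uniform(-1.0, 1.0) for _ in range(200)]
vels = [random.uniform(-1.0, 1.0) for _ in range(200)]
torques = [true_m * a + true_d * v for a, v in zip(accels, vels)]

m_hat, d_hat = identify(accels, vels, torques)
print(f"estimated mass: {m_hat:.3f}, estimated friction: {d_hat:.3f}")
```

Plain linear regression on the same data could return a negative mass or friction coefficient when the measurements are noisy, whereas the reparametrized estimate stays positive by construction. A full rigid-body model would additionally need constraints such as positive definite inertia matrices, which this sketch does not cover.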
Abstract: A limitation of model-based reinforcement learning (MBRL) is the exploitation of errors in the learned models. Black-box models can fit complex dynamics with high fidelity, but their behavior is undefined outside of the data distribution. Physics-based models are better at extrapolating, due to the general validity of their informed structure, but underfit in the real world due to the presence of unmodeled phenomena. In this work, we demonstrate experimentally that in the offline model-based reinforcement learning setting, physics-based models can be beneficial compared to high-capacity function approximators if the mechanical structure is known. With offline MBRL, physics-based models can learn to perform the ball in a cup (BiC) task on a physical manipulator using only 4 minutes of sampled data. We find that black-box models consistently produce unviable policies for BiC, as all predicted trajectories diverge to physically impossible states, despite having access to more data than the physics-based model. In addition, we generalize the approach of physics parameter identification from modeling holonomic multi-body systems to systems with nonholonomic dynamics using end-to-end automatic differentiation. Videos: https://sites.google.com/view/ball-in-a-cup-in-4-minutes/
Abstract: In this work, we examine a spectrum of hybrid models for the domain of multi-body robot dynamics. We motivate a computation graph architecture that embodies the Newton-Euler equations, emphasizing the utility of the Lie algebra form in translating the dynamical geometry into an efficient computational structure for learning. We describe the virtual parameters that enable unconstrained optimization of physically plausible dynamics, as well as the actuator models. In the experiments, we define a family of 26 grey-box models and evaluate them for system identification of the simulated and physical Furuta Pendulum and Cartpole. The comparison shows that the kinematic parameters, required by previous white-box system identification methods, can be accurately inferred from data. Furthermore, we highlight that models with guaranteed bounded energy of the uncontrolled system generate non-divergent trajectories, while more general models have no such guarantee, so their performance strongly depends on the data distribution. Therefore, the main contribution of this work is the introduction of a white-box model that jointly learns dynamic and kinematic parameters and can be combined with black-box components. We then provide an extensive empirical evaluation on challenging systems and different datasets that elucidates the performance of our grey-box architecture relative to comparable white- and black-box models.