Abstract: Differentiable simulators provide analytic gradients, enabling more sample-efficient learning algorithms and paving the way for data-intensive learning tasks such as learning from images. In this work, we demonstrate that locomotion policies trained with analytic gradients from a differentiable simulator can be successfully transferred to the real world. Typically, simulators that offer informative gradients lack the physical accuracy needed for sim-to-real transfer, and vice versa. A key factor in our success is a smooth contact model that combines informative gradients with physical accuracy, ensuring effective transfer of learned behaviors. To the best of our knowledge, this is the first time a real quadrupedal robot is able to locomote after training exclusively in a differentiable simulation.
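To illustrate the kind of smooth contact model that makes gradients informative, here is a minimal toy sketch: the normal force on a falling point mass is a smooth (softplus-based) function of penetration depth, so analytic gradients flow through contact events. The constants, function names, and 1-D dynamics are illustrative assumptions, not the paper's actual model.

```python
# Toy smooth contact model: the ground normal force is a smooth function of
# the signed height d (d < 0 means penetration), so gradients exist through
# contact. All names and constants are illustrative, not the paper's model.
import torch

def smooth_contact_force(d, stiffness=1e3, smoothing=0.01):
    # stiffness * smoothing * softplus(-d / smoothing) smoothly approximates
    # stiffness * max(0, -d), ramping up continuously as penetration grows.
    return stiffness * smoothing * torch.nn.functional.softplus(-d / smoothing)

def step(height, velocity, mass=1.0, dt=1e-3, g=9.81):
    # One semi-implicit Euler step of a point mass falling onto the ground.
    force = smooth_contact_force(height) - mass * g
    velocity = velocity + dt * force / mass
    height = height + dt * velocity
    return height, velocity

# Analytic gradient of the final height w.r.t. the initial velocity,
# obtained by backpropagating through the whole rollout.
h = torch.tensor(0.05)
v = torch.tensor(0.0, requires_grad=True)
for _ in range(200):
    h, v = step(h, v)
h.backward()
print(h.item(), v.grad.item())
```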
Abstract: Reinforcement Learning (RL) has achieved impressive results on complex tasks but struggles in multi-task settings with different embodiments. World models offer scalability by learning a simulation of the environment, yet they often rely on inefficient gradient-free optimization methods. We introduce Policy learning with large World Models (PWM), a novel model-based RL algorithm that learns continuous control policies from large multi-task world models. By pre-training the world model on offline data and using it for first-order gradient policy learning, PWM effectively solves tasks with up to 152 action dimensions and outperforms methods using ground-truth dynamics. Additionally, PWM scales to an 80-task setting, achieving up to 27% higher rewards than existing baselines without the need for expensive online planning. Visualizations and code are available at https://policy-world-model.github.io.
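The core mechanism here is first-order policy learning through a frozen, pre-trained world model rather than gradient-free planning. The sketch below shows the idea under simplifying assumptions: the tiny MLPs, horizon, and reward head are placeholders, not PWM's architecture.

```python
# Sketch: first-order policy optimization through a frozen learned world
# model. The networks, horizon, and reward head are placeholder assumptions.
import torch
import torch.nn as nn

obs_dim, act_dim, horizon = 8, 2, 16

# Pretend these were pre-trained on offline data; freeze them for policy learning.
dynamics = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.Tanh(), nn.Linear(64, obs_dim))
reward = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.Tanh(), nn.Linear(64, 1))
for p in list(dynamics.parameters()) + list(reward.parameters()):
    p.requires_grad_(False)

policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim), nn.Tanh())
optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)

def policy_update(initial_obs):
    obs, total_reward = initial_obs, 0.0
    for _ in range(horizon):
        act = policy(obs)
        sa = torch.cat([obs, act], dim=-1)
        total_reward = total_reward + reward(sa).sum()
        obs = dynamics(sa)          # gradients flow through the world model
    loss = -total_reward / horizon  # maximize predicted return
    optimizer.zero_grad()
    loss.backward()                 # first-order gradient w.r.t. policy params
    optimizer.step()
    return loss.item()

print(policy_update(torch.randn(32, obs_dim)))
```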
Abstract: Model-Free Reinforcement Learning (MFRL), leveraging the policy gradient theorem, has demonstrated considerable success in continuous control tasks. However, these approaches are plagued by high gradient variance due to zeroth-order gradient estimation, resulting in suboptimal policies. Conversely, First-Order Model-Based Reinforcement Learning (FO-MBRL) methods, employing differentiable simulation, provide gradients with reduced variance but are susceptible to sampling error in scenarios involving stiff dynamics, such as physical contact. This paper investigates the source of this error and introduces Adaptive Horizon Actor-Critic (AHAC), an FO-MBRL algorithm that reduces gradient error by adapting the model-based horizon to avoid stiff dynamics. Empirical findings reveal that AHAC outperforms MFRL baselines, attaining 40% more reward across a set of locomotion tasks and scaling to high-dimensional control environments with improved wall-clock-time efficiency.
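A schematic sketch of the adaptive-horizon idea follows: the differentiable rollout is truncated when a stiffness proxy (here, a contact-force magnitude) spikes, and a critic bootstraps the truncated tail. The `sim_step` stand-in, the threshold, and both networks are hypothetical placeholders rather than AHAC's actual components.

```python
# Schematic adaptive model-based horizon: roll out through a differentiable
# simulator, but cut the rollout early (and bootstrap with a critic) when a
# stiffness proxy such as the contact-force magnitude exceeds a threshold.
# `sim_step`, the threshold, and both networks are hypothetical placeholders.
import torch
import torch.nn as nn

obs_dim, act_dim, max_horizon, contact_threshold = 8, 2, 32, 50.0

policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, 1))

def sim_step(obs, act):
    # Stand-in for one differentiable simulator step.
    # Returns (next_obs, reward, contact_force_magnitude).
    next_obs = obs + 0.01 * torch.tanh(torch.cat([obs[..., act_dim:], act], dim=-1))
    reward = -(next_obs ** 2).sum(dim=-1)
    contact = next_obs.abs().max() * 10.0
    return next_obs, reward, contact

def adaptive_horizon_objective(obs):
    total = 0.0
    for _ in range(max_horizon):
        act = policy(obs)
        obs, rew, contact = sim_step(obs, act)
        total = total + rew.sum()
        if contact.item() > contact_threshold:
            break  # stop before backpropagating through stiff contact dynamics
    # bootstrap the truncated tail with the critic's value estimate
    return -(total + critic(obs).sum())

loss = adaptive_horizon_objective(torch.randn(4, obs_dim))
loss.backward()
```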
Abstract: Accurately modeling robot dynamics is crucial for safe and efficient motion control. In this paper, we develop and apply an iterative learning semi-parametric model with a neural network to the task of autonomous racing with a Model Predictive Controller (MPC). We present a novel non-linear semi-parametric dynamics model in which a parametric model represents the known dynamics and a neural network captures the unknown dynamics. We show that our model learns more accurately than a purely parametric model and generalizes better than a purely non-parametric model, making it ideal for real-world applications where collecting data from the full state space is not feasible. We present a system in which the model is bootstrapped on pre-recorded data and then updated iteratively at run time. We then apply our iterative learning approach to the simulated problem of autonomous racing and show that it can safely adapt to modified dynamics online and even achieve better performance than models trained on data from manual driving.
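A minimal sketch of the semi-parametric idea under simplifying assumptions: the next state is predicted as a fixed parametric term plus a learned neural-network residual, and the residual is refit iteratively from recent transitions. The parametric form, dimensions, and update schedule are placeholders, not the paper's vehicle model.

```python
# Semi-parametric dynamics sketch: a fixed parametric term captures the known
# physics and a small neural network learns the residual, refit iteratively as
# new transitions arrive. The parametric form and schedule are placeholders.
import torch
import torch.nn as nn

state_dim, ctrl_dim, dt = 4, 2, 0.05

def parametric_model(state, ctrl):
    # Known (simplified) dynamics: integrate position with the current velocity;
    # controls act directly on the velocity states.
    pos, vel = state[..., :2], state[..., 2:]
    return torch.cat([pos + dt * vel, vel + dt * ctrl], dim=-1)

residual = nn.Sequential(nn.Linear(state_dim + ctrl_dim, 32), nn.Tanh(), nn.Linear(32, state_dim))
optimizer = torch.optim.Adam(residual.parameters(), lr=1e-3)

def predict(state, ctrl):
    return parametric_model(state, ctrl) + residual(torch.cat([state, ctrl], dim=-1))

def iterative_update(states, ctrls, next_states, steps=20):
    # Refit the residual on the most recent batch of transitions,
    # e.g. data collected during the last few laps at run time.
    for _ in range(steps):
        loss = ((predict(states, ctrls) - next_states) ** 2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return loss.item()

# Usage: bootstrap on pre-recorded data, then call iterative_update online.
batch = torch.randn(256, state_dim), torch.randn(256, ctrl_dim), torch.randn(256, state_dim)
print(iterative_update(*batch))
```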