Avadesh Meduri

Cost Function Estimation Using Inverse Reinforcement Learning with Minimal Observations

May 13, 2025

Sarmad Mehrdad, Avadesh Meduri, Ludovic Righetti

Abstract: We present an iterative inverse reinforcement learning algorithm to infer optimal cost functions in continuous spaces. Based on a popular maximum entropy criterion, our approach iteratively finds a weight improvement step and proposes a method to find an appropriate step size that ensures learned cost function features remain similar to the demonstrated trajectory features. In contrast to similar approaches, our algorithm can individually tune the effectiveness of each observation for the partition function and does not need a large sample set, enabling faster learning. We generate sample trajectories by solving an optimal control problem instead of random sampling, leading to more informative trajectories. The performance of our method is compared to two state-of-the-art algorithms to demonstrate its benefits in several simulated environments.
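To make the maximum entropy criterion concrete, here is a minimal sketch of a plain feature-matching gradient step. It is not the paper's algorithm: it assumes a cost that is linear in trajectory features, a fixed step size, and a small hand-picked sample set, and it omits the step-size selection and per-observation tuning described in the abstract. The name `maxent_irl_step` and all data are hypothetical.

```python
import numpy as np

def maxent_irl_step(w, demo_features, sample_features, step_size):
    """One gradient step on the negative log-likelihood of the demonstration."""
    # Assumed cost model: c(tau) = w . phi(tau), linear in the features.
    costs = sample_features @ w
    # Softmax over negative costs; shifting by the minimum keeps exp() stable.
    logits = -(costs - costs.min())
    probs = np.exp(logits) / np.exp(logits).sum()
    # Expected features under the current cost, estimated from the samples.
    expected_features = probs @ sample_features
    # Gradient of  w . phi_demo + log Z(w)  with respect to w.
    grad = demo_features - expected_features
    return w - step_size * grad, expected_features

# Hypothetical data: four sampled trajectories with two features each.
samples = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
demo = np.array([0.25, 0.75])

w = np.zeros(2)
for _ in range(500):
    w, expected = maxent_irl_step(w, demo, samples, step_size=0.5)
```

At convergence the expected features of the sample distribution match the demonstrated features, which is the feature-matching condition that the maximum entropy formulation enforces.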

Efficient Search and Learning for Agile Locomotion on Stepping Stones

Mar 06, 2024

Adithya Kumar Chinnakkonda Ravi, Victor Dhédin, Armand Jordana, Huaijiang Zhu, Avadesh Meduri, Ludovic Righetti, Bernhard Schölkopf, Majid Khadiv

Abstract: Legged robots have become capable of performing highly dynamic maneuvers in the past few years. However, agile locomotion in highly constrained environments such as stepping stones is still a challenge. In this paper, we propose a combination of model-based control, search, and learning to design efficient control policies for agile locomotion on stepping stones. In our framework, we use nonlinear model predictive control (NMPC) to generate whole-body motions for a given contact plan. To efficiently search for an optimal contact plan, we propose to use Monte Carlo tree search (MCTS). While the combination of MCTS and NMPC can find a feasible plan for a given environment within a few seconds, it is not yet suitable to be used as a reactive policy. Hence, we generate a dataset for an optimal goal-conditioned policy for a given scene and learn it through supervised learning. In particular, we leverage the power of diffusion models in handling multi-modality in the dataset. We test our proposed framework on a scenario where our quadruped robot Solo12 successfully jumps to different goals in a highly constrained environment.

Risk-Sensitive Extended Kalman Filter

May 19, 2023

Armand Jordana, Avadesh Meduri, Etienne Arlaud, Justin Carpentier, Ludovic Righetti

Abstract: In robotics, designing robust algorithms in the face of estimation uncertainty is a challenging task. Indeed, controllers often do not consider the estimation uncertainty and only rely on the most likely estimated state. Consequently, sudden changes in the environment or the robot's dynamics can lead to catastrophic behaviors. In this work, we present a risk-sensitive Extended Kalman Filter that enables safe output-feedback Model Predictive Control (MPC). This filter adapts its estimation to the control objective. By taking a pessimistic estimate concerning the value function resulting from the MPC controller, the filter provides increased robustness to the controller in phases of uncertainty as compared to a standard Extended Kalman Filter (EKF). Moreover, the filter has the same complexity as an EKF, so that it can be used for real-time model-predictive control. The paper evaluates the risk-sensitive behavior of the proposed filter when used in a nonlinear model-predictive control loop on a planar drone and an industrial manipulator in simulation, as well as on an external force estimation task on a real quadruped robot. These experiments demonstrate that the approach significantly improves performance in the face of uncertainty.
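For reference, the standard EKF that the abstract uses as its baseline can be sketched as a single predict/update step. This is the textbook filter only, not the risk-sensitive variant, whose value-function-dependent correction is not specified in the abstract. The function name `ekf_step` and the toy one-dimensional model are hypothetical.

```python
import numpy as np

def ekf_step(x, P, u, z, f, h, F_jac, H_jac, Q, R):
    """One predict/update cycle of a standard extended Kalman filter."""
    # Predict: propagate the mean through the dynamics and the covariance
    # through the dynamics Jacobian, then add process noise.
    x_pred = f(x, u)
    F = F_jac(x, u)
    P_pred = F @ P @ F.T + Q
    # Update: correct the prediction with the measurement innovation.
    H = H_jac(x_pred)
    innovation = z - h(x_pred)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ innovation
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Toy model (an assumption for illustration): static scalar state, direct measurement.
x_new, P_new = ekf_step(
    x=np.array([0.0]), P=np.eye(1), u=None, z=np.array([2.0]),
    f=lambda x, u: x, h=lambda x: x,
    F_jac=lambda x, u: np.eye(1), H_jac=lambda x: np.eye(1),
    Q=np.zeros((1, 1)), R=np.eye(1),
)
```

With equal prior and measurement variances the gain is 0.5, so the estimate moves halfway toward the measurement and the variance halves.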

Visual-Inertial and Leg Odometry Fusion for Dynamic Locomotion

Oct 10, 2022

Victor Dhédin, Haolong Li, Shahram Khorshidi, Lukas Mack, Adithya Kumar Chinnakkonda Ravi, Avadesh Meduri, Paarth Shah, Felix Grimminger, Ludovic Righetti, Majid Khadiv (+1 more)

Abstract: Implementing dynamic locomotion behaviors on legged robots requires a high-quality state estimation module. Especially when the motion includes flight phases, state-of-the-art approaches fail to produce reliable estimation of the robot posture, in particular base height. In this paper, we propose a novel approach for combining visual-inertial odometry (VIO) with leg odometry in an extended Kalman filter (EKF) based state estimator. The VIO module uses a stereo camera and IMU to yield low-drift 3D position and yaw orientation and drift-free pitch and roll orientation of the robot base link in the inertial frame. However, these values have a considerable amount of latency due to image processing and optimization, and their update rate is too low for low-level control. To reduce the latency, we predict the VIO state estimate at the rate of the IMU measurements of the VIO sensor. The EKF module uses the base pose and linear velocity predicted by VIO, fuses them further with a second high-rate IMU and leg odometry measurements, and produces robot state estimates with a high frequency and small latency suitable for control. We integrate this lightweight estimation framework with a nonlinear model predictive controller and show successful implementation of a set of agile locomotion behaviors, including trotting and jumping at varying horizontal speeds, on a torque-controlled quadruped robot.

* Submitted to IEEE International Conference on Robotics and Automation (ICRA), 2023

ContactNet: Online Multi-Contact Planning for Acyclic Legged Robot Locomotion

Sep 30, 2022

Angelo Bratta, Avadesh Meduri, Michele Focchi, Ludovic Righetti, Claudio Semini

Abstract: Online trajectory optimization techniques generally depend on heuristic-based contact planners in order to have low computation times and achieve high replanning frequencies. In this work, we propose ContactNet, a fast acyclic contact planner based on a multi-output regression neural network. ContactNet ranks discretized stepping regions, allowing the best feasible solution to be chosen quickly, even in complex environments. The low computation time, on the order of 1 ms, makes it possible to run the contact planner concurrently with a trajectory optimizer in a Model Predictive Control (MPC) fashion. We demonstrate the effectiveness of the approach in simulation in different complex scenarios with the quadruped robot Solo12.
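The "rank, then pick the best feasible" step can be illustrated with a short sketch. It assumes the network outputs one score per discretized stepping region with higher meaning better, and that a feasibility flag per region is available; the score convention, the name `pick_best_feasible`, and the data are assumptions, since the abstract does not specify them.

```python
import numpy as np

def pick_best_feasible(scores, is_feasible):
    """Return the index of the highest-scoring feasible region, or None."""
    # Sort region indices from best to worst score.
    ranking = np.argsort(-np.asarray(scores, dtype=float))
    # Walk down the ranking and stop at the first feasible region.
    for idx in ranking:
        if is_feasible[idx]:
            return int(idx)
    return None

# Hypothetical scores for four regions; the top-ranked one is infeasible.
scores = [0.1, 0.9, 0.5, 0.7]
feasible = [True, False, True, True]
best = pick_best_feasible(scores, feasible)
```

Because the ranking is computed once, falling back to the next candidate costs only a list lookup, which is consistent with the low planning times the abstract reports.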

MPC with Sensor-Based Online Cost Adaptation

Sep 20, 2022

Avadesh Meduri, Huaijiang Zhu, Armand Jordana, Ludovic Righetti

Abstract: Model predictive control is a powerful tool to generate complex motions for robots. However, it often requires solving non-convex problems online to produce rich behaviors, which is computationally expensive and not always practical in real time. Additionally, direct integration of high-dimensional sensor data (e.g., RGB-D images) in the feedback loop is challenging with current state-space methods. This paper aims to address both issues. It introduces a model predictive control scheme, where a neural network constantly updates the cost function of a quadratic program based on sensory inputs, aiming to minimize a general non-convex task loss without solving a non-convex problem online. By updating the cost, the robot is able to adapt to changes in the environment directly from sensor measurements without requiring a new cost design. Furthermore, since the quadratic program can be solved efficiently with hard constraints, a safe deployment on the robot is ensured. Experiments with a wide variety of reaching tasks on an industrial robot manipulator demonstrate that our method can efficiently solve complex non-convex problems with high-dimensional visual sensory inputs, while still being robust to external disturbances.

* 6 Pages, 5 Figures

BiConMP: A Nonlinear Model Predictive Control Framework for Whole Body Motion Planning

Jan 19, 2022

Avadesh Meduri, Paarth Shah, Julian Viereck, Majid Khadiv, Ioannis Havoutis, Ludovic Righetti

Abstract: Online planning of whole-body motions for legged robots is challenging due to the inherent nonlinearity in the robot dynamics. In this work, we propose a nonlinear MPC framework, BiConMP, which can generate whole-body trajectories online by efficiently exploiting the structure of the robot dynamics. BiConMP is used to generate various cyclic gaits on a real quadruped robot, and its performance is evaluated on different terrains, countering unforeseen pushes and transitioning online between different gaits. Further, the ability of BiConMP to generate non-trivial acyclic whole-body dynamic motions on the robot is presented. Finally, an extensive empirical analysis of the effects of planning horizon and frequency on the nonlinear MPC framework is reported and discussed.

ValueNetQP: Learned one-step optimal control for legged locomotion

Jan 11, 2022

Julian Viereck, Avadesh Meduri, Ludovic Righetti

Abstract: Optimal control is a successful approach to generate motions for complex robots, in particular for legged locomotion. However, these techniques are often too slow to run in real time for model predictive control, or one needs to drastically simplify the dynamics model. In this work, we present a method to learn to predict the gradient and Hessian of the problem value function, enabling fast resolution of the predictive control problem with a one-step quadratic program. In addition, our method is able to satisfy constraints like friction cones and unilateral constraints, which are important for highly dynamic locomotion tasks. We demonstrate the capability of our method in simulation and on a real quadruped robot performing trotting and bounding motions.
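As an illustration of why a value-function gradient and Hessian reduce the control problem to a single quadratic program, the sketch below solves the unconstrained case in closed form. It assumes linear dynamics x' = A x + B u, a quadratic control cost with weight R, and a second-order value model around a point `x_bar`; the gradient and Hessian are supplied by hand rather than predicted by a network, and the friction cone and unilateral constraints mentioned in the abstract are left out (they would require a QP solver). All names are hypothetical.

```python
import numpy as np

def one_step_control(x, A, B, R, value_grad, value_hess, x_bar):
    """Minimize 0.5 u'Ru + V(Ax + Bu) for a quadratic model of V around x_bar."""
    # Next state under zero control, relative to the expansion point.
    dx = A @ x - x_bar
    # Setting the gradient with respect to u to zero gives a linear system.
    lhs = R + B.T @ value_hess @ B
    rhs = -B.T @ (value_grad + value_hess @ dx)
    return np.linalg.solve(lhs, rhs)

# Scalar check: x' = x + u with unit state and control weights. The exact
# infinite-horizon value function is 0.5 * p * x^2 with p the golden ratio.
p = (1.0 + np.sqrt(5.0)) / 2.0
u = one_step_control(
    x=np.array([2.0]), A=np.eye(1), B=np.eye(1), R=np.eye(1),
    value_grad=np.zeros(1), value_hess=p * np.eye(1), x_bar=np.zeros(1),
)
```

In this scalar case the one-step solution reproduces the infinite-horizon LQR feedback u = -p/(1+p) x, which shows how an accurate value model can stand in for a long planning horizon.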

Rapid Convex Optimization of Centroidal Dynamics using Block Coordinate Descent

Aug 04, 2021

Paarth Shah, Avadesh Meduri, Wolfgang Merkt, Majid Khadiv, Ioannis Havoutis, Ludovic Righetti

Abstract: In this paper, we explore the use of block coordinate descent (BCD) to optimize the centroidal momentum dynamics for dynamically consistent multi-contact behaviors. The centroidal dynamics have recently received a large amount of attention in order to create physically realizable motions for robots with hands and feet while being computationally more tractable than full rigid body dynamics models. Our contribution lies in exploiting the structure of the dynamics in order to simplify the original non-convex problem into two convex subproblems. We iterate between these two subproblems for a set number of iterations or until a consensus is reached. We explore the properties of the proposed optimization method for the centroidal dynamics and verify in simulation that motions generated by our approach can be tracked by the quadruped Solo12. In addition, we compare our method to a recently proposed convexification using a sequence of convex relaxations as well as a more standard interior point method used in the off-the-shelf solver IPOPT to show that our approach finds similar, if not better, trajectories (in terms of cost), and is more than four times faster than both approaches. Finally, compared to previous approaches, we note its practicality due to the convex nature of each subproblem, which allows our method to be used with any off-the-shelf quadratic programming solver.
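The idea of alternating between two convex subproblems can be sketched on a toy bilinear problem. This is not the centroidal dynamics problem: it fits a rank-one factorization u v' to a matrix, chosen only because fixing either block leaves a convex least-squares problem with a closed-form solution. The stopping rule (an iteration cap or a small change in cost) loosely mirrors the one in the abstract; the function name and data are hypothetical.

```python
import numpy as np

def block_coordinate_descent(A, v0, max_iters=50, tol=1e-10):
    """Alternately minimize ||A - u v'||_F^2 over u and over v."""
    v = np.asarray(v0, dtype=float)
    costs = []
    for _ in range(max_iters):
        # Subproblem 1: with v fixed, the optimal u is a least-squares solution.
        u = A @ v / (v @ v)
        # Subproblem 2: with u fixed, the optimal v is a least-squares solution.
        v = A.T @ u / (u @ u)
        costs.append(np.linalg.norm(A - np.outer(u, v)) ** 2)
        # Stop once the cost no longer changes between sweeps.
        if len(costs) > 1 and abs(costs[-2] - costs[-1]) < tol:
            break
    return u, v, costs

# Hypothetical rank-one target; assumes the initial v is not orthogonal to it.
A = np.outer([1.0, 2.0, 3.0], [1.0, -1.0, 2.0])
u, v, costs = block_coordinate_descent(A, v0=np.ones(3))
```

Each block update can only lower the cost, so the sequence of costs is non-increasing even though the joint problem is non-convex.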

DeepQ Stepper: A framework for reactive dynamic walking on uneven terrain

Oct 28, 2020

Avadesh Meduri, Majid Khadiv, Ludovic Righetti

Abstract: Reactive stepping and push recovery for biped robots are often restricted to flat terrains because of the difficulty in computing capture regions for nonlinear dynamic models. In this paper, we address this limitation by using reinforcement learning to approximately learn the 3D capture region for such systems. We propose a novel 3D reactive stepper, the DeepQ stepper, that computes optimal step locations for walking at different velocities using the 3D capture regions approximated by the action-value function. We demonstrate the ability of the approach to learn stepping with a simplified 3D pendulum model and with full robot dynamics. Further, the stepper achieves a higher performance when it learns approximate capture regions while taking into account the entire dynamics of the robot, which are often ignored in existing reactive steppers based on simplified models. The DeepQ stepper can handle non-convex terrain with obstacles, walk on restricted surfaces like stepping stones, and recover from external disturbances at a constant computational cost.
