Edgar Granados

Integrating Model-based Control and RL for Sim2Real Transfer of Tight Insertion Policies

May 17, 2025

Isidoros Marougkas, Dhruv Metha Ramesh, Joe H. Doerr, Edgar Granados, Aravind Sivaramakrishnan, Abdeslam Boularias, Kostas E. Bekris

Abstract: Object insertion under tight tolerances ($< 1$ mm) is an important but challenging assembly task as even small errors can result in undesirable contacts. Recent efforts focused on Reinforcement Learning (RL), which often depends on careful definition of dense reward functions. This work proposes an effective strategy for such tasks that integrates traditional model-based control with RL to achieve improved insertion accuracy. The policy is trained exclusively in simulation and is zero-shot transferred to the real system. It employs a potential field-based controller to acquire a model-based policy for inserting a plug into a socket given full observability in simulation. This policy is then integrated with residual RL, which is trained in simulation given only a sparse, goal-reaching reward. A curriculum scheme over observation noise and action magnitude is used for training the residual RL policy. Both policy components use as input the SE(3) poses of both the plug and the socket and return the plug's SE(3) pose transform, which is executed by a robotic arm using a controller. The integrated policy is deployed on the real system without further training or fine-tuning, given a visual SE(3) object tracker. The proposed solution and alternatives are evaluated across a variety of objects and conditions in simulation and reality. The proposed approach outperforms recent RL-based methods in this domain and prior efforts with hybrid policies. Ablations highlight the impact of each component of the approach.
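
The combination of a model-based base action with a learned residual can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses 3D positions only (the policies in the abstract operate on full SE(3) poses), and the function names, gains, clipping rule, and the linear curriculum that grows noise and residual magnitude with training progress are all hypothetical.

```python
import math


def potential_field_action(plug_pos, socket_pos, gain=0.5, max_step=0.01):
    """Attractive potential field: step along the negative gradient of
    0.5 * gain * ||plug - socket||^2, capped at max_step (assumed values)."""
    step = [gain * (g - p) for p, g in zip(plug_pos, socket_pos)]
    norm = math.sqrt(sum(s * s for s in step))
    if norm > max_step:
        step = [s * max_step / norm for s in step]
    return step


def combined_action(plug_pos, socket_pos, residual_policy, residual_scale):
    """Base action plus a residual clipped component-wise to +/- residual_scale."""
    base = potential_field_action(plug_pos, socket_pos)
    residual = residual_policy(plug_pos, socket_pos)  # stand-in for a learned policy
    residual = [max(-residual_scale, min(residual_scale, r)) for r in residual]
    return [b + r for b, r in zip(base, residual)]


def curriculum(progress, max_noise=0.002, max_residual=0.005):
    """Assumed linear schedule: observation-noise level and residual magnitude
    both grow as training progress goes from 0 to 1."""
    progress = max(0.0, min(1.0, progress))
    return max_noise * progress, max_residual * progress
```

With a residual policy that always returns zeros, `combined_action` reduces to the potential-field step alone, which is the property that lets the learned part start from the model-based behavior.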

Kinodynamic Trajectory Following with STELA: Simultaneous Trajectory Estimation & Local Adaptation

Apr 28, 2025

Edgar Granados, Sumanth Tangirala, Kostas E. Bekris

Abstract: State estimation and control are often addressed separately, leading to unsafe execution due to sensing noise, execution errors, and discrepancies between the planning model and reality. Simultaneous control and trajectory estimation using probabilistic graphical models has been proposed as a unified solution to these challenges. Previous work, however, relies heavily on appropriate Gaussian priors and is limited to holonomic robots with linear time-varying models. The current research extends graphical optimization methods to vehicles with arbitrary dynamical models via Simultaneous Trajectory Estimation and Local Adaptation (STELA). The overall approach initializes feasible trajectories using a kinodynamic, sampling-based motion planner. Then, it simultaneously: (i) estimates the past trajectory based on noisy observations, and (ii) adapts the controls to be executed to minimize deviations from the planned, feasible trajectory, while avoiding collisions. The proposed factor graph representation of trajectories in STELA can be applied to any dynamical system given access to first- or second-order state update equations, and introduces the duration of execution between two states in the trajectory discretization as an optimization variable. These features provide both generalization and flexibility in trajectory following. To improve computational efficiency, the proposed strategy performs incremental updates of the factor graph using the iSAM algorithm and introduces a time-window mechanism. This mechanism allows the factor graph to be dynamically updated to operate over a limited history and forward horizon of the planned trajectory. This enables online updates of controls at a minimum of 10 Hz. Experiments demonstrate that STELA achieves at least comparable performance to previous frameworks on idealized vehicles with linear dynamics.[...]

* [Accepted] RSS 2025

KRAFT: Sampling-Based Kinodynamic Replanning and Feedback Control over Approximate, Identified Models of Vehicular Systems

Sep 17, 2024

Aravind Sivaramakrishnan, Sumanth Tangirala, Dhruv Metha Ramesh, Edgar Granados, Kostas E. Bekris

Abstract: This paper aims to increase the safety and reliability of executing trajectories planned for robots with non-trivial dynamics given a light-weight, approximate dynamics model. Scenarios include mobile robots navigating through workspaces with imperfectly modeled surfaces and unknown friction. The proposed approach, Kinodynamic Replanning over Approximate Models with Feedback Tracking (KRAFT), integrates: (i) replanning via an asymptotically optimal sampling-based kinodynamic tree planner, with (ii) trajectory following via feedback control, and (iii) a safety mechanism to reduce collisions due to second-order dynamics. The planning and control components use a rough dynamics model expressed analytically via differential equations, which is tuned via system identification (SysID) in a training environment but not the deployed one. This allows the process to be fast and achieve long-horizon reasoning during each replanning cycle. At the same time, the model still includes gaps with reality, even after SysID, in new environments. Experiments demonstrate the limitations of kinematic path planning and path tracking approaches, highlighting the importance of: (a) closing the feedback loop also at the planning level; and (b) long-horizon reasoning, for safe and efficient trajectory execution given inaccurate models.

MORALS: Analysis of High-Dimensional Robot Controllers via Topological Tools in a Latent Space

Oct 05, 2023

Ewerton R. Vieira, Aravind Sivaramakrishnan, Sumanth Tangirala, Edgar Granados, Konstantin Mischaikow, Kostas E. Bekris

Abstract: Estimating the region of attraction (RoA) for a robotic system's controller is essential for safe application and controller composition. Many existing methods require access to a closed-form expression, which limits their applicability to data-driven controllers. Methods that operate only over trajectory rollouts tend to be data-hungry. In prior work, we have demonstrated that topological tools based on Morse Graphs offer data-efficient RoA estimation without needing an analytical model. They struggle, however, with high-dimensional systems as they operate over a discretization of the state space. This paper presents Morse Graph-aided discovery of Regions of Attraction in a learned Latent Space (MORALS). The approach combines autoencoding neural networks with Morse Graphs. MORALS shows promising predictive capabilities in estimating attractors and their RoAs for data-driven controllers operating over high-dimensional systems, including a 67-dim humanoid robot and a 96-dim 3-fingered manipulator. It first projects the dynamics of the controlled system into a learned latent space. Then, it constructs a reduced form of Morse Graphs representing the bistability of the underlying dynamics, i.e., detecting when the controller results in a desired versus an undesired behavior. The evaluation on high-dimensional robotic datasets indicates the data efficiency of the approach in RoA estimation.

* The first two authors contributed equally to this paper

A Survey on the Integration of Machine Learning with Sampling-based Motion Planning

Nov 15, 2022

Troy McMahon, Aravind Sivaramakrishnan, Edgar Granados, Kostas E. Bekris

Abstract: Sampling-based methods are widely adopted solutions for robot motion planning. The methods are straightforward to implement and effective in practice for many robotic systems. It is often possible to prove that they have desirable properties, such as probabilistic completeness and asymptotic optimality. Nevertheless, they still face challenges as the complexity of the underlying planning problem increases, especially under tight computation time constraints, which impact the quality of returned solutions, or given inaccurate models. This has motivated the use of machine learning to improve the computational efficiency and applicability of Sampling-Based Motion Planners (SBMPs). This survey reviews such integrative efforts and aims to provide a classification of the alternative directions that have been explored in the literature. It first discusses how learning has been used to enhance key components of SBMPs, such as node sampling, collision detection, distance or nearest neighbor computation, local planning, and termination conditions. Then, it highlights planners that use learning to adaptively select between different implementations of such primitives in response to the underlying problem's features. It also covers emerging methods, which build complete machine learning pipelines that reflect the traditional structure of SBMPs. It also discusses how machine learning has been used to provide data-driven models of robots, which can then be used by an SBMP. Finally, it provides a comparative discussion of the advantages and disadvantages of the approaches covered, and insights on possible future directions of research. An online version of this survey can be found at: https://prx-kinodynamic.github.io/

* Foundations and Trends in Robotics: Vol. 9: No. 4, pp 266-327 (2022)
* First two authors contributed equally

Data-Efficient Characterization of the Global Dynamics of Robot Controllers with Confidence Guarantees

Oct 04, 2022

Ewerton R. Vieira, Aravind Sivaramakrishnan, Yao Song, Edgar Granados, Marcio Gameiro, Konstantin Mischaikow, Ying Hung, Kostas E. Bekris

Abstract: This paper proposes an integration of surrogate modeling and topology to significantly reduce the amount of data required to describe the underlying global dynamics of robot controllers, including closed-box ones. A Gaussian Process (GP), trained with randomized short trajectories over the state-space, acts as a surrogate model for the underlying dynamical system. Then, a combinatorial representation is built and used to describe the dynamics in the form of a directed acyclic graph, known as Morse graph. The Morse graph is able to describe the system's attractors and their corresponding regions of attraction (RoA). Furthermore, a pointwise confidence level of the global dynamics estimation over the entire state space is provided. In contrast to alternatives, the framework does not require estimation of Lyapunov functions, alleviating the need for high prediction accuracy of the GP. The framework is suitable for data-driven controllers that do not expose an analytical model as long as Lipschitz-continuity is satisfied. The method is compared against established analytical and recent machine learning alternatives for estimating RoAs, outperforming them in data efficiency without sacrificing accuracy. Link to code: https://go.rutgers.edu/49hy35en

USHER: Unbiased Sampling for Hindsight Experience Replay

Jul 03, 2022

Liam Schramm, Yunfu Deng, Edgar Granados, Abdeslam Boularias

Abstract: Dealing with sparse rewards is a long-standing challenge in reinforcement learning (RL). Hindsight Experience Replay (HER) addresses this problem by reusing failed trajectories for one goal as successful trajectories for another. This allows for both a minimum density of reward and for generalization across multiple goals. However, this strategy is known to result in a biased value function, as the update rule underestimates the likelihood of bad outcomes in a stochastic environment. We propose an asymptotically unbiased importance-sampling-based algorithm to address this problem without sacrificing performance on deterministic environments. We show its effectiveness on a range of robotic systems, including challenging high-dimensional stochastic environments.
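
The relabeling step that HER relies on, reusing a failed trajectory as a success for a different goal, can be sketched as below. This shows only plain hindsight relabeling with the final achieved state as the new goal; it does not include the importance-sampling correction that USHER proposes. The scalar states, the tuple layout, and the 0 / -1 reward convention are assumptions made for illustration.

```python
def sparse_reward(achieved, goal, tol=1e-6):
    """Assumed sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if abs(achieved - goal) <= tol else -1.0


def relabel_with_final_goal(trajectory):
    """Relabel a trajectory so that its final achieved state becomes the goal.

    trajectory: list of (state, action, next_state, original_goal) tuples
    with scalar states (a hypothetical layout for this sketch).
    Returns (state, action, next_state, new_goal, reward) tuples.
    """
    new_goal = trajectory[-1][2]  # state actually reached at the end
    relabeled = []
    for state, action, next_state, _ in trajectory:
        reward = sparse_reward(next_state, new_goal)
        relabeled.append((state, action, next_state, new_goal, reward))
    return relabeled
```

For example, a trajectory that moves 0 → 1 → 2 while aiming for goal 5 earns no success under the original goal, but after relabeling its last transition reaches the new goal 2 and receives reward 0. In a stochastic environment this selection of goals after the fact is the source of the bias the abstract describes.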

Morse Graphs: Topological Tools for Analyzing the Global Dynamics of Robot Controllers

Feb 17, 2022

Ewerton R. Vieira, Edgar Granados, Aravind Sivaramakrishnan, Marcio Gameiro, Konstantin Mischaikow, Kostas E. Bekris

Abstract: Understanding the global dynamics of a robot controller, such as identifying attractors and their regions of attraction (RoA), is important for safe deployment and synthesizing more effective hybrid controllers. This paper proposes a topological framework to analyze the global dynamics of robot controllers, even data-driven ones, in an effective and explainable way. It builds a combinatorial representation of the underlying system's state space and non-linear dynamics, which is summarized in a directed acyclic graph, the Morse graph. The approach only probes the dynamics locally by forward propagating short trajectories over a state-space discretization, and requires the dynamics to be Lipschitz-continuous. The framework is evaluated given either numerical or data-driven controllers for classical robotic benchmarks. It is compared against established analytical and recent machine learning alternatives for estimating the RoAs of such controllers. It is shown to outperform them in accuracy and efficiency. It also provides deeper insights as it describes the global dynamics up to the discretization's resolution. This allows the Morse graph to be used to identify how to synthesize controllers to form improved hybrid solutions or how to identify the physical limitations of a robotic system.
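
The idea of probing the dynamics over a grid and summarizing the result as a directed acyclic graph can be illustrated with a toy 1D example. This is a simplified sketch, not the authors' software: the map `step` is a hypothetical bistable system (stable fixed points at -1 and +1, unstable at 0), cell images are covered by the interval hull of a few sampled points rather than by a rigorous outer approximation, and strongly connected components are found by brute-force reachability.

```python
import math


def step(x):
    """Hypothetical closed-loop map: fixed points at -1, 0 (unstable) and +1."""
    return 4.0 * x / math.sqrt(1.0 + 15.0 * x * x)


def cell_graph(f, lo, hi, n_cells, samples=5):
    """Edge i -> j if cell j lies in the interval hull of the images of
    states sampled in cell i."""
    h = (hi - lo) / n_cells

    def index(x):
        return min(n_cells - 1, max(0, int(math.floor((x - lo) / h))))

    edges = {}
    for i in range(n_cells):
        a = lo + i * h
        images = [f(a + h * k / (samples - 1)) for k in range(samples)]
        edges[i] = set(range(index(min(images)), index(max(images)) + 1))
    return edges


def reachable(edges, start):
    """Cells reachable from start in one or more steps."""
    seen, stack = set(), list(edges[start])
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(edges[c])
    return seen


def morse_graph(edges):
    """Return (Morse sets, reachability order between them, attractors)."""
    reach = {i: reachable(edges, i) for i in edges}
    recurrent = [i for i in edges if i in reach[i]]
    groups = []  # recurrent strongly connected components
    for i in recurrent:
        for g in groups:
            j = next(iter(g))
            if j in reach[i] and i in reach[j]:
                g.add(i)
                break
        else:
            groups.append({i})
    morse_sets = [frozenset(g) for g in groups]
    order = {(a, b) for a in morse_sets for b in morse_sets
             if a != b and next(iter(b)) in reach[next(iter(a))]}
    # attractors are Morse sets with no path to any other Morse set
    attractors = [s for s in morse_sets if all(a != s for a, _ in order)]
    return morse_sets, order, attractors
```

Calling `morse_graph(cell_graph(step, -1.5, 1.5, 15))` on this toy system yields three single-cell Morse sets (the cells containing -1, 0, and +1), with the middle one connected to the two outer ones, which are the attractors. The `order` set records reachability between Morse sets, so on larger examples it would also contain transitive edges.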

Data-Efficient Learning of High-Quality Controls for Kinodynamic Planning used in Vehicular Navigation

Jan 06, 2022

Seth Karten, Aravind Sivaramakrishnan, Edgar Granados, Troy McMahon, Kostas E. Bekris

Abstract: This paper aims to improve the path quality and computational efficiency of kinodynamic planners used for vehicular systems. It proposes a learning framework for identifying promising controls during the expansion process of sampling-based motion planners for systems with dynamics. Offline, the learning process is trained to return the highest-quality control that reaches a local goal state (i.e., a waypoint) in the absence of obstacles from an input difference vector between its current state and a local goal state. The data generation scheme provides bounds on the target dispersion and uses state space pruning to ensure high-quality controls. By focusing on the system's dynamics, this process is data efficient and takes place once for a dynamical system, so that it can be used for different environments with modular expansion functions. This work integrates the proposed learning process with (a) an exploratory expansion function that generates waypoints with biased coverage over the reachable space, and (b) an exploitative expansion function for mobile robots, which generates waypoints using medial axis information. This paper evaluates the learning process and the corresponding planners for first- and second-order differential drive systems. The results show that the proposed integration of learning and planning can produce better quality paths than kinodynamic planning with random controls in fewer iterations and less computation time.

* Presented at the Machine Learning for Motion Planning (MLMP) Workshop at ICRA 2021, Xi'an, China

Improving Kinodynamic Planners for Vehicular Navigation with Learned Goal-Reaching Controllers

Oct 08, 2021

Aravind Sivaramakrishnan, Edgar Granados, Seth Karten, Troy McMahon, Kostas E. Bekris

Abstract: This paper aims to improve the path quality and computational efficiency of sampling-based kinodynamic planners for vehicular navigation. It proposes a learning framework for identifying promising controls during the expansion process of sampling-based planners. Given a dynamics model, a reinforcement learning process is trained offline to return a low-cost control that reaches a local goal state (i.e., a waypoint) in the absence of obstacles. By focusing on the system's dynamics and not knowing the environment, this process is data-efficient and takes place once for a robotic system. In this way, it can be reused in different environments. The planner generates local goal states online for the learned controller, first in an informed manner to bias towards the goal and then in an exploratory, random manner. For the informed expansion, local goal states are generated either via (a) medial axis information in environments with obstacles, or (b) wavefront information for setups with traversability costs. The learning process and the resulting planning framework are evaluated for a first- and a second-order differential drive system, as well as a physically simulated Segway robot. The results show that the proposed integration of learning and planning can produce higher quality paths than sampling-based kinodynamic planning with random controls in fewer iterations and less computation time.
