Aravind Sivaramakrishnan

Integrating Model-based Control and RL for Sim2Real Transfer of Tight Insertion Policies

May 17, 2025

Isidoros Marougkas, Dhruv Metha Ramesh, Joe H. Doerr, Edgar Granados, Aravind Sivaramakrishnan, Abdeslam Boularias, Kostas E. Bekris

Abstract: Object insertion under tight tolerances ($<1$ mm) is an important but challenging assembly task, as even small errors can result in undesirable contacts. Recent efforts have focused on Reinforcement Learning (RL), which often depends on careful definition of dense reward functions. This work proposes an effective strategy for such tasks that integrates traditional model-based control with RL to achieve improved insertion accuracy. The policy is trained exclusively in simulation and is zero-shot transferred to the real system. It employs a potential field-based controller to acquire a model-based policy for inserting a plug into a socket given full observability in simulation. This policy is then integrated with residual RL, which is trained in simulation given only a sparse, goal-reaching reward. A curriculum scheme over observation noise and action magnitude is used for training the residual RL policy. Both policy components use as input the SE(3) poses of both the plug and the socket and return the plug's SE(3) pose transform, which is executed by a robotic arm using a controller. The integrated policy is deployed on the real system without further training or fine-tuning, given a visual SE(3) object tracker. The proposed solution and alternatives are evaluated across a variety of objects and conditions in simulation and reality. The proposed approach outperforms recent RL-based methods in this domain and prior efforts with hybrid policies. Ablations highlight the impact of each component of the approach.
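
The combination of a potential field-based controller with a residual correction can be illustrated with a minimal sketch. This is not the authors' implementation: it works on 3D positions only (the paper uses full SE(3) poses), the residual is a hand-picked vector standing in for the output of a trained RL policy, and the names `potential_field_action`, `combined_action`, `gain`, `max_step`, and `residual_scale` are hypothetical.

```python
import math

def potential_field_action(plug_pos, socket_pos, gain=0.5, max_step=0.005):
    """Attractive potential-field step that moves the plug toward the socket.

    Positions are 3D points in meters; orientation is ignored in this sketch.
    """
    delta = [s - p for p, s in zip(plug_pos, socket_pos)]
    step = [gain * d for d in delta]
    norm = math.sqrt(sum(c * c for c in step))
    if norm > max_step:  # cap the step length
        step = [c * max_step / norm for c in step]
    return step

def combined_action(base_action, residual, residual_scale=0.001):
    """Add a bounded residual correction to the model-based action.

    The residual is clipped to [-1, 1] per axis and scaled, so it can only
    perturb the base action by at most `residual_scale` meters per axis.
    """
    clipped = [max(-1.0, min(1.0, r)) for r in residual]
    return [b + residual_scale * r for b, r in zip(base_action, clipped)]

# Hypothetical poses (positions only) and a stand-in residual output.
plug = [0.0, 0.0, 0.10]
socket = [0.002, -0.001, 0.0]
base = potential_field_action(plug, socket)
action = combined_action(base, residual=[0.3, -2.0, 0.0])
```

Bounding the residual keeps the model-based action dominant. A curriculum over action magnitude, as mentioned in the abstract, could plausibly be realized by scheduling a parameter like `residual_scale` during training, but that mapping is an assumption here.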

PROBE: Proprioceptive Obstacle Detection and Estimation while Navigating in Clutter

May 17, 2025

Dhruv Metha Ramesh, Aravind Sivaramakrishnan, Shreesh Keskar, Kostas E. Bekris, Jingjin Yu, Abdeslam Boularias

Abstract: In critical applications, including search-and-rescue in degraded environments, blockages can be prevalent and prevent the effective deployment of certain sensing modalities, particularly vision, due to occlusion and the constrained range of view of onboard camera sensors. To enable robots to tackle these challenges, we propose a new approach, Proprioceptive Obstacle Detection and Estimation while navigating in clutter (PROBE), which instead relies only on the robot's proprioception to infer the presence or absence of occluded rectangular obstacles while predicting their dimensions and poses in SE(2). The proposed approach is a Transformer neural network that receives as input a history of applied torques and sensed whole-body movements of the robot and returns a parameterized representation of the obstacles in the environment. The effectiveness of PROBE is evaluated on simulated environments in Isaac Gym and with a real Unitree Go1 quadruped robot.

KRAFT: Sampling-Based Kinodynamic Replanning and Feedback Control over Approximate, Identified Models of Vehicular Systems

Sep 17, 2024

Aravind Sivaramakrishnan, Sumanth Tangirala, Dhruv Metha Ramesh, Edgar Granados, Kostas E. Bekris

Abstract: This paper aims to increase the safety and reliability of executing trajectories planned for robots with non-trivial dynamics given a light-weight, approximate dynamics model. Scenarios include mobile robots navigating through workspaces with imperfectly modeled surfaces and unknown friction. The proposed approach, Kinodynamic Replanning over Approximate Models with Feedback Tracking (KRAFT), integrates: (i) replanning via an asymptotically optimal sampling-based kinodynamic tree planner, (ii) trajectory following via feedback control, and (iii) a safety mechanism to reduce collisions due to second-order dynamics. The planning and control components use a rough dynamics model expressed analytically via differential equations, which is tuned via system identification (SysID) in a training environment but not in the deployment environment. This allows the process to be fast and achieve long-horizon reasoning during each replanning cycle. At the same time, the model still includes gaps with reality, even after SysID, in new environments. Experiments demonstrate the limitations of kinematic path planning and path tracking approaches, highlighting the importance of: (a) closing the feedback loop also at the planning level; and (b) long-horizon reasoning, for safe and efficient trajectory execution given inaccurate models.

Roadmaps with Gaps over Controllers: Achieving Efficiency in Planning under Dynamics

Oct 05, 2023

Aravind Sivaramakrishnan, Noah R. Carver, Sumanth Tangirala, Kostas E. Bekris

Abstract: This paper aims to improve the computational efficiency of motion planning for mobile robots with non-trivial dynamics by taking advantage of learned controllers. It adopts a decoupled strategy, where a system-specific controller is first trained offline in an empty environment to deal with the system's dynamics. For an environment, the proposed approach constructs offline a data structure, a "Roadmap with Gaps," to approximately learn how to solve planning queries in this environment using the learned controller. Its nodes correspond to local regions and edges correspond to applications of the learned control policy that approximately connect these regions. Gaps arise due to the controller not perfectly connecting pairs of individual states along edges. Online, given a query, a tree sampling-based motion planner uses the roadmap so that the tree's expansion is informed towards the goal region. The tree expansion selects local subgoals given a wavefront on the roadmap that guides towards the goal. When the controller cannot reach a subgoal region, the planner resorts to random exploration to maintain probabilistic completeness and asymptotic optimality. The experimental evaluation shows that the approach significantly improves the computational efficiency of motion planning on various benchmarks, including physics-based vehicular models on uneven and varying friction terrains as well as a quadrotor under air pressure effects.
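
The idea of a wavefront on a roadmap that guides subgoal selection can be sketched as a breadth-first search from the goal node. This is only an illustration under simplifying assumptions: the roadmap is a small hand-written directed graph, cost-to-go is counted in hops rather than in the edge costs a real planner would use, and the function names `wavefront` and `next_subgoal` are hypothetical.

```python
from collections import deque

def wavefront(roadmap, goal):
    """Breadth-first hop counts from every node that can reach `goal`.

    `roadmap` maps each node to the nodes reachable from it by one
    application of the controller (directed edges).
    """
    # Reverse the edges so that the search can start from the goal.
    reverse = {n: [] for n in roadmap}
    for u, nbrs in roadmap.items():
        for v in nbrs:
            reverse.setdefault(v, []).append(u)
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        v = queue.popleft()
        for u in reverse.get(v, []):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def next_subgoal(roadmap, dist, node):
    """Pick the neighbor with the smallest cost-to-go, or None if none exists."""
    candidates = [v for v in roadmap.get(node, []) if v in dist]
    return min(candidates, key=dist.get) if candidates else None

# Hypothetical roadmap: node E is a dead end that cannot reach the goal G.
roadmap = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"],
           "D": ["G"], "E": [], "G": []}
dist = wavefront(roadmap, "G")
subgoal_from_c = next_subgoal(roadmap, dist, "C")
```

When `next_subgoal` returns `None`, or the controller fails to reach the chosen region, a planner in the spirit of the abstract would fall back to random exploration; that fallback is not shown here.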

MORALS: Analysis of High-Dimensional Robot Controllers via Topological Tools in a Latent Space

Oct 05, 2023

Ewerton R. Vieira, Aravind Sivaramakrishnan, Sumanth Tangirala, Edgar Granados, Konstantin Mischaikow, Kostas E. Bekris

Abstract: Estimating the region of attraction (RoA) for a robotic system's controller is essential for safe application and controller composition. Many existing methods require access to a closed-form expression, which limits applicability to data-driven controllers. Methods that operate only over trajectory rollouts tend to be data-hungry. In prior work, we have demonstrated that topological tools based on Morse Graphs offer data-efficient RoA estimation without needing an analytical model. They struggle, however, with high-dimensional systems as they operate over a discretization of the state space. This paper presents Morse Graph-aided discovery of Regions of Attraction in a learned Latent Space (MORALS). The approach combines autoencoding neural networks with Morse Graphs. MORALS shows promising predictive capabilities in estimating attractors and their RoAs for data-driven controllers operating over high-dimensional systems, including a 67-dim humanoid robot and a 96-dim 3-fingered manipulator. It first projects the dynamics of the controlled system into a learned latent space. Then, it constructs a reduced form of Morse Graphs representing the bistability of the underlying dynamics, i.e., detecting when the controller results in a desired versus an undesired behavior. The evaluation on high-dimensional robotic datasets indicates the data efficiency of the approach in RoA estimation.

* The first two authors contributed equally to this paper

A Survey on the Integration of Machine Learning with Sampling-based Motion Planning

Nov 15, 2022

Troy McMahon, Aravind Sivaramakrishnan, Edgar Granados, Kostas E. Bekris

Abstract: Sampling-based methods are widely adopted solutions for robot motion planning. The methods are straightforward to implement and effective in practice for many robotic systems. It is often possible to prove that they have desirable properties, such as probabilistic completeness and asymptotic optimality. Nevertheless, they still face challenges as the complexity of the underlying planning problem increases, especially under tight computation time constraints, which impact the quality of returned solutions, or given inaccurate models. This has motivated the use of machine learning to improve the computational efficiency and applicability of Sampling-Based Motion Planners (SBMPs). This survey reviews such integrative efforts and aims to provide a classification of the alternative directions that have been explored in the literature. It first discusses how learning has been used to enhance key components of SBMPs, such as node sampling, collision detection, distance or nearest neighbor computation, local planning, and termination conditions. Then, it highlights planners that use learning to adaptively select between different implementations of such primitives in response to the underlying problem's features. It also covers emerging methods, which build complete machine learning pipelines that reflect the traditional structure of SBMPs. It also discusses how machine learning has been used to provide data-driven models of robots, which can then be used by an SBMP. Finally, it provides a comparative discussion of the advantages and disadvantages of the approaches covered, and insights on possible future directions of research. An online version of this survey can be found at: https://prx-kinodynamic.github.io/

* Foundations and Trends in Robotics: Vol. 9: No. 4, pp 266-327 (2022)
* First two authors contributed equally

Data-Efficient Characterization of the Global Dynamics of Robot Controllers with Confidence Guarantees

Oct 04, 2022

Ewerton R. Vieira, Aravind Sivaramakrishnan, Yao Song, Edgar Granados, Marcio Gameiro, Konstantin Mischaikow, Ying Hung, Kostas E. Bekris

Abstract: This paper proposes an integration of surrogate modeling and topology to significantly reduce the amount of data required to describe the underlying global dynamics of robot controllers, including closed-box ones. A Gaussian Process (GP), trained with randomized short trajectories over the state-space, acts as a surrogate model for the underlying dynamical system. Then, a combinatorial representation is built and used to describe the dynamics in the form of a directed acyclic graph, known as Morse graph. The Morse graph is able to describe the system's attractors and their corresponding regions of attraction (RoA). Furthermore, a pointwise confidence level of the global dynamics estimation over the entire state space is provided. In contrast to alternatives, the framework does not require estimation of Lyapunov functions, alleviating the need for high prediction accuracy of the GP. The framework is suitable for data-driven controllers that do not expose an analytical model as long as Lipschitz-continuity is satisfied. The method is compared against established analytical and recent machine learning alternatives for estimating RoAs, outperforming them in data efficiency without sacrificing accuracy. Link to code: https://go.rutgers.edu/49hy35en

Morse Graphs: Topological Tools for Analyzing the Global Dynamics of Robot Controllers

Feb 17, 2022

Ewerton R. Vieira, Edgar Granados, Aravind Sivaramakrishnan, Marcio Gameiro, Konstantin Mischaikow, Kostas E. Bekris

Abstract: Understanding the global dynamics of a robot controller, such as identifying attractors and their regions of attraction (RoA), is important for safe deployment and synthesizing more effective hybrid controllers. This paper proposes a topological framework to analyze the global dynamics of robot controllers, even data-driven ones, in an effective and explainable way. It builds a combinatorial representation of the underlying system's state space and non-linear dynamics, which is summarized in a directed acyclic graph, the Morse graph. The approach only probes the dynamics, which need to be Lipschitz-continuous, locally by forward propagating short trajectories over a state-space discretization. The framework is evaluated given either numerical or data-driven controllers for classical robotic benchmarks. It is compared against established analytical and recent machine learning alternatives for estimating the RoAs of such controllers. It is shown to outperform them in accuracy and efficiency. It also provides deeper insights as it describes the global dynamics up to the discretization's resolution. This allows using the Morse graph to identify how to synthesize controllers to form improved hybrid solutions or how to identify the physical limitations of a robotic system.
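
A toy version of this pipeline (discretize the state space, map each cell forward, then summarize the recurrent parts in a directed acyclic graph) can be sketched for a 1D system. This is not the authors' code, and it simplifies heavily: the example map `tanh(4x)` is a hypothetical stand-in for a controlled system, the cell image is computed from the cell endpoints under the assumption that the map is monotonically increasing (rather than from a Lipschitz bound), and the returned edges are the full reachability relation between recurrent sets.

```python
import math

def cell_map(f, lo, hi, n):
    """Outer-approximate f on n equal cells of [lo, hi].

    Assumes f is monotonically increasing, so the image of a cell
    [a, b] is the interval [f(a), f(b)].
    """
    h = (hi - lo) / n
    def index(x):
        return min(n - 1, max(0, int((x - lo) // h)))
    graph = {}
    for i in range(n):
        a, b = lo + i * h, lo + (i + 1) * h
        graph[i] = list(range(index(f(a)), index(f(b)) + 1))
    return graph

def strongly_connected_components(graph):
    """Tarjan's algorithm (recursive, which is fine for small graphs)."""
    index, low, on_stack, stack, comps = {}, {}, set(), [], []
    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            comps.append(comp)
    for v in graph:
        if v not in index:
            visit(v)
    return comps

def morse_graph(graph):
    """Return (morse_sets, edges): recurrent components and their reachability."""
    comps = strongly_connected_components(graph)
    # A component is recurrent if it contains a cycle (including a self-loop).
    morse_sets = sorted(sorted(c) for c in comps
                        if len(c) > 1 or c[0] in graph[c[0]])
    def reachable(src):
        seen, todo = set(src), list(src)
        while todo:
            v = todo.pop()
            for w in graph[v]:
                if w not in seen:
                    seen.add(w)
                    todo.append(w)
        return seen
    edges = set()
    for i, m in enumerate(morse_sets):
        seen = reachable(m)
        for j, other in enumerate(morse_sets):
            if i != j and other[0] in seen:
                edges.add((i, j))
    return morse_sets, edges

# Bistable toy map: unstable fixed point at 0, stable ones near -1 and +1.
graph = cell_map(lambda x: math.tanh(4.0 * x), -1.5, 1.5, 31)
morse_sets, edges = morse_graph(graph)
# Attractors are the recurrent sets with no outgoing edge.
attractors = [m for i, m in enumerate(morse_sets)
              if all(src != i for src, _ in edges)]
```

In this example the recurrent sets are the three cells containing the fixed points, and the cell around the unstable fixed point has edges to both attractors. On a coarser grid or a slower map, neighboring cells can acquire spurious self-loops, which is one way to see why the description holds only up to the discretization's resolution.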

Data-Efficient Learning of High-Quality Controls for Kinodynamic Planning used in Vehicular Navigation

Jan 06, 2022

Seth Karten, Aravind Sivaramakrishnan, Edgar Granados, Troy McMahon, Kostas E. Bekris

Abstract: This paper aims to improve the path quality and computational efficiency of kinodynamic planners used for vehicular systems. It proposes a learning framework for identifying promising controls during the expansion process of sampling-based motion planners for systems with dynamics. Offline, the learning process is trained to return the highest-quality control that reaches a local goal state (i.e., a waypoint) in the absence of obstacles from an input difference vector between its current state and a local goal state. The data generation scheme provides bounds on the target dispersion and uses state space pruning to ensure high-quality controls. By focusing on the system's dynamics, this process is data efficient and takes place once for a dynamical system, so that it can be used for different environments with modular expansion functions. This work integrates the proposed learning process with a) an exploratory expansion function that generates waypoints with biased coverage over the reachable space, and b) an exploitative expansion function for mobile robots, which generates waypoints using medial axis information. This paper evaluates the learning process and the corresponding planners for first- and second-order differential drive systems. The results show that the proposed integration of learning and planning can produce better quality paths than kinodynamic planning with random controls in fewer iterations and less computation time.

* Machine Learning for Motion Planning (MLMP) Workshop at ICRA 2021, Xi'an, China

Improving Kinodynamic Planners for Vehicular Navigation with Learned Goal-Reaching Controllers

Oct 08, 2021

Aravind Sivaramakrishnan, Edgar Granados, Seth Karten, Troy McMahon, Kostas E. Bekris

Abstract: This paper aims to improve the path quality and computational efficiency of sampling-based kinodynamic planners for vehicular navigation. It proposes a learning framework for identifying promising controls during the expansion process of sampling-based planners. Given a dynamics model, a reinforcement learning process is trained offline to return a low-cost control that reaches a local goal state (i.e., a waypoint) in the absence of obstacles. By focusing on the system's dynamics and not knowing the environment, this process is data-efficient and takes place once for a robotic system. In this way, it can be reused in different environments. The planner generates online local goal states for the learned controller, first in an informed manner to bias towards the goal and then in an exploratory, random manner. For the informed expansion, local goal states are generated either via (a) medial axis information in environments with obstacles, or (b) wavefront information for setups with traversability costs. The learning process and the resulting planning framework are evaluated for first- and second-order differential drive systems, as well as a physically simulated Segway robot. The results show that the proposed integration of learning and planning can produce higher quality paths than sampling-based kinodynamic planning with random controls in fewer iterations and less computation time.
