Sayantan Auddy

ASIAA, NCTS Physics Division

Safe Continual Domain Adaptation after Sim2Real Transfer of Reinforcement Learning Policies in Robotics

Mar 13, 2025

Josip Josifovski, Shangding Gu, Mohammadhossein Malmir, Haoliang Huang, Sayantan Auddy, Nicolás Navarro-Guerrero, Costas Spanos, Alois Knoll

Abstract: Domain randomization has emerged as a fundamental technique in reinforcement learning (RL) to facilitate the transfer of policies from simulation to real-world robotic applications. Many existing domain randomization approaches have been proposed to improve robustness and sim2real transfer. These approaches rely on wide randomization ranges to compensate for the unknown actual system parameters, leading to robust but inefficient real-world policies. In addition, the policies pretrained in the domain-randomized simulation are fixed after deployment due to the inherent instability of the optimization processes based on RL and the necessity of sampling exploitative but potentially unsafe actions on the real system. This limits the adaptability of the deployed policy to the inevitably changing system parameters or environment dynamics over time. We leverage safe RL and continual learning under domain-randomized simulation to address these limitations and enable safe deployment-time policy adaptation in real-world robot control. The experiments show that our method enables the policy to adapt and fit to the current domain distribution and environment dynamics of the real system while minimizing safety risks and avoiding issues like catastrophic forgetting of the general policy found in randomized simulation during the pretraining phase. Videos and supplementary material are available at https://safe-cda.github.io/.

* 8 pages, 5 figures, under review

Direct Imitation Learning-based Visual Servoing using the Large Projection Formulation

Jun 13, 2024

Sayantan Auddy, Antonio Paolillo, Justus Piater, Matteo Saveriano

Abstract: Today, robots must be safe, versatile, and user-friendly to operate in unstructured and human-populated environments. Dynamical system-based imitation learning enables robots to perform complex tasks stably and without explicit programming, greatly simplifying their real-world deployment. To exploit the full potential of these systems, it is crucial to implement closed loops that use visual feedback. Vision makes it possible to cope with environmental changes but is complex to handle due to the high dimensionality of the image space. This study introduces a dynamical system-based imitation learning approach for direct visual servoing. It leverages off-the-shelf deep learning-based perception backbones to extract robust features from the raw input image, and an imitation learning strategy to execute sophisticated robot motions. The learning blocks are integrated using the large projection task priority formulation. As demonstrated through extensive experimental analysis, the proposed method realizes complex tasks with a robotic manipulator.

* First two authors contributed equally

Continual Domain Randomization

Mar 18, 2024

Josip Josifovski, Sayantan Auddy, Mohammadhossein Malmir, Justus Piater, Alois Knoll, Nicolás Navarro-Guerrero

Abstract: Domain Randomization (DR) is commonly used for sim2real transfer of reinforcement learning (RL) policies in robotics. Most DR approaches require a simulator with a fixed set of tunable parameters from the start of the training, from which the parameters are randomized simultaneously to train a robust model for use in the real world. However, the combined randomization of many parameters increases the task difficulty and might result in sub-optimal policies. To address this problem and to provide a more flexible training process, we propose Continual Domain Randomization (CDR) for RL that combines domain randomization with continual learning to enable sequential training in simulation on a subset of randomization parameters at a time. Starting from a model trained in a non-randomized simulation where the task is easier to solve, the model is trained on a sequence of randomizations, and continual learning is employed to remember the effects of previous randomizations. Our experiments on robotic reaching and grasping tasks show that the model trained in this fashion learns effectively in simulation and performs robustly on the real robot while matching or outperforming baselines that employ combined randomization or sequential randomization without continual learning. Our code and videos are available at https://continual-dr.github.io/.
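The sequential scheme described in this abstract can be illustrated with a minimal sketch. It only shows how simulation parameters might be sampled when one subset is randomized at a time, starting from a non-randomized stage. The parameter names, nominal values, and ranges are hypothetical, and the continual learning step that retains earlier randomizations is deliberately left out; this is not the authors' implementation.

```python
import random

# Hypothetical nominal simulator parameters and randomization ranges.
NOMINAL = {"friction": 1.0, "mass": 0.5, "latency": 0.02}
RANGES = {"friction": (0.5, 1.5), "mass": (0.3, 0.7), "latency": (0.0, 0.05)}


def sample_parameters(active, rng):
    """Randomize only the parameters in `active`; keep the rest at nominal values."""
    return {
        name: rng.uniform(*RANGES[name]) if name in active else NOMINAL[name]
        for name in NOMINAL
    }


def training_stages(order):
    """Yield the active subset per stage: first none, then one parameter at a time."""
    yield set()
    for name in order:
        yield {name}


rng = random.Random(0)
for stage in training_stages(["friction", "mass", "latency"]):
    params = sample_parameters(stage, rng)
    # A real pipeline would train the policy on `params` here and apply a
    # continual learning method before moving on to the next stage.
    print(sorted(stage), params)
```

In contrast, combined randomization would correspond to a single stage in which all three parameters are active at once.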

* Under peer review

Effect of Optimizer, Initializer, and Architecture of Hypernetworks on Continual Learning from Demonstration

Dec 31, 2023

Sayantan Auddy, Sebastian Bergner, Justus Piater

Abstract: In continual learning from demonstration (CLfD), a robot learns a sequence of real-world motion skills continually from human demonstrations. Recently, hypernetworks have been successful in solving this problem. In this paper, we perform an exploratory study of the effects of different optimizers, initializers, and network architectures on the continual learning performance of hypernetworks for CLfD. Our results show that adaptive learning rate optimizers work well, but initializers specially designed for hypernetworks offer no advantages for CLfD. We also show that hypernetworks that are capable of stable trajectory predictions are robust to different network architectures. Our open-source code is available at https://github.com/sebastianbergner/ExploringCLFD.

Scalable and Efficient Continual Learning from Demonstration via Hypernetwork-generated Stable Dynamics Model

Nov 06, 2023

Sayantan Auddy, Jakob Hollenstein, Matteo Saveriano, Antonio Rodríguez-Sánchez, Justus Piater

Abstract: Learning from demonstration (LfD) provides an efficient way to train robots. The learned motions should be convergent and stable, but to be truly effective in the real world, LfD-capable robots should also be able to remember multiple motion skills. Multi-skill retention is a capability missing from existing stable-LfD approaches. On the other hand, recent work on continual-LfD has shown that hypernetwork-generated neural ordinary differential equation solvers can learn multiple LfD tasks sequentially, but this approach lacks stability guarantees. We propose an approach for stable continual-LfD in which a hypernetwork generates two networks: a trajectory learning dynamics model, and a trajectory stabilizing Lyapunov function. The introduction of stability not only generates stable trajectories but also greatly improves continual learning performance, especially in the size-efficient chunked hypernetworks. With our approach, we can continually train a single model to predict the position and orientation trajectories of the robot's end-effector simultaneously for multiple real-world tasks without retraining on past demonstrations. We also propose stochastic regularization with a single randomly sampled regularization term in hypernetworks, which reduces the cumulative training time cost for $N$ tasks from $\mathcal{O}(N^2)$ to $\mathcal{O}(N)$ without any loss in performance in real-world tasks. We empirically evaluate our approach on the popular LASA dataset, on high-dimensional extensions of LASA (including up to 32 dimensions) to assess scalability, and on a novel extended robotic task dataset (RoboTasks9) to assess real-world performance. In trajectory error metrics, stability metrics, and continual learning metrics, our approach performs favorably compared to other baselines. Code and datasets will be shared after submission.
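The claimed drop from $\mathcal{O}(N^2)$ to $\mathcal{O}(N)$ can be checked with a small counting sketch. It assumes that the cost of one training step is proportional to the number of regularization terms evaluated, that every task is trained for the same number of steps, and that full regularization uses one term per previously learned task. The function name is hypothetical and the code only counts terms; it is not the authors' training code.

```python
def regularization_terms(num_tasks: int, sample_single: bool) -> int:
    """Count regularization terms per optimization step, summed over all tasks."""
    total = 0
    for task in range(1, num_tasks + 1):
        previous = task - 1  # tasks learned before the current one
        if sample_single:
            # One randomly sampled past task, if any past task exists.
            total += min(previous, 1)
        else:
            # One term for every past task.
            total += previous
    return total


for n in (5, 10, 20):
    full = regularization_terms(n, sample_single=False)    # N(N-1)/2 terms
    sampled = regularization_terms(n, sample_single=True)  # N-1 terms
    print(n, full, sampled)
```

Under these assumptions, the full scheme sums to $N(N-1)/2$ terms, while the sampled scheme sums to $N-1$.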

* This paper is currently under internal review

GRINN: A Physics-Informed Neural Network for solving hydrodynamic systems in the presence of self-gravity

Aug 15, 2023

Sayantan Auddy, Ramit Dey, Neal J. Turner, Shantanu Basu

Abstract: Modeling self-gravitating gas flows is essential to answering many fundamental questions in astrophysics. This spans many topics including planet-forming disks, star-forming clouds, galaxy formation, and the development of large-scale structures in the Universe. However, the nonlinear interaction between gravity and fluid dynamics offers a formidable challenge to solving the resulting time-dependent partial differential equations (PDEs) in three dimensions (3D). By leveraging the universal approximation capabilities of a neural network within a mesh-free framework, physics-informed neural networks (PINNs) offer a new way of addressing this challenge. We introduce the gravity-informed neural network (GRINN), a PINN-based code, to simulate 3D self-gravitating hydrodynamic systems. Here, we specifically study gravitational instability and wave propagation in an isothermal gas. Our results match a linear analytic solution to within 1% in the linear regime and a conventional grid code solution to within 5% as the disturbance grows into the nonlinear regime. We find that the computation time of the GRINN does not scale with the number of dimensions. This is in contrast to the scaling of the grid-based code for the hydrodynamic and self-gravity calculations as the number of dimensions is increased. Our results show that the GRINN computation time is longer than the grid code in one- and two-dimensional calculations but is an order of magnitude shorter than the grid code in 3D with similar accuracy. Physics-informed neural networks like GRINN thus show promise for advancing our ability to model 3D astrophysical flows.
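For context, the textbook linear result for gravitational instability in a uniform isothermal gas is the Jeans dispersion relation, given here for perturbations proportional to $\exp[i(kx - \omega t)]$. The abstract does not spell out the exact analytic solution used for comparison, so this is the standard relation under the assumption of a uniform background density $\rho_0$ and isothermal sound speed $c_s$, not necessarily the paper's precise setup.

```latex
\omega^2 = c_s^2 k^2 - 4 \pi G \rho_0,
\qquad
\lambda_J = c_s \sqrt{\frac{\pi}{G \rho_0}}
```

Perturbations with wavelength $\lambda > \lambda_J$ have $\omega^2 < 0$ and grow exponentially, while shorter wavelengths propagate as sound waves modified by self-gravity.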

Action Noise in Off-Policy Deep Reinforcement Learning: Impact on Exploration and Performance

Jun 08, 2022

Jakob Hollenstein, Sayantan Auddy, Matteo Saveriano, Erwan Renaudo, Justus Piater

Abstract: Many deep reinforcement learning algorithms rely on simple forms of exploration, such as the additive action-noise often used in continuous control domains. Typically, the scaling factor of this action noise is chosen as a hyper-parameter and kept constant during training. In this paper, we analyze how the learned policy is impacted by the noise type, the noise scale, and the reduction of the scaling factor over time. We consider the two most prominent types of action-noise: Gaussian and Ornstein-Uhlenbeck noise, and perform a vast experimental campaign by systematically varying the noise type and scale parameter, and by measuring variables of interest like the expected return of the policy and the state space coverage during exploration. For the latter, we propose a novel state-space coverage measure $\operatorname{X}_{\mathcal{U}\text{rel}}$ that is more robust to boundary artifacts than previously proposed measures. Larger noise scales generally increase state space coverage. However, we found that increasing the state space coverage using a larger noise scale is often not beneficial. On the contrary, reducing the noise-scale over the training process reduces the variance and generally improves the learning performance. We conclude that the best noise-type and scale are environment dependent, and based on our observations, derive heuristic rules for guiding the choice of the action noise as a starting point for further optimization.
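The two noise types and the idea of reducing the scale over training can be sketched in a few lines. The discretization of the Ornstein-Uhlenbeck process, the default `theta=0.15`, and the linear decay schedule are illustrative assumptions and are not taken from the paper; all function names are hypothetical.

```python
import math
import random


def gaussian_noise(steps, scale, rng):
    """Independent zero-mean Gaussian samples, one per time step."""
    return [rng.gauss(0.0, scale) for _ in range(steps)]


def ornstein_uhlenbeck_noise(steps, scale, rng, theta=0.15, dt=1.0):
    """Temporally correlated noise from a discretized Ornstein-Uhlenbeck process."""
    x = 0.0
    samples = []
    for _ in range(steps):
        # Mean-reverting drift toward zero plus a scaled Gaussian increment.
        x += -theta * x * dt + scale * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


def linear_schedule(initial_scale, final_scale, step, total_steps):
    """Noise scale that changes linearly from initial_scale to final_scale."""
    fraction = min(step / total_steps, 1.0)
    return initial_scale + fraction * (final_scale - initial_scale)


rng = random.Random(0)
white = gaussian_noise(5, 0.2, rng)
correlated = ornstein_uhlenbeck_noise(5, 0.2, rng)
print(white)
print(correlated)
print(linear_schedule(0.5, 0.05, step=500, total_steps=1000))
```

In an off-policy agent, a sample of either kind would be added to the deterministic action before it is clipped to the action bounds; consecutive Ornstein-Uhlenbeck samples are correlated, whereas Gaussian samples are independent.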

Using Bayesian Deep Learning to infer Planet Mass from Gaps in Protoplanetary Disks

Feb 23, 2022

Sayantan Auddy, Ramit Dey, Min-Kai Lin, Daniel Carrera, Jacob B. Simon

Abstract: Planet-induced sub-structures, like annular gaps, observed in dust emission from protoplanetary disks provide a unique probe to characterize unseen young planets. While deep learning-based models have an edge in characterizing the planet's properties over traditional methods, like customized simulations and empirical relations, they lack the ability to quantify the uncertainty associated with their predictions. In this paper, we introduce a Bayesian deep learning network "DPNNet-Bayesian" that can predict planet mass from disk gaps and provides uncertainties associated with the prediction. A unique feature of our approach is that it can distinguish between the uncertainty associated with the deep learning architecture and uncertainty inherent in the input data due to measurement noise. The model is trained on a data set generated from disk-planet simulations using the FARGO3D hydrodynamics code with a newly implemented fixed grain size module and improved initial conditions. The Bayesian framework enables estimating a gauge/confidence interval over the validity of the prediction when applied to unknown observations. As a proof-of-concept, we apply DPNNet-Bayesian to dust gaps observed in HL Tau. The network predicts masses of $86.0 \pm 5.5\,M_{\oplus}$, $43.8 \pm 3.3\,M_{\oplus}$, and $92.2 \pm 5.1\,M_{\oplus}$ respectively, which are comparable to other studies based on specialized simulations.
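One common way to separate model-related (epistemic) uncertainty from data-related (aleatoric) uncertainty is to draw several stochastic forward passes, each returning a predicted mean and variance, and then apply the law of total variance. The sketch below assumes that setup; the abstract does not state that DPNNet-Bayesian uses exactly this decomposition, and the function name and numbers are purely illustrative.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(means, variances):
    """Split predictive variance into epistemic and aleatoric parts.

    means:     predicted mean from each stochastic forward pass
    variances: predicted data-noise variance from each forward pass
    """
    prediction = fmean(means)
    epistemic = pvariance(means)  # spread of the means across passes
    aleatoric = fmean(variances)  # average predicted data noise
    return prediction, epistemic, aleatoric


# Illustrative values only, not results from the paper.
prediction, epistemic, aleatoric = decompose_uncertainty(
    means=[84.0, 86.0, 88.0], variances=[20.0, 25.0, 30.0]
)
total_variance = epistemic + aleatoric
print(prediction, epistemic, aleatoric, total_variance)
```

The epistemic term shrinks as the model sees more training data, while the aleatoric term reflects measurement noise that additional training cannot remove.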

* 14 pages, 6 figures, submitted to ApJ

Continual Learning from Demonstration of Robotic Skills

Feb 15, 2022

Sayantan Auddy, Jakob Hollenstein, Matteo Saveriano, Antonio Rodríguez-Sánchez, Justus Piater

Abstract: Methods for teaching motion skills to robots focus on training for a single skill at a time. Robots capable of learning from demonstration can considerably benefit from the added ability to learn new movements without forgetting past knowledge. To this end, we propose an approach for continual learning from demonstration using hypernetworks and neural ordinary differential equation solvers. We empirically demonstrate the effectiveness of our approach in remembering long sequences of trajectory learning tasks without the need to store any data from past demonstrations. Our results show that hypernetworks outperform other state-of-the-art regularization-based continual learning approaches for learning from demonstration. In our experiments, we use the popular LASA trajectory benchmark, and a new dataset of kinesthetic demonstrations that we introduce in this paper called the HelloWorld dataset. We evaluate our approach using both trajectory error metrics and continual learning metrics, and we propose two new continual learning metrics. Our code, along with the newly collected dataset, is available at https://github.com/sayantanauddy/clfd.

* This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

DPNNet-2.0 Part I: Finding hidden planets from simulated images of protoplanetary disk gaps

Jul 19, 2021

Sayantan Auddy, Ramit Dey, Min-Kai Lin, Cassandra Hall

Abstract: The observed sub-structures, like annular gaps, in dust emission from protoplanetary disks are often interpreted as signatures of embedded planets. Fitting a model of planetary gaps to these observed features using customized simulations or empirical relations can reveal the characteristics of the hidden planets. However, customized fitting is often impractical owing to the increasing sample size and the complexity of disk-planet interaction. In this paper, we introduce the architecture of DPNNet-2.0, second in the series after DPNNet \citep{aud20}, designed using a Convolutional Neural Network (CNN, here specifically ResNet50) for predicting exoplanet masses directly from simulated images of protoplanetary disks hosting a single planet. DPNNet-2.0 additionally consists of a multi-input framework that uses both a CNN and a multi-layer perceptron (a class of artificial neural network) for processing image and disk parameters simultaneously. This enables DPNNet-2.0 to be trained using images directly, with the added option of considering disk parameters (disk viscosities, disk temperatures, disk surface density profiles, dust abundances, and particle Stokes numbers) generated from disk-planet hydrodynamic simulations as inputs. This work provides the required framework and is the first step towards the use of computer vision (implementing CNN) to directly extract the mass of an exoplanet from planetary gaps observed in dust-surface density maps by telescopes such as the Atacama Large (sub-)Millimeter Array.

* 15 pages, 10 figures, to appear in ApJ
