Joe Watson

Towards Safe Robot Foundation Models Using Inductive Biases

May 15, 2025

Maximilian Tölle, Theo Gruner, Daniel Palenicek, Tim Schneider, Jonas Günster, Joe Watson, Davide Tateo, Puze Liu, Jan Peters

Abstract:Safety is a critical requirement for the real-world deployment of robotic systems. Unfortunately, while current robot foundation models show promising generalization capabilities across a wide variety of tasks, they fail to address safety, an important aspect for ensuring long-term operation. Current robot foundation models assume that safe behavior should emerge by learning from a sufficiently large dataset of demonstrations. However, this approach has two clear major drawbacks. Firstly, there are no formal safety guarantees for a behavior cloning policy trained using supervised learning. Secondly, without explicit knowledge of any safety constraints, the policy may require an unreasonable number of additional demonstrations to even approximate the desired constrained behavior. To solve these key issues, we show how we can instead combine robot foundation models with geometric inductive biases using ATACOM, a safety layer placed after the foundation policy that ensures safe state transitions by enforcing action constraints. With this approach, we can ensure formal safety guarantees for generalist policies without providing extensive demonstrations of safe behavior, and without requiring any specific fine-tuning for safety. Our experiments show that our approach can be beneficial both for classical manipulation tasks, where we avoid unwanted collisions with irrelevant objects, and for dynamic tasks, such as the robot air hockey environment, where we can generate fast trajectories respecting complex tasks and joint space constraints.

* 14 pages, 5 figures

Towards Safe Robot Foundation Models

Mar 10, 2025

Maximilian Tölle, Theo Gruner, Daniel Palenicek, Jonas Günster, Puze Liu, Joe Watson, Davide Tateo, Jan Peters

Abstract:Robot foundation models hold the potential for deployment across diverse environments, from industrial applications to household tasks. While current research focuses primarily on the policies' generalization capabilities across a variety of tasks, it fails to address safety, a critical requirement for deployment on real-world systems. In this paper, we introduce a safety layer designed to constrain the action space of any generalist policy appropriately. Our approach uses ATACOM, a safe reinforcement learning algorithm that creates a safe action space and, therefore, ensures safe state transitions. By extending ATACOM to generalist policies, our method facilitates their deployment in safety-critical scenarios without requiring any specific safety fine-tuning. We demonstrate the effectiveness of this safety layer in an air hockey environment, where it prevents a puck-hitting agent from colliding with its surroundings, a failure observed in generalist policies.

Global Tensor Motion Planning

Nov 28, 2024

An T. Le, Kay Hansel, João Carvalho, Joe Watson, Julen Urain, Armin Biess, Georgia Chalvatzaki, Jan Peters

Abstract:Batch planning is increasingly crucial for the scalability of robotics tasks and dataset generation diversity. This paper presents Global Tensor Motion Planning (GTMP) -- a sampling-based motion planning algorithm comprising only tensor operations. We introduce a novel discretization structure represented as a random multipartite graph, enabling efficient vectorized sampling, collision checking, and search. We provide an early theoretical investigation showing that GTMP exhibits probabilistic completeness while supporting modern GPU/TPU. Additionally, by incorporating smooth structures into the multipartite graph, GTMP directly plans smooth splines without requiring gradient-based optimization. Experiments on lidar-scanned occupancy maps and the MotionBenchMarker dataset demonstrate GTMP's computation efficiency in batch planning compared to baselines, underscoring GTMP's potential as a robust, scalable planner for diverse applications and large-scale robot learning tasks.

* 8 pages, 4 figures

Machine Learning with Physics Knowledge for Prediction: A Survey

Aug 19, 2024

Joe Watson, Chen Song, Oliver Weeger, Theo Gruner, An T. Le, Kay Hansel, Ahmed Hendawy, Oleg Arenz, Will Trojak, Miles Cranmer(+5 more)

Abstract:This survey examines the broad suite of methods and models for combining machine learning with physics knowledge for prediction and forecast, with a focus on partial differential equations. These methods have attracted significant interest due to their potential impact on advancing scientific research and industrial practices by improving predictive models with small- or large-scale datasets and expressive predictive models with useful inductive biases. The survey has two parts. The first considers incorporating physics knowledge on an architectural level through objective functions, structured predictive models, and data augmentation. The second considers data as physics knowledge, which motivates looking at multi-task, meta, and contextual learning as an alternative approach to incorporating physics knowledge in a data-driven fashion. Finally, we also provide an industrial perspective on the application of these methods and a survey of the open-source ecosystem for physics-informed machine learning.

* 56 pages, 8 figures, 2 tables

Function-Space Regularization for Deep Bayesian Classification

Jul 12, 2023

Jihao Andreas Lin, Joe Watson, Pascal Klink, Jan Peters

Abstract:Bayesian deep learning approaches assume model parameters to be latent random variables and infer posterior distributions to quantify uncertainty, increase safety and trust, and prevent overconfident and unpredictable behavior. However, weight-space priors are model-specific, can be difficult to interpret and are hard to specify. Instead, we apply a Dirichlet prior in predictive space and perform approximate function-space variational inference. To this end, we interpret conventional categorical predictions from stochastic neural network classifiers as samples from an implicit Dirichlet distribution. By adapting the inference, the same function-space prior can be combined with different models without affecting model architecture or size. We illustrate the flexibility and efficacy of such a prior with toy experiments and demonstrate scalability, improved uncertainty quantification and adversarial robustness with large-scale image classification experiments.

* Advances in Approximate Bayesian Inference 2023
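
To make the idea of treating stochastic categorical predictions as samples from an implicit Dirichlet distribution more concrete, the sketch below fits Dirichlet concentration parameters to a set of sampled probability vectors by moment matching. This is only an illustration: moment matching is a simple stand-in chosen here, not necessarily the estimator used in the paper, and the function name `fit_dirichlet_moments` and the synthetic samples are hypothetical.

```python
import numpy as np


def fit_dirichlet_moments(probs):
    """Moment-matched Dirichlet concentrations for samples on the simplex.

    probs: array of shape (n_samples, n_classes), each row summing to one.
    Assumes every class has non-zero sample variance.
    """
    mean = probs.mean(axis=0)
    var = probs.var(axis=0)
    # For a Dirichlet, var_k = mean_k * (1 - mean_k) / (alpha_0 + 1),
    # so each class gives an estimate of the total concentration alpha_0.
    alpha_0 = np.mean(mean * (1.0 - mean) / var) - 1.0
    return mean * alpha_0


# Synthetic stand-in for the predictions of a stochastic classifier
# (assumption: in practice these would come from repeated forward passes).
rng = np.random.default_rng(0)
true_alpha = np.array([2.0, 5.0, 3.0])
probs = rng.dirichlet(true_alpha, size=20000)

alpha_hat = fit_dirichlet_moments(probs)
print(alpha_hat)
```

With enough samples, the recovered concentrations should be close to the ones used to generate the data, which gives a quick sanity check for the fit.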

Coherent Soft Imitation Learning

May 29, 2023

Joe Watson, Sandy H. Huang, Nicolas Heess

Abstract:Imitation learning methods seek to learn from an expert either through behavioral cloning (BC) of the policy or inverse reinforcement learning (IRL) of the reward. Such methods enable agents to learn complex tasks from humans that are difficult to capture with hand-designed reward functions. Choosing BC or IRL for imitation depends on the quality and state-action coverage of the demonstrations, as well as additional access to the Markov decision process. Hybrid strategies that combine BC and IRL are not common, as initial policy optimization against inaccurate rewards diminishes the benefit of pretraining the policy with BC. This work derives an imitation method that captures the strengths of both BC and IRL. In the entropy-regularized ('soft') reinforcement learning setting, we show that the behaviour-cloned policy can be used as both a shaped reward and a critic hypothesis space by inverting the regularized policy update. This coherency facilitates fine-tuning cloned policies using the reward estimate and additional interactions with the environment. This approach conveniently achieves imitation learning through initial behaviour cloning, followed by refinement via RL with online or offline data sources. The simplicity of the approach enables graceful scaling to high-dimensional and vision-based tasks, with stable learning and minimal hyperparameter tuning, in contrast to adversarial approaches.

* 51 pages, 47 figures. DeepMind internship report

Inferring Smooth Control: Monte Carlo Posterior Policy Iteration with Gaussian Processes

Oct 07, 2022

Joe Watson, Jan Peters

Abstract:Monte Carlo methods have become increasingly relevant for control of non-differentiable systems, approximate dynamics models and learning from data. These methods scale to high-dimensional spaces and are effective at the non-convex optimizations often seen in robot learning. We look at sample-based methods from the perspective of inference-based control, specifically posterior policy iteration. From this perspective, we highlight how Gaussian noise priors produce rough control actions that are unsuitable for physical robot deployment. Considering smoother Gaussian process priors, as used in episodic reinforcement learning and motion planning, we demonstrate how smoother model predictive control can be achieved using online sequential inference. This inference is realized through an efficient factorization of the action distribution and a novel means of optimizing the likelihood temperature to improve importance sampling accuracy. We evaluate this approach on several high-dimensional robot control tasks, matching the sample efficiency of prior heuristic methods while also ensuring smoothness. Simulation results can be seen at https://monte-carlo-ppi.github.io/.

* 43 pages, 37 figures. Conference on Robot Learning 2022
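
The claim that independent Gaussian noise yields rough action sequences while a Gaussian process prior yields smooth ones can be illustrated with a small sampling experiment. The sketch below is not the paper's algorithm: it only draws action sequences from the two priors and compares a simple roughness measure. The squared-exponential kernel, the lengthscale of 0.2, the horizon of 50 steps, and all function names are assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(times, lengthscale):
    """Squared-exponential covariance between time steps (unit variance)."""
    diff = times[:, None] - times[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)


def sample_action_sequences(cov, n_samples, rng):
    """Draw zero-mean action sequences with the given covariance over time."""
    jitter = 1e-6 * np.eye(cov.shape[0])  # keeps the Cholesky factor stable
    chol = np.linalg.cholesky(cov + jitter)
    return rng.standard_normal((n_samples, cov.shape[0])) @ chol.T


def roughness(actions):
    """Mean squared difference between consecutive actions."""
    return float(np.mean(np.diff(actions, axis=1) ** 2))


rng = np.random.default_rng(0)
horizon = 50
times = np.linspace(0.0, 1.0, horizon)

# White-noise prior: every time step is sampled independently.
white = sample_action_sequences(np.eye(horizon), 256, rng)
# Gaussian process prior: nearby time steps are strongly correlated.
smooth = sample_action_sequences(rbf_kernel(times, 0.2), 256, rng)

print("white-noise roughness:", roughness(white))
print("GP-prior roughness:", roughness(smooth))
```

Both priors have the same marginal variance per time step, so the difference in roughness comes only from the temporal correlation introduced by the kernel.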

A Differentiable Newton-Euler Algorithm for Real-World Robotics

Oct 24, 2021

Michael Lutter, Johannes Silberbauer, Joe Watson, Jan Peters

Abstract:Obtaining dynamics models is essential for robotics to achieve accurate model-based controllers and simulators for planning. The dynamics models are typically obtained using the manufacturer's model specification or simple numerical methods such as linear regression. However, this approach does not guarantee physically plausible parameters and can only be applied to kinematic chains consisting of rigid bodies. In this article, we describe a differentiable simulator that can be used to identify the system parameters of real-world mechanical systems with complex friction models and holonomic as well as non-holonomic constraints. To guarantee physically consistent parameters, we utilize virtual parameters and gradient-based optimization. The described Differentiable Newton-Euler Algorithm (DiffNEA) can be applied to a class of dynamical systems and guarantees physically plausible predictions. The extensive experimental evaluation shows that the proposed model learning approach learns accurate dynamics models of systems with complex friction and non-holonomic constraints. The identified DiffNEA models excel especially in the offline reinforcement learning experiments. For the challenging ball-in-a-cup task, these models solve the task using model-based offline reinforcement learning on the physical system. The black-box baselines fail on this task in simulation and on the physical system despite using more data for learning the model.

* arXiv admin note: text overlap with arXiv:2011.01734

A Robot Cluster for Reproducible Research in Dexterous Manipulation

Sep 22, 2021

Stefan Bauer, Felix Widmaier, Manuel Wüthrich, Niklas Funk, Julen Urain De Jesus, Jan Peters, Joe Watson, Claire Chen, Krishnan Srinivasan, Junwu Zhang(+19 more)

Abstract:Dexterous manipulation remains an open problem in robotics. To coordinate efforts of the research community towards tackling this problem, we propose a shared benchmark. We designed and built robotic platforms that are hosted at the MPI-IS and can be accessed remotely. Each platform consists of three robotic fingers that are capable of dexterous object manipulation. Users are able to control the platforms remotely by submitting code that is executed automatically, akin to a computational cluster. Using this setup, i) we host robotics competitions, where teams from anywhere in the world access our platforms to tackle challenging tasks, ii) we publish the datasets collected during these competitions (consisting of hundreds of robot hours), and iii) we give researchers access to these platforms for their own projects.

Stochastic Control through Approximate Bayesian Input Inference

May 17, 2021

Joe Watson, Hany Abdulsamad, Rolf Findeisen, Jan Peters

Abstract:Optimal control under uncertainty is a prevailing challenge in control, due to the difficulty in producing tractable solutions for the stochastic optimization problem. By framing the control problem as one of input estimation, advanced approximate inference techniques can be used to handle the statistical approximations in a principled and practical manner. Analyzing the Gaussian setting, we present a solver that is capable of several stochastic control methods and was found to be superior to popular baselines on nonlinear simulated tasks. We draw connections that relate this inference formulation to previous approaches for stochastic optimal control, and outline several advantages that this inference view brings due to its statistical nature.

* Submitted to Transactions on Automatic Control Special Issue: Learning and Control. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
