KTH Royal Institute of Technology, Stockholm, Sweden
Abstract: Informative path planning (IPP) applied to bathymetric mapping allows AUVs to focus on feature-rich areas to quickly reduce uncertainty and increase mapping efficiency. Existing methods based on Bayesian optimization (BO) over Gaussian Process (GP) maps work well in small scenarios, but they are short-sighted and computationally heavy when mapping larger areas, hindering deployment in real applications. To overcome this, we present a 2-layered BO IPP method that performs non-myopic, real-time planning in a tree-search fashion over large Stochastic Variational GP maps, while respecting the AUV motion constraints and accounting for localization uncertainty. Our framework outperforms the standard industrial lawn-mowing pattern and a myopic baseline in a set of hardware-in-the-loop (HIL) experiments on an embedded platform over real bathymetry.
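As an illustration of the myopic building block such planners share, the sketch below fits a standard scikit-learn GP to depth samples and scores candidate waypoints with an uncertainty-driven acquisition; all data and names are hypothetical, and the paper's method replaces this with a non-myopic tree search over Stochastic Variational GP maps.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Hypothetical data: (x, y) positions surveyed so far and measured depths.
X_surveyed = np.random.rand(50, 2) * 100.0
depths = np.sin(X_surveyed[:, 0] / 20.0) + 0.05 * np.random.randn(50)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=10.0), alpha=1e-3)
gp.fit(X_surveyed, depths)

# Score reachable candidate waypoints with an uncertainty-driven acquisition:
# prefer high posterior standard deviation (a proxy for information gain).
candidates = np.random.rand(200, 2) * 100.0
mean, std = gp.predict(candidates, return_std=True)
best_waypoint = candidates[np.argmax(std)]
print("next waypoint:", best_waypoint)
```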
Abstract: Inherent robustness in manipulation is prevalent in biological systems and critical for robotic manipulation systems due to real-world uncertainties and disturbances. This robustness relies not only on robust control policies but also on the design characteristics of the end-effectors. This paper introduces a bi-level optimization approach to co-designing tools and control policies to achieve robust manipulation. The approach employs reinforcement learning for lower-level control policy learning and multi-task Bayesian optimization for upper-level design optimization. Diverging from prior approaches, we incorporate caging-based robustness metrics into both levels, ensuring manipulation robustness against disturbances and environmental variations. Our method is evaluated in four non-prehensile manipulation environments, demonstrating improvements in task success rate under disturbances and environment changes. A real-world experiment is also conducted to validate the framework's practical effectiveness.
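The bi-level structure can be summarized in a few lines. The sketch below uses a placeholder lower level (a synthetic function standing in for RL policy training and caging-based robustness evaluation) and plain random search in place of the paper's multi-task Bayesian optimization; every name and number is illustrative.

```python
import numpy as np

def train_policy_and_evaluate(design_params, seed=0):
    """Placeholder lower level: train an RL policy for a tool with the given
    design parameters and return (task_return, caging_robustness).
    Purely synthetic values for illustration only."""
    rng = np.random.default_rng(seed)
    width, curvature = design_params
    task_return = -(width - 0.6) ** 2 - (curvature - 0.3) ** 2 \
                  + 0.01 * rng.standard_normal()
    robustness = 1.0 - abs(width - 0.5)   # stand-in for a caging-based metric
    return task_return, robustness

# Upper level: search over tool designs, trading off return and robustness.
# (The paper uses multi-task Bayesian optimization; random search shown here.)
rng = np.random.default_rng(1)
best_score, best_design = -np.inf, None
for _ in range(100):
    design = rng.uniform([0.2, 0.0], [1.0, 1.0])
    ret, rob = train_policy_and_evaluate(design)
    score = ret + 0.5 * rob
    if score > best_score:
        best_score, best_design = score, design
print("best design:", best_design, "score:", best_score)
```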
Abstract: Reachable set computation is an important tool for analyzing control systems. Simulating a control system can show that the system is generally functioning as desired, but a formal tool like reachability analysis can provide a guarantee of correctness. For linear systems, reachability analysis is straightforward and fast, but as more complex components are added to the control system, such as nonlinear dynamics or a neural network controller, reachability analysis may slow down or become overly conservative. To address these challenges, much of the literature has focused on spatial refinement, e.g., tuning the discretization of the input sets and intermediate reachable sets. However, this paper addresses a different dimension: temporal refinement. The basic idea of temporal refinement is to automatically choose when along the horizon of the reachability problem to execute slow symbolic queries, which incur less approximation error, versus fast concrete queries, which incur more approximation error. Temporal refinement can be combined with other refinement approaches and offers an additional "tuning knob" with which to trade off tractability and tightness in approximate reachable set computation. Here, we introduce an automatic framework for performing temporal refinement and demonstrate the effectiveness of this technique on computing approximate reachable sets for nonlinear systems with neural network control policies. We demonstrate the computation of reachable sets of varying approximation error under varying computational budgets and show that our algorithm is able to generate approximate reachable sets with a similar amount of error to the baseline approach in 20-70% less time.
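A minimal sketch of the scheduling idea behind temporal refinement, assuming per-step estimates of how much error a symbolic query would remove and fixed query costs (all values and names are hypothetical, not the paper's algorithm):

```python
# Toy temporal-refinement schedule: given per-step estimates of how much
# approximation error a slow "symbolic" query would remove compared to a
# fast "concrete" query, greedily spend a compute budget where it helps most.

def plan_refinement(error_reduction, symbolic_cost, concrete_cost, budget):
    n = len(error_reduction)
    schedule = ["concrete"] * n
    remaining = budget - n * concrete_cost   # every step needs at least a concrete query
    extra = symbolic_cost - concrete_cost    # extra cost of upgrading a step
    # Rank steps by error removed per extra unit of compute.
    ranked = sorted(range(n), key=lambda k: error_reduction[k] / extra, reverse=True)
    for k in ranked:
        if remaining >= extra:
            schedule[k] = "symbolic"
            remaining -= extra
    return schedule

print(plan_refinement(error_reduction=[0.9, 0.1, 0.5, 0.7],
                      symbolic_cost=5.0, concrete_cost=1.0, budget=14.0))
# -> ['symbolic', 'concrete', 'concrete', 'symbolic']
```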
Abstract: Useful robot control algorithms should not only achieve performance objectives but also adhere to hard safety constraints. Control Barrier Functions (CBFs) have been developed to provably ensure system safety through forward invariance. However, they often unnecessarily sacrifice performance for safety since they are purely reactive. Receding horizon control (RHC), on the other hand, considers planned trajectories to account for the future evolution of a system. This work provides a new perspective on safety-critical control by introducing Forward Invariance in Trajectory Spaces (FITS). We lift the problem of safe RHC into the trajectory space and describe the evolution of planned trajectories as a controlled dynamical system. Safety constraints defined over states can be converted into sets in the trajectory space, which we render forward invariant via a CBF framework. We derive an efficient quadratic program (QP) to synthesize trajectories that provably satisfy safety constraints. Our experiments show that FITS improves adherence to safety specifications without sacrificing performance, compared to alternative CBF and NMPC methods.
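For reference, the standard state-space CBF-QP building block that FITS lifts into the trajectory space can be written as a small convex program. The single-integrator dynamics and numbers below are hypothetical, and this is not the trajectory-space formulation itself.

```python
import cvxpy as cp
import numpy as np

# State-space CBF-QP: filter a desired input so that the safe set h(x) >= 0
# remains forward invariant. Safety set here: stay outside a unit disk at the
# origin, h(x) = ||x||^2 - 1, with single-integrator dynamics x_dot = u.
x = np.array([1.5, 0.8])          # current state
u_des = np.array([-1.0, -1.0])    # performance (nominal) input
alpha = 2.0                       # class-K gain

h = x @ x - 1.0
grad_h = 2.0 * x                  # dh/dx

u = cp.Variable(2)
objective = cp.Minimize(cp.sum_squares(u - u_des))
constraints = [grad_h @ u + alpha * h >= 0]   # h_dot + alpha*h >= 0
cp.Problem(objective, constraints).solve()
print("safe input:", u.value)
```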
Abstract: This work addresses maximally robust control synthesis under unknown disturbances. We consider a general nonlinear system subject to a Signal Temporal Logic (STL) specification and wish to jointly synthesize the maximal possible disturbance bounds and the corresponding controllers that ensure the STL specification is satisfied under these bounds. Many works have considered STL satisfaction under given bounded disturbances. Yet, to the authors' best knowledge, this is the first work that aims to maximize the permissible disturbance set and find the corresponding controllers that ensure the STL specification is satisfied with maximum disturbance robustness. We extend the notion of disturbance-robust semantics for STL, which is a property of a specification, dynamical system, and controller, and provide an algorithm, based on Hamilton-Jacobi reachability, to obtain the maximally disturbance-robust controllers satisfying an STL specification. We show its soundness and provide a simulation example with an Autonomous Underwater Vehicle (AUV).
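The outer search can be pictured as a bisection on the disturbance bound, assuming satisfiability is monotone in the bound; in the sketch below the oracle is a hypothetical placeholder for the Hamilton-Jacobi reachability computation used in the paper.

```python
# Illustrative outer loop only. `stl_satisfiable_under_bound` is a hypothetical
# oracle reporting whether some controller meets the STL specification for all
# disturbances with magnitude <= d (in the paper, answered via HJ reachability).

def max_disturbance_bound(stl_satisfiable_under_bound, d_hi=10.0, tol=1e-3):
    lo, hi = 0.0, d_hi
    if not stl_satisfiable_under_bound(lo):
        return None                      # spec unmet even without disturbances
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if stl_satisfiable_under_bound(mid):
            lo = mid                     # still satisfiable: try a larger bound
        else:
            hi = mid                     # too large: shrink
    return lo

# Toy oracle: satisfiable up to a "true" bound of 2.3.
print(max_disturbance_bound(lambda d: d <= 2.3))
```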
Abstract: One of the main challenges in multi-agent reinforcement learning is scalability as the number of agents increases. This issue is further exacerbated if the problem considered is temporally dependent. State-of-the-art solutions today mainly follow the centralized-training-with-decentralized-execution paradigm in order to handle the scalability concerns. In this paper, we propose time-dependent multi-agent transformers, which solve the temporally dependent multi-agent problem efficiently with a centralized approach by using transformers to handle the large joint input. We highlight the efficacy of this method on two problems and use tools from statistics to verify the probability that the trajectories generated under the policy satisfy the task. The experiments show that our approach outperforms baseline algorithms from the literature in both cases.
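A rough sketch of the centralized idea, assuming a PyTorch implementation: each agent's observation, augmented with a time feature, becomes one token, and a transformer encoder attends across all agents before per-agent action heads. Dimensions and names are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

n_agents, obs_dim, d_model = 8, 16, 64

embed = nn.Linear(obs_dim + 1, d_model)            # +1 for a normalized time feature
encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
action_head = nn.Linear(d_model, 4)                # e.g. 4 discrete actions per agent

obs = torch.randn(1, n_agents, obs_dim)            # (batch, agents, obs)
t = torch.full((1, n_agents, 1), 0.25)             # current time, normalized
tokens = embed(torch.cat([obs, t], dim=-1))        # one token per agent
logits = action_head(encoder(tokens))              # attention across all agents
print(logits.shape)                                # torch.Size([1, 8, 4])
```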
Abstract: In environments like offices, the duration of a robot's navigation between two locations may vary over time. For instance, reaching the kitchen may take more time during lunchtime, since the corridors are crowded with people heading the same way. In this work, we address the problem of routing in such environments with tasks expressed in Metric Interval Temporal Logic (MITL), a rich robot task specification language that allows us to capture explicit time requirements. Our objective is to find a strategy that maximizes the temporal robustness of the robot's MITL task. As a first step towards a solution, we define a Mixed-Integer Linear Programming (MILP) approach to solving the task planning problem over a Varying Weighted Transition System, where navigation durations are deterministic but vary depending on the time of day. We then apply this planner to optimize for MITL temporal robustness in Markov Decision Processes, where the navigation durations between physical locations are uncertain, but the time-dependent distribution over possible delays is known. Finally, we develop a receding horizon planner for Markov Decision Processes that preserves guarantees over MITL temporal robustness. We show the scalability of our planning algorithms in simulations of robotic tasks.
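To illustrate only the time-varying duration aspect (not the MILP encoding of MITL robustness), the toy example below simulates arrival times along a fixed route when edge durations depend on the departure time; the locations, durations, and lunch window are made up.

```python
def duration(u, v, t):
    """Travel time (minutes) from u to v when departing at time t
    (minutes since midnight); longer during the lunch rush."""
    base = {("lab", "corridor"): 2, ("corridor", "kitchen"): 3}[(u, v)]
    lunch_penalty = 4 if 690 <= t <= 780 else 0   # 11:30-13:00
    return base + lunch_penalty

def earliest_arrival(path, t_start):
    t = t_start
    for u, v in zip(path, path[1:]):
        t += duration(u, v, t)        # departure time of each edge matters
    return t

route = ["lab", "corridor", "kitchen"]
print(earliest_arrival(route, t_start=600))   # 10:00, off-peak
print(earliest_arrival(route, t_start=720))   # 12:00, during lunch
```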
Abstract: Autonomous agents rely on sensor data to construct representations of their environment, which are essential for predicting future events and planning their own actions. However, sensor measurements suffer from limited range, occlusions, and sensor noise. These challenges become more evident in dynamic environments, where efficiently inferring the state of the environment from sensor readings taken at different times is still an open problem. This work focuses on inferring the state of the dynamic part of the environment, i.e., where dynamic objects might be, based on previous observations and constraints on their dynamics. We formalize the problem and introduce Transitional Grid Maps (TGMs), an efficient analytical solution. TGMs are based on a set of novel assumptions that hold in many practical scenarios. These assumptions significantly reduce the complexity of the problem, enabling continuous prediction and updating of the entire dynamic map based on the known static map (see Fig. 1), differentiating TGMs from other alternatives. We compare our approach with a state-of-the-art particle filter, obtaining more prudent predictions in occluded scenarios and on-par results in unoccluded tracking.
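A toy illustration of the general idea of predicting the dynamic layer, assuming a simple convolution-based motion model and a known static map; this is a simplification for intuition, not the TGM update equations.

```python
import numpy as np
from scipy.signal import convolve2d

# Spread current dynamic-object probability mass according to a motion model
# and suppress cells that the known static map marks as occupied.
dynamic = np.zeros((20, 20))
dynamic[10, 5] = 1.0                     # a dynamic object, e.g. a person

static_free = np.ones((20, 20))
static_free[:, 12] = 0.0                 # a wall in the static map

motion_kernel = np.array([[0.0, 0.1, 0.0],
                          [0.1, 0.6, 0.1],
                          [0.0, 0.1, 0.0]])   # mostly stays, may step N/S/E/W

for _ in range(5):                       # predict 5 steps ahead
    dynamic = convolve2d(dynamic, motion_kernel, mode="same")
    dynamic *= static_free               # dynamic objects cannot enter walls

print(dynamic.sum())                     # mass shrinks where paths are blocked
```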
Abstract: Partial observability and uncertainty are common problems in sequential decision-making that particularly impede the use of formal models such as Markov decision processes (MDPs). However, in practice, agents may be able to employ costly sensors to measure their environment and resolve partial observability by gathering information. Moreover, imprecise transition functions can capture model uncertainty. We combine these concepts and extend MDPs to robust active-measuring MDPs (RAM-MDPs). We present an active-measure heuristic to solve RAM-MDPs efficiently and show that model uncertainty can, counterintuitively, let agents take fewer measurements. We propose a method to counteract this behavior while only incurring a bounded additional cost. We empirically compare our methods to several baselines and show their superior scalability and performance.
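The model-uncertainty ingredient alone can be illustrated by the inner step of a robust Bellman backup with interval transition probabilities, where nature pushes probability mass toward low-value successors; the active-measuring mechanism is omitted and all numbers are illustrative.

```python
import numpy as np

def worst_case_expectation(p_lo, p_hi, values):
    """Worst-case expected successor value when the transition distribution is
    only known to lie within [p_lo, p_hi] (componentwise) and sum to 1.
    Solved greedily: allocate remaining mass to low-value states first."""
    p = p_lo.copy()
    slack = 1.0 - p.sum()                      # mass still to distribute
    for s in np.argsort(values):               # cheapest (worst) states first
        add = min(p_hi[s] - p_lo[s], slack)
        p[s] += add
        slack -= add
    return float(p @ values)

p_lo = np.array([0.1, 0.2, 0.1])
p_hi = np.array([0.6, 0.5, 0.6])
values = np.array([0.0, 5.0, 10.0])
print(worst_case_expectation(p_lo, p_hi, values))   # 2.5
```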
Abstract: Signal Temporal Logic (STL) is a formal language over continuous-time signals (such as trajectories of a multi-agent system) that allows for the specification of complex spatial and temporal system requirements (such as staying sufficiently close to each other within certain time intervals). To promote robustness in multi-agent motion planning with such complex requirements, we consider motion planning with the goal of maximizing the temporal robustness of the agents' joint STL specification, i.e., maximizing the permissible time shifts of each agent's trajectory while still satisfying the STL specification. Previous methods addressed temporally robust motion planning and control in a discrete-time Mixed-Integer Linear Programming (MILP) optimization scheme. In contrast, we parameterize the trajectory by continuous Bézier curves, where the curvature and the time-traversal of the trajectory are parameterized individually. We present an algorithm that generates continuous-time temporally robust trajectories and prove the soundness of our approach. Moreover, we empirically show that our parameterization achieves this with a considerable speed-up compared to state-of-the-art methods based on constant-interval time discretization.
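A small sketch of the separate space/time parameterization, assuming made-up control points: one Bézier curve gives the path geometry and a second, monotone Bézier maps the curve parameter to traversal time, so retiming (e.g., to absorb time shifts) only moves the time control points while the geometry is untouched.

```python
import numpy as np
from math import comb

def bezier(ctrl, s):
    """Evaluate a Bezier curve with control points ctrl at parameters s in [0, 1]."""
    ctrl = np.asarray(ctrl)
    n = len(ctrl) - 1
    basis = np.stack([comb(n, i) * s**i * (1 - s)**(n - i) for i in range(n + 1)],
                     axis=-1)
    return basis @ ctrl

spatial_ctrl = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 2.0], [4.0, 0.0]])  # path shape
time_ctrl = np.array([[0.0], [1.0], [2.0], [5.0]])   # nondecreasing -> monotone t(s)

s = np.linspace(0.0, 1.0, 6)
positions = bezier(spatial_ctrl, s)
times = bezier(time_ctrl, s)
for t, p in zip(times.ravel(), positions):
    print(f"t={t:4.2f}  p=({p[0]:.2f}, {p[1]:.2f})")
```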