Abstract:In this work we introduce the problem of task assistance planning, in which we are given two robots, Rtask and Rassist. The first robot, Rtask, is in charge of performing a given task by executing a precomputed path. The second robot, Rassist, is in charge of assisting the task performed by Rtask using on-board sensors. The ability of Rassist to provide assistance to Rtask depends on the locations of both robots. Since Rtask is moving along its path, Rassist may also need to move to provide as much assistance as possible. The problem we study is how to compute a path for Rassist so as to maximize the portion of Rtask's path for which assistance is provided. We limit the problem to the setting where Rassist moves on a roadmap, which is a graph embedded in its configuration space, and show that this problem is NP-hard. Fortunately, we show that when Rassist moves on a given path, and all we have to do is compute the times at which Rassist should move from one configuration to the next, the problem can be solved optimally in polynomial time. Together with carefully crafted upper bounds, this polynomial-time algorithm is integrated into a branch-and-bound algorithm that computes optimal solutions to the problem, outperforming baselines by several orders of magnitude. We demonstrate our work empirically in simulated scenarios containing both planar manipulators and UR robots, as well as in the lab on real robots.
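The timing subproblem described above can be illustrated with a small dynamic program. This is only a sketch under simplifying assumptions that are not stated in the abstract: Rtask's path is discretized into time steps, Rassist starts at the first configuration of its given path, and at each step Rassist either stays or advances to the next configuration. The function name `max_assisted_steps` and the boolean table `assist` are hypothetical; the paper's actual algorithm may differ.

```python
from typing import List


def max_assisted_steps(assist: List[List[bool]]) -> int:
    """Return the maximum number of time steps at which assistance is provided.

    assist[k][t] is True if Rassist, placed at configuration k of its given
    path, assists Rtask at time step t (hypothetical, precomputed input).
    """
    if not assist or not assist[0]:
        return 0
    num_configs, num_steps = len(assist), len(assist[0])
    neg_inf = float("-inf")

    # best[k]: most assisted steps so far with Rassist currently at configuration k.
    best = [neg_inf] * num_configs
    best[0] = int(assist[0][0])  # assumption: Rassist starts at configuration 0

    for t in range(1, num_steps):
        new_best = [neg_inf] * num_configs
        for k in range(num_configs):
            # Either Rassist stayed at k, or it just advanced from k - 1.
            prev = best[k]
            if k > 0:
                prev = max(prev, best[k - 1])
            if prev != neg_inf:
                new_best[k] = prev + int(assist[k][t])
        best = new_best

    return int(max(best))
```

The table has one entry per (configuration, time step) pair and each entry is computed in constant time, so the running time of this discretized version is proportional to the product of the two path lengths.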
Abstract:Tendon-driven continuum robots (TDCRs), with their flexible backbones, are well suited to navigating complex, cluttered environments. However, to do so, they typically require multiple segments, often leading to complex actuation and control challenges. To this end, we propose a novel approach for effectively navigating cluttered spaces with a long single-segment TDCR, which is the simplest topology from a mechanical point of view. Our key insight is that by leveraging contact with the environment we can achieve multiple curvatures without mechanical alterations to the robot. Specifically, we propose a search-based motion planner for a single-segment TDCR. This planner discretizes the configuration space and employs a best-first search guided by a specially designed heuristic. The heuristic, crucial for efficient navigation, provides an effective cost-to-go estimate while respecting the kinematic constraints of the TDCR and environmental interactions. We empirically demonstrate the efficiency of our planner: tested on over 525 queries in environments with both convex and non-convex obstacles, it achieves a success rate of about 80%, while baselines were unable to obtain a success rate higher than 30%. The difference is attributed to our novel heuristic, which is shown to significantly reduce the required search space.
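As a generic illustration of heuristic-guided best-first search over a discretized space, the sketch below runs greedy best-first search on a 4-connected grid. The grid, the Manhattan-distance heuristic, and the name `best_first_search` are stand-ins chosen for illustration; they are not the TDCR configuration space or the specialized heuristic described in the abstract.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]


def best_first_search(grid: List[str], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """Greedy best-first search on a grid; '#' marks an obstacle cell.

    Cells are expanded in order of their heuristic value only, so the
    returned path is not guaranteed to be the shortest one.
    """
    def heuristic(cell: Cell) -> int:
        # Stand-in cost-to-go estimate: Manhattan distance to the goal.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(heuristic(start), start)]
    parent: Dict[Cell, Optional[Cell]] = {start: None}

    while frontier:
        _, cell = heapq.heappop(frontier)
        if cell == goal:
            # Walk parent pointers back to the start to recover the path.
            path = []
            current: Optional[Cell] = cell
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            in_bounds = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if in_bounds and grid[nr][nc] != "#" and nxt not in parent:
                parent[nxt] = cell
                heapq.heappush(frontier, (heuristic(nxt), nxt))
    return None  # the goal is unreachable from the start
```

The quality of the heuristic determines how many cells are expanded before the goal is reached, which is the effect the abstract attributes to its own heuristic.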
Abstract:Autonomous inspection tasks necessitate effective path-planning mechanisms to efficiently gather observations from points of interest (POI). However, localization errors commonly encountered in urban environments can introduce execution uncertainty, posing challenges to the successful completion of such tasks. To tackle these challenges, we present IRIS under uncertainty (IRIS-U^2), an extension of the incremental random inspection-roadmap search (IRIS) algorithm that addresses the offline planning problem via an A*-based approach, where the planning process occurs prior to the online execution. The key insight behind IRIS-U^2 is transforming the computed localization uncertainty, obtained through Monte Carlo (MC) sampling, into a POI probability. IRIS-U^2 offers insights into the expected performance of the execution task by providing confidence intervals (CIs) for the expected coverage, expected path length, and collision probability, which become progressively tighter as the number of MC samples increases. The efficacy of IRIS-U^2 is demonstrated through a case study focusing on structural inspections of bridges. Our approach exhibits improved expected coverage and reduced collision probability, and yields increasingly precise CIs as the number of MC samples grows. Furthermore, we emphasize the potential advantages of computing bounded sub-optimal solutions to reduce computation time while still maintaining the same CI boundaries.
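The tightening of confidence intervals with more MC samples can be illustrated with a short sketch. It assumes each sample is a 0/1 outcome (for example, whether a POI was observed in one simulated execution) and uses the normal-approximation interval for a proportion; the abstract does not state which CI construction IRIS-U^2 uses, and `coverage_ci` is a hypothetical name.

```python
import math
from typing import Sequence, Tuple


def coverage_ci(samples: Sequence[int], z: float = 1.96) -> Tuple[float, float, float]:
    """Return (estimate, lower, upper) for the probability of a 0/1 outcome.

    Uses the normal approximation; z = 1.96 corresponds to roughly 95% confidence.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    p_hat = sum(samples) / n
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    # Clamp to [0, 1] because the quantity being estimated is a probability.
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)
```

The half-width scales with one over the square root of the number of samples, so a tenfold increase in samples narrows the interval by a factor of about 3.16.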
Abstract:We consider a Multi-Agent Path Finding (MAPF) setting where agents have been assigned a plan, but during its execution some agents are delayed. Instead of replanning from scratch when such a delay occurs, we propose delay introduction, whereby we delay some additional agents so that the remainder of the plan can be executed safely. We show that the corresponding decision problem is NP-Complete in general. However, in practice we can find optimal delay introductions using CBS for very large numbers of agents, and both the planning time and the resulting plan length are comparable to, and sometimes better than, those of state-of-the-art replanning heuristics. We also examine the benefits of our method from an explainability point of view.
Abstract:In automated warehouses, teams of mobile robots fulfill the packaging process by transferring inventory pods to designated workstations while navigating narrow aisles formed by tightly packed pods. This problem is typically modeled as a Multi-Agent Pickup and Delivery (MAPD) problem, which is then solved by repeatedly planning collision-free paths for agents on a fixed graph, as in the Rolling-Horizon Collision Resolution (RHCR) algorithm. However, existing approaches make the limiting assumption that agents are only allowed to move pods that correspond to their current task, while treating all other pods as stationary obstacles (even though all pods are movable). This behavior can result in unnecessarily long paths, which could be avoided by opening additional corridors via pod manipulation. To this end, we explore the implications of allowing agents the flexibility of dynamically relocating pods. We call this new problem Terraforming MAPD (tMAPD) and develop an RHCR-based approach to tackle it. As the extra flexibility of terraforming comes at a significant computational cost, we utilize this capability judiciously by identifying situations where it could make a significant impact on the solution quality. In particular, we invoke terraforming in response to disruptions that often occur in automated warehouses, e.g., when an item is dropped from a pod or when agents malfunction. Empirically, using our approach for tMAPD, where disruptions are modeled via a stochastic process, we improve throughput by over 10% and reduce the maximum service time (the difference between the drop-off time and the pickup time of a pod) by more than 50% compared to the MAPD setting, without drastically increasing the runtime.
Abstract:Offline reinforcement-learning (RL) algorithms learn to make decisions using a given, fixed training dataset without the possibility of additional online data collection. This problem setting is captivating because it holds the promise of utilizing previously collected datasets without any costly or risky interaction with the environment. However, this promise comes with a drawback. The restricted dataset induces subjective uncertainty because the agent can encounter unfamiliar sequences of states and actions that the training data did not cover. Moreover, inherent system stochasticity further increases uncertainty and aggravates the offline RL problem, preventing the agent from learning an optimal policy. To mitigate the destructive effects of uncertainty, we need to balance the aspiration to take reward-maximizing actions against the risk incurred by incorrect ones. In financial economics, modern portfolio theory (MPT) is a method that risk-averse investors can use to construct diversified portfolios that maximize their returns without unacceptable levels of risk. We integrate MPT into the agent's decision-making process to present a simple yet highly effective risk-aware planning algorithm for offline RL. Our algorithm allows us to systematically account for the estimated quality of specific actions and their estimated risk due to the uncertainty. We show that our approach can be coupled with the Transformer architecture to yield a state-of-the-art planner for offline RL tasks, maximizing the return while significantly reducing the variance.
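The trade-off between estimated quality and estimated risk can be illustrated with a mean-variance score in the spirit of MPT. This is only a sketch: the scoring rule (mean minus a risk-aversion weight times variance), the name `pick_action`, and the assumption that each candidate action comes with several predicted returns are illustrative choices, not the algorithm described in the abstract.

```python
from statistics import mean, pvariance
from typing import Dict, Sequence


def pick_action(predicted_returns: Dict[str, Sequence[float]], risk_aversion: float) -> str:
    """Pick the action with the best mean-variance score.

    predicted_returns maps each candidate action to several predicted returns
    (hypothetical input, e.g. from multiple sampled rollouts).
    """
    def score(action: str) -> float:
        returns = predicted_returns[action]
        # Estimated quality (mean) penalized by estimated risk (variance).
        return mean(returns) - risk_aversion * pvariance(returns)

    return max(predicted_returns, key=score)


candidates = {"safe": [1.0, 1.0, 1.0], "risky": [0.0, 3.0, 0.6]}
risk_neutral_choice = pick_action(candidates, risk_aversion=0.0)
risk_averse_choice = pick_action(candidates, risk_aversion=1.0)
```

With a risk-aversion weight of zero the rule reduces to picking the highest mean return; increasing the weight shifts the choice toward actions whose predicted returns vary less.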
Abstract:We consider the problem of finding collision-free paths for curvature-constrained systems in the presence of obstacles while minimizing execution time. Specifically, we focus on the setting where a planar system can travel at some range of speeds with unbounded acceleration. This setting can model many systems, such as fixed-wing drones. Unfortunately, planning for such systems might require evaluating many (local) time-optimal transitions connecting two close-by configurations, which is computationally expensive. Existing methods either pre-compute all such transitions in a preprocessing stage or use heuristics to speed up the search, thus forgoing any guarantees on solution quality. Our key insight is that computing all the time-optimal transitions is both (i) computationally expensive and (ii) unnecessary for many problem instances. We show that by finding bounded-suboptimal solutions (solutions whose cost is bounded by $1+\varepsilon$ times the cost of the optimal solution for any user-provided $\varepsilon$) and not time-optimal solutions, one can dramatically reduce the number of time-optimal transitions used. We demonstrate empirically that our planning framework can reduce the runtime by several orders of magnitude compared to the state-of-the-art while still providing guarantees on the quality of the solution.
Abstract:Multi-agent pathfinding (MAPF) is concerned with planning collision-free paths for a team of agents from their start to goal locations in an environment cluttered with obstacles. Typical approaches for MAPF consider the locations of obstacles as being fixed, which limits their effectiveness in automated warehouses, where obstacles (representing pods or shelves) can be moved out of the way by agents (representing robots) to relieve bottlenecks and introduce shorter routes. In this work we initiate the study of MAPF with movable obstacles. In particular, we introduce a new extension of MAPF, which we call Terraforming MAPF (tMAPF), where some agents are responsible for moving obstacles to clear the way for other agents. Solving tMAPF is extremely challenging as it requires reasoning not only about collisions between agents, but also where and when obstacles should be moved. We present extensions of two state-of-the-art algorithms, CBS and PBS, in order to tackle tMAPF, and demonstrate that they can consistently outperform the best solution possible under a static-obstacle setting.
Abstract:A central challenge in building robotic prostheses is the creation of a sensor-based system able to read physiological signals from the lower limb and instruct a robotic hand to perform various tasks. Existing systems typically perform discrete gestures, such as pointing or grasping, by employing electromyography (EMG) or ultrasound (US) technologies to analyze the state of the muscles. In this work, we study the inference problem of identifying the activation of specific fingers from a sequence of US images when performing dexterous tasks such as keyboard typing or playing the piano. While estimating finger gestures has been done in the past by detecting prominent gestures, we are interested in classification done in the context of fine motions that evolve over time. We consider this task an important step towards higher adoption rates of robotic prostheses among arm amputees, as it has the potential to dramatically increase functionality in performing daily tasks. Our key observation, motivating this work, is that modeling the hand as a robotic manipulator allows us to encode an intermediate representation wherein US images are mapped to configurations of this manipulator. Given a sequence of such learned configurations, coupled with a neural-network architecture that exploits temporal coherence, we are able to infer fine finger motions. We evaluated our method by collecting data from a group of subjects and demonstrating how our framework can be used to replay music played or text typed. To the best of our knowledge, this is the first study demonstrating these downstream tasks within an end-to-end system.
Abstract:In Lifelong Multi-Agent Path Finding (L-MAPF) a team of agents performs a stream of tasks consisting of multiple locations to be visited by the agents on a shared graph while avoiding collisions with one another. L-MAPF is typically tackled by partitioning it into multiple consecutive, and hence similar, "one-shot" MAPF queries with a single task assigned to each agent, as in the Rolling-Horizon Collision Resolution (RHCR) algorithm. Consequently, a solution to one query informs the next query, which leads to similarity with respect to the agents' start and goal positions and to how collisions need to be resolved from one query to the next. Thus, experience from solving one MAPF query can potentially be used to speed up solving the next one. Despite this intuition, current L-MAPF planners solve consecutive MAPF queries from scratch. In this paper, we introduce a new RHCR-inspired approach called exRHCR, which exploits experience in its constituent MAPF queries. In particular, exRHCR employs a new extension of Priority-Based Search (PBS), a state-of-the-art MAPF solver. Our extension, called exPBS, warm-starts the search with the priorities between agents used by PBS in the previous MAPF instances. We demonstrate empirically that exRHCR solves L-MAPF up to 25% faster than RHCR and increases throughput for given task streams by 3%-16% by increasing the number of agents we can cope with for a given time budget.