Abstract:We focus on the problem of rearranging a set of objects within a confined space with a nonholonomically constrained mobile robot pusher. This problem is relevant to many real-world domains, including warehouse automation and construction. These domains give rise to instances involving a combination of geometric, kinematic, and physics constraints, which make planning particularly challenging. Prior work often makes simplifying assumptions like the use of holonomic mobile robots or dexterous manipulators capable of unconstrained overhand reaching. Our key insight is that we can empower even a constrained mobile pusher to tackle complex rearrangement tasks by enabling it to modify the environment in its favor in a constraint-aware fashion. To this end, we describe a Push-Traversability graph, whose vertices represent poses from which the pusher can push objects and whose edges represent optimal, kinematically feasible, and stable push-rearrangements of objects. Based on this graph, we develop ReloPush, a planning framework that leverages Dubins curves and standard graph search techniques to generate an efficient sequence of object rearrangements to be executed by the pusher. We evaluate ReloPush across a series of challenging scenarios, involving the rearrangement of densely cluttered workspaces with up to eight objects by a 1/10th-scale mobile robot pusher. ReloPush exhibits orders of magnitude faster runtimes and significantly more robust execution in the real world, evidenced in lower execution times and fewer losses of object contact, compared to two baselines lacking our proposed graph structure.
Abstract:The University of Michigan Robotics program focuses on the study of embodied intelligence that must sense, reason, act, and work with people to improve quality of life and productivity equitably across society. ROB 204, part of the core curriculum towards the undergraduate degree in Robotics, introduces students to topics that enable conceptually designing a robotic system to address users' needs from a sociotechnical context. Students are introduced to human-robot interaction (HRI) concepts and the process for socially-engaged design with a Learn-Reinforce-Integrate approach. In this paper, we discuss the course topics and our teaching methodology, and provide recommendations for delivering this material. Overall, students leave the course with a new understanding and appreciation for how human capabilities can inform requirements for a robotics system, how humans can interact with a robot, and how to assess the usability of robotic systems.
Abstract:Social robot navigation algorithms are often demonstrated in overly simplified scenarios, prohibiting the extraction of practical insights about their relevance to real-world domains. Our key insight is that an understanding of the inherent complexity of a social robot navigation scenario could help characterize the limitations of existing navigation algorithms and provide actionable directions for improvement. Through an exploration of recent literature, we identify a series of factors contributing to the complexity of a scenario, distinguishing between contextual and robot-related ones. We then conduct a simulation study investigating how manipulations of contextual factors impact the performance of a variety of navigation algorithms. We find that dense and narrow environments correlate most strongly with performance drops, while the heterogeneity of agent policies and directionality of interactions have a less pronounced effect. This motivates a shift towards developing and testing algorithms under higher-complexity settings.
Abstract:Off-road vehicles are susceptible to rollovers in terrains with large elevation features, such as steep hills, ditches, and berms. One way to protect them against rollovers is ruggedization through the use of industrial-grade parts and physical modifications. However, this solution can be prohibitively expensive for academic research labs. Our key insight is that a software-based rollover-prevention system (RPS) enables the use of commercial-off-the-shelf hardware parts that are cheaper than their industrial counterparts, thus reducing overall cost. In this paper, we present HOUND, a small-scale, inexpensive, off-road autonomy platform that can handle challenging outdoor terrains at high speeds through the integration of an RPS. HOUND is integrated with a complete stack for perception and control, geared towards aggressive off-road driving. We deploy HOUND in the real world, at high speeds, on four different terrains covering 50 km of driving and highlight its utility in preventing rollovers and traversing difficult terrain. Additionally, through integration with BeamNG, a state-of-the-art driving simulator, we demonstrate a significant reduction in rollovers without compromising turning ability across a series of simulated experiments. Supplementary material can be found on our website, where we will also release all design documents for the platform: https://sites.google.com/view/prl-hound.
Abstract:Cooking recipes are especially challenging to translate to robot plans as they feature rich linguistic complexity, temporally extended interconnected tasks, and an almost infinite space of possible actions. Our key insight is that combining a source of background cooking domain knowledge with a formalism capable of handling the temporal richness of cooking recipes could enable the extraction of unambiguous, robot-executable plans. In this work, we use Linear Temporal Logic (LTL) as a formal language expressive enough to model the temporal nature of cooking recipes. Leveraging pre-trained Large Language Models (LLMs), we present a system that translates instruction steps from an arbitrary cooking recipe found on the internet to a series of LTL formulae, grounding high-level cooking actions to a set of primitive actions that are executable by a manipulator in a kitchen environment. Our approach makes use of a caching scheme, dynamically building a queryable action library at runtime, significantly decreasing LLM API calls (-51%), latency (-59%), and cost (-42%) compared to a baseline that queries the LLM for every newly encountered action at runtime. We demonstrate the transferability of our system in a realistic simulation platform by showcasing a set of simple cooking tasks.
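The caching idea in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `ActionLibrary` class, the `fake_llm` placeholder, and the string format of the primitive actions are all hypothetical, and a real system would replace `fake_llm` with an actual LLM query. The sketch only shows how a library built up at runtime avoids repeated queries for actions that were already translated.

```python
from typing import Callable, Dict, List


class ActionLibrary:
    """Cache mapping a high-level action name to its primitive-action expansion."""

    def __init__(self, translate: Callable[[str], List[str]]) -> None:
        self._translate = translate  # hypothetical stand-in for an LLM query
        self._cache: Dict[str, List[str]] = {}
        self.llm_calls = 0  # number of times the translator was invoked

    def lookup(self, action: str) -> List[str]:
        # Normalize so that "Stir" and "stir " share one cache entry.
        key = action.strip().lower()
        if key not in self._cache:
            self.llm_calls += 1
            self._cache[key] = self._translate(key)
        return self._cache[key]


def fake_llm(action: str) -> List[str]:
    # Hypothetical placeholder; a real system would query an LLM here.
    return [f"locate({action})", f"execute({action})"]


library = ActionLibrary(fake_llm)
for step in ["stir", "pour", "Stir", "stir "]:
    library.lookup(step)
```

In this toy run, four lookups trigger only two translator calls, because the repeated "stir" variants hit the cache.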
Abstract:We focus on the problem of rearranging a set of objects with a team of car-like robot pushers built using off-the-shelf components. Maintaining control of pushed objects while avoiding collisions in a tight space demands highly coordinated motion that is challenging to execute on constrained hardware. Centralized replanning approaches become intractable even for small-sized problems, whereas decentralized approaches often get stuck in deadlocks. Our key insight is that by carefully assigning pushing tasks to robots, we could reduce the complexity of the rearrangement task, enabling robust performance via scalable decentralized control. Based on this insight, we built PuSHR, a system that optimally assigns pushing tasks and trajectories to robots offline, and performs trajectory tracking via decentralized control online. Through an ablation study in simulation, we demonstrate that PuSHR dominates baselines ranging from purely decentralized to fully centralized in terms of success rate and time efficiency across challenging tasks with up to 4 robots. Hardware experiments demonstrate the transfer of our system to the real world and highlight its robustness to model inaccuracies. Our code can be found at https://github.com/prl-mushr/pushr, and videos from our experiments at https://youtu.be/DIWmZerF_O8.
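To make the notion of offline task assignment concrete, here is a minimal sketch. It does not reproduce PuSHR's actual formulation: it assumes a hypothetical square cost matrix `cost[r][t]` (for example, an estimated path length for robot `r` to perform pushing task `t`) and finds the minimum-total-cost one-to-one assignment by brute force, which is only practical for small teams such as the up-to-4-robot setting mentioned above.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def assign_tasks(cost: Sequence[Sequence[float]]) -> Tuple[List[int], float]:
    """Return (assignment, total), where assignment[r] is the task given to robot r.

    Assumes a square matrix: as many tasks as robots (an illustrative simplification).
    """
    n = len(cost)
    best_perm: Tuple[int, ...] = tuple(range(n))
    best_total = float("inf")
    # Enumerate every one-to-one mapping of robots to tasks.
    for perm in permutations(range(n)):
        total = sum(cost[r][perm[r]] for r in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    return list(best_perm), best_total


# Hypothetical costs: rows are robots, columns are pushing tasks.
cost = [
    [4.0, 1.0, 3.0],
    [2.0, 0.5, 5.0],
    [3.0, 2.0, 2.0],
]
assignment, total = assign_tasks(cost)
```

For larger teams, a polynomial-time method such as the Hungarian algorithm would be the usual substitute for the exhaustive search used here.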
Abstract:We focus on robot navigation in crowded environments. To navigate safely and efficiently within crowds, robots need models for crowd motion prediction. Building such models is hard due to the high dimensionality of multiagent domains and the challenge of collecting or simulating interaction-rich crowd-robot demonstrations. While there has been important progress on models for offline pedestrian motion forecasting, transferring their performance to real robots is nontrivial due to close interaction settings and novelty effects on users. In this paper, we investigate the utility of a recent state-of-the-art motion prediction model (S-GAN) for crowd navigation tasks. We incorporate this model into a model predictive controller (MPC) and deploy it on a self-balancing robot, which we subject to a diverse range of crowd behaviors in the lab. We demonstrate that while S-GAN motion prediction accuracy transfers to the real world, its value is not reflected in navigation performance, measured with respect to safety and efficiency; in fact, the MPC performs indistinguishably even when using a simple constant-velocity prediction model, suggesting that substantial model improvements might be needed to yield significant gains for crowd navigation tasks. Footage from our experiments can be found at https://youtu.be/mzFiXg8KsZ0.
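For readers unfamiliar with the constant-velocity baseline mentioned in this abstract, a minimal sketch follows. The function name, the two-observation velocity estimate, and the parameters `dt` and `horizon` are illustrative assumptions, not details taken from the paper: the velocity is estimated from the last two observed positions and extrapolated in a straight line.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def predict_constant_velocity(
    prev: Point, curr: Point, dt: float, horizon: int
) -> List[Point]:
    """Extrapolate future positions assuming the agent keeps its current velocity."""
    # Finite-difference velocity estimate from the last two observations.
    vx = (curr[0] - prev[0]) / dt
    vy = (curr[1] - prev[1]) / dt
    # Position after k time steps of length dt, for k = 1..horizon.
    return [
        (curr[0] + vx * dt * k, curr[1] + vy * dt * k)
        for k in range(1, horizon + 1)
    ]


# A pedestrian observed at (0, 0) and then at (0.5, 0.25), 0.5 s apart.
future = predict_constant_velocity(prev=(0.0, 0.0), curr=(0.5, 0.25), dt=0.5, horizon=3)
```

A predictive controller could consume such a list of predicted positions per pedestrian when scoring candidate robot trajectories.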
Abstract:Highly articulated organisms serve as blueprints for incredibly dexterous mechanisms, but building similarly capable robotic counterparts has been hindered by the difficulties of developing electromechanical actuators with both the high strength and compactness of biological muscle. We develop a stackable electrostatic brake that has comparable specific tension and weight to that of muscles and integrate it into a robotic joint. Compared to electromechanical motors, our brake-equipped joint is four times lighter and one thousand times more power efficient while exerting similar holding torques. Our joint design enables a ten degree-of-freedom robot equipped with only one motor to manipulate multiple objects simultaneously. We also show that the use of brakes allows a two-fingered robot to perform in-hand re-positioning of an object 45% more quickly and with 53% lower positioning error than without brakes. Relative to fully actuated robots, our findings suggest that robots equipped with such electrostatic brakes will have lower weight, volume, and power consumption yet retain the ability to reach arbitrary joint configurations.
Abstract:During in-hand manipulation, robots must be able to continuously estimate the pose of the object in order to generate appropriate control actions. The performance of algorithms for pose estimation hinges on the robot's sensors being able to detect discriminative geometric object features, but previous sensing modalities are unable to make such measurements robustly. The robot's fingers can occlude the view of environment- or robot-mounted image sensors, and tactile sensors can only measure at the local areas of contact. Motivated by fingertip-embedded proximity sensors' robustness to occlusion and ability to measure beyond the local areas of contact, we present the first evaluation of proximity sensor based pose estimation for in-hand manipulation. We develop a novel two-fingered hand with fingertip-embedded optical time-of-flight proximity sensors as a testbed for pose estimation during planar in-hand manipulation. Here, the in-hand manipulation task consists of the robot moving a cylindrical object from one end of its workspace to the other. We demonstrate, with statistical significance, that proximity-sensor based pose estimation via particle filtering during in-hand manipulation: a) exhibits 50% lower average pose error than a tactile-sensor based baseline; b) empowers a model predictive controller to achieve 30% lower final positioning error compared to when using tactile-sensor based pose estimates.
Abstract:We focus on the problem of analyzing multiagent interactions in traffic domains. Understanding the space of behavior of real-world traffic may offer significant advantages for algorithmic design, data-driven methodologies, and benchmarking. However, the high dimensionality of the space and the stochasticity of human behavior may hinder the identification of important interaction patterns. Our key insight is that traffic environments feature significant geometric and temporal structure, leading to highly organized collective behaviors, often drawn from a small set of dominant modes. In this work, we propose a representation based on the formalism of topological braids that can summarize arbitrarily complex multiagent behavior into a compact object of dual geometric and symbolic nature, capturing critical events of interaction. This representation allows us to formally enumerate the space of outcomes in a traffic scene and characterize their complexity. We illustrate the value of the proposed representation in summarizing critical aspects of real-world traffic behavior through a case study on recent driving datasets. We show that despite the density of real-world traffic, observed behavior tends to follow highly organized patterns of low interaction. Our framework may be a valuable tool for evaluating the richness of driving datasets, but also for synthetically designing balanced training datasets or benchmarks.