Technical University Munich
Abstract:Science laboratory automation enables accelerated discovery in life sciences and materials. However, it requires interdisciplinary collaboration to address challenges such as robust and flexible autonomy, reproducibility, throughput, standardization, the role of human scientists, and ethics. This article highlights these issues, reflecting perspectives from leading experts in laboratory automation across different disciplines of the natural sciences.
Abstract:Manipulability analysis is a methodology employed to assess the capacity of an articulated system, at a specific configuration, to produce motion or exert force in diverse directions. The conventional method entails generating a virtual ellipsoid using the system's configuration and model. Yet, this approach poses challenges when applied to systems such as the human body, where direct access to such information is limited, necessitating reliance on estimations. Any inaccuracies in these estimations can distort the ellipsoid's configuration, potentially compromising the accuracy of the manipulability assessment. To address this issue, this article extends the standard approach by introducing the concept of the manipulability pseudo-ellipsoid. Through a series of theoretical analyses, simulations, and experiments, the article demonstrates that the proposed method exhibits reduced sensitivity to noise in sensory information, consequently enhancing the robustness of the approach.
Abstract:This paper is about generating motion plans for high degree-of-freedom systems that account for collisions along the entire body. A particular class of mathematical programs with complementarity constraints become useful in this regard. Optimization-based planners can tackle confined-space trajectory planning while being cognizant of robot constraints. However, introducing obstacles in this setting transforms the formulation into a non-convex problem (oftentimes with ill-posed bilinear constraints), which is non-trivial in a real-time setting. To this end, we present the FLIQC (Fast LInear Quadratic Complementarity based) motion planner. Our planner employs a novel motion model that captures the entire rigid robot as well as the obstacle geometry and ensures non-penetration between the surfaces due to the imposed constraint. We perform thorough comparative studies with the state-of-the-art, which demonstrate improved performance. Extensive simulation and hardware experiments validate our claim of generating continuous and reactive motion plans at 1 kHz for modern collaborative robots with constant minimal parameters.
Abstract:Recent advancements in robotics have transformed industries such as manufacturing, logistics, surgery, and planetary exploration. A key challenge is developing efficient motion planning algorithms that allow robots to navigate complex environments while avoiding collisions and optimizing metrics like path length, sweep area, execution time, and energy consumption. Among the available algorithms, sampling-based methods have gained the most traction in both research and industry due to their ability to handle complex environments, explore free space, and offer probabilistic completeness along with other formal guarantees. Despite their widespread application, significant challenges still remain. To advance future planning algorithms, it is essential to review the current state-of-the-art solutions and their limitations. In this context, this work aims to shed light on these challenges and assess the development and applicability of sampling-based methods. Furthermore, we aim to provide an in-depth analysis of the design and evaluation of ten of the most popular planners across various scenarios. Our findings highlight the strides made in sampling-based methods while underscoring persistent challenges. This work offers an overview of the important ongoing research in robotic motion planning.
Abstract:A typical manipulation task consists of a manipulator equipped with a gripper to grasp and move an object with constraints on the motion of the hand-held object, which may be due to the nature of the task itself or from object-environment contacts. In this paper, we study the problem of computing joint torques and grasping forces for time-optimal motion of an object, while ensuring that the grasp is not lost and any constraints on the motion of the object, either due to dynamics, environment contact, or no-slip requirements, are also satisfied. We present a second-order cone program (SOCP) formulation of the time-optimal trajectory planning problem that considers nonlinear friction cone constraints at the hand-object and object-environment contacts. Since SOCPs are convex optimization problems that can be solved optimally in polynomial time using interior point methods, we can solve the trajectory optimization problem efficiently. We present simulation results on three examples, including a non-prehensile manipulation task, which shows the generality and effectiveness of our approach.
Abstract:Large Language Models (LLMs) have gained popularity in task planning for long-horizon manipulation tasks. To enhance the validity of LLM-generated plans, visual demonstrations and online videos have been widely employed to guide the planning process. However, for manipulation tasks involving subtle movements but rich contact interactions, visual perception alone may be insufficient for the LLM to fully interpret the demonstration. Additionally, visual data provides limited information on force-related parameters and conditions, which are crucial for effective execution on real robots. In this paper, we introduce an in-context learning framework that incorporates tactile and force-torque information from human demonstrations to enhance LLMs' ability to generate plans for new task scenarios. We propose a bootstrapped reasoning pipeline that sequentially integrates each modality into a comprehensive task plan. This task plan is then used as a reference for planning in new task configurations. Real-world experiments on two different sequential manipulation tasks demonstrate the effectiveness of our framework in improving LLMs' understanding of multi-modal demonstrations and enhancing the overall planning performance.
Abstract:Assembly is a crucial skill for robots in both modern manufacturing and service robotics. However, mastering transferable insertion skills that can handle a variety of high-precision assembly tasks remains a significant challenge. This paper presents a novel framework that utilizes diffusion models to generate 6D wrench for high-precision tactile robotic insertion tasks. It learns from demonstrations performed on a single task and achieves a zero-shot transfer success rate of 95.7% across various novel high-precision tasks. Our method effectively inherits the self-adaptability demonstrated by our previous work. In this framework, we address the frequency misalignment between the diffusion policy and the real-time control loop with a dynamic system-based filter, significantly improving the task success rate by 9.15%. Furthermore, we provide a practical guideline regarding the trade-off between diffusion models' inference ability and speed.
Abstract:In this study, we present a novel method for enhancing the computational efficiency of whole-body control for humanoid robots, a challenge accentuated by their high degrees of freedom. The reduced-dimension rigid body dynamics of a floating base robot is constructed by segmenting its kinematic chain into constrained and unconstrained chains, simplifying the dynamics of the unconstrained chain through the centroidal dynamics. The proposed dynamics model is possible to be applied to whole-body control methods, allowing the problem to be divided into two parts for more efficient computation. The efficiency of the framework is demonstrated by comparative experiments in simulations. The calculation results demonstrate a significant reduction in processing time, highlighting an improvement over the times reported in current methodologies. Additionally, the results also shows the computational efficiency increases as the degrees of freedom of robot model increases.
Abstract:Robotic assembly tasks are open challenges due to the long task horizon and complex part relations. Behavior trees (BTs) are increasingly used in robot task planning for their modularity and flexibility, but manually designing them can be effort-intensive. Large language models (LLMs) have recently been applied in robotic task planning for generating action sequences, but their ability to generate BTs has not been fully investigated. To this end, We propose LLM as BT-planner, a novel framework to leverage LLMs for BT generation in robotic assembly task planning and execution. Four in-context learning methods are introduced to utilize the natural language processing and inference capabilities of LLMs to produce task plans in BT format, reducing manual effort and ensuring robustness and comprehensibility. We also evaluate the performance of fine-tuned, fewer-parameter LLMs on the same tasks. Experiments in simulated and real-world settings show that our framework enhances LLMs' performance in BT generation, improving success rates in BT generation through in-context learning and supervised fine-tuning.
Abstract:In this work, we propose an LLM-based BT generation framework to leverage the strengths of both for sequential manipulation planning. To enable human-robot collaborative task planning and enhance intuitive robot programming by nonexperts, the framework takes human instructions to initiate the generation of action sequences and human feedback to refine BT generation in runtime. All presented methods within the framework are tested on a real robotic assembly example, which uses a gear set model from the Siemens Robot Assembly Challenge. We use a single manipulator with a tool-changing mechanism, a common practice in flexible manufacturing, to facilitate robust grasping of a large variety of objects. Experimental results are evaluated regarding success rate, logical coherence, executability, time consumption, and token consumption. To our knowledge, this is the first human-guided LLM-based BT generation framework that unifies various plausible ways of using LLMs to fully generate BTs that are executable on the real testbed and take into account granular knowledge of tool use.