Abstract:Human safety is critical in applications involving close human-robot interactions (HRI) and is a key aspect of physical compatibility between humans and robots. While measures of human safety in HRI exist, these mainly target industrial settings involving robotic manipulators. Less attention has been paid to settings where mobile robots and humans share the space. This paper introduces a new robot-centered directional framework of human safety. It is particularly useful for evaluating mobile robots as they operate in environments populated by multiple humans. The framework integrates several key metrics, such as each human's relative distance, speed, and orientation. The core novelty lies in the framework's flexibility to accommodate different application requirements while allowing for both the robot-centered and external observer points of view. We instantiate the framework by using RGB-D based vision integrated with a deep learning-based human detection pipeline to yield a generalized safety index (GSI) that instantaneously assesses human safety. We evaluate GSI's capability of producing appropriate, robust, and fine-grained safety measures in real-world experimental scenarios and compare its performance with extant safety models.
Abstract:This paper presents a novel approach to range-based cooperative localization for robot swarms in GPS-denied environments, addressing the limitations of current methods in noisy and sparse settings. We propose a robust multi-layered localization framework that combines shadow edge localization techniques with the strategic deployment of UAVs. This approach not only addresses the challenges associated with nonrigid and poorly connected graphs but also enhances the convergence rate of the localization process. We introduce two key concepts: the S1-Edge approach in our distributed protocol to address the rigidity problem of sparse graphs and the concept of a powerful UAV node to increase the sensing and localization capability of the multi-robot system. Our approach leverages the advantages of the distributed localization methods, enhancing scalability and adaptability in large robot networks. We establish theoretical conditions for the new S1-Edge that ensure solutions exist even in the presence of noise, thereby validating the effectiveness of shadow edge localization. Extensive simulation experiments confirm the superior performance of our method compared to state-of-the-art techniques, resulting in up to 95\% reduction in localization error, demonstrating substantial improvements in localization accuracy and robustness to sparse graphs. This work provides a decisive advancement in the field of multi-robot localization, offering a powerful tool for high-performance and reliable operations in challenging environments.
Abstract:We propose a distributed control law for a heterogeneous multi-robot coverage problem, where the robots could have different energy characteristics, such as capacity and depletion rates, due to their varying sizes, speeds, capabilities, and payloads. Existing energy-aware coverage control laws consider capacity differences but assume the battery depletion rate to be the same for all robots. In realistic scenarios, however, some robots can consume energy much faster than other robots; for instance, UAVs hover at different altitudes, and these changes could be dynamically updated based on their assigned tasks. Robots' energy capacities and depletion rates need to be considered to maximize the performance of a multi-robot system. To this end, we propose a new energy-aware controller based on Lloyd's algorithm to adapt the weights of the robots based on their energy dynamics and divide the area of interest among the robots accordingly. The controller is theoretically analyzed and extensively evaluated through simulations and real-world demonstrations in multiple realistic scenarios and compared with three baseline control laws to validate its performance and efficacy.
Abstract:In indoor environments, multi-robot visual (RGB-D) mapping and exploration hold immense potential for application in domains such as domestic service and logistics, where deploying multiple robots in the same environment can significantly enhance efficiency. However, there are two primary challenges: (1) the "ghosting trail" effect, which occurs due to overlapping views of robots impacting the accuracy and quality of point cloud reconstruction, and (2) the oversight of visual reconstructions in selecting the most effective frontiers for exploration. Given these challenges are interrelated, we address them together by proposing a new semi-distributed framework (SPACE) for spatial cooperation in indoor environments that enables enhanced coverage and 3D mapping. SPACE leverages geometric techniques, including "mutual awareness" and a "dynamic robot filter," to overcome spatial mapping constraints. Additionally, we introduce a novel spatial frontier detection system and map merger, integrated with an adaptive frontier assigner for optimal coverage balancing the exploration and reconstruction objectives. In extensive ROS-Gazebo simulations, SPACE demonstrated superior performance over state-of-the-art approaches in both exploration and mapping metrics.
Abstract:The use of autonomous systems in medical evacuation (MEDEVAC) scenarios is promising, but existing implementations overlook key insights from human-robot interaction (HRI) research. Studies on human-machine teams demonstrate that human perceptions of a machine teammate are critical in governing the machine's performance. Here, we present a mixed factorial design to assess human perceptions of a MEDEVAC robot in a simulated evacuation scenario. Participants were assigned to the role of casualty (CAS) or bystander (BYS) and subjected to three within-subjects conditions based on the MEDEVAC robot's operating mode: autonomous-slow (AS), autonomous-fast (AF), and teleoperation (TO). During each trial, a MEDEVAC robot navigated an 11-meter path, acquiring a casualty and transporting them to an ambulance exchange point while avoiding an idle bystander. Following each trial, subjects completed a questionnaire measuring their emotional states, perceived safety, and social compatibility with the robot. Results indicate a consistent main effect of operating mode on reported emotional states and perceived safety. Pairwise analyses suggest that the employment of the AF operating mode negatively impacted perceptions along these dimensions. There were no persistent differences between casualty and bystander responses.
Abstract:Offloading time-sensitive, computationally intensive tasks-such as advanced learning algorithms for autonomous driving-from vehicles to nearby edge servers, vehicle-to-infrastructure (V2I) systems, or other collaborating vehicles via vehicle-to-vehicle (V2V) communication enhances service efficiency. However, whence traversing the path to the destination, the vehicle's mobility necessitates frequent handovers among the access points (APs) to maintain continuous and uninterrupted wireless connections to maintain the network's Quality of Service (QoS). These frequent handovers subsequently lead to task migrations among the edge servers associated with the respective APs. This paper addresses the joint problem of task migration and access-point handover by proposing a deep reinforcement learning framework based on the Deep Deterministic Policy Gradient (DDPG) algorithm. A joint allocation method of communication and computation of APs is proposed to minimize computational load, service latency, and interruptions with the overarching goal of maximizing QoS. We implement and evaluate our proposed framework on simulated experiments to achieve smooth and seamless task switching among edge servers, ultimately reducing latency.
Abstract:Multi-robot coverage is crucial in numerous applications, including environmental monitoring, search and rescue operations, and precision agriculture. In modern applications, a multi-robot team must collaboratively explore unknown spatial fields in GPS-denied and extreme environments where global localization is unavailable. Coverage algorithms typically assume that the robot positions and the coverage environment are defined in a global reference frame. However, coordinating robot motion and ensuring coverage of the shared convex workspace without global localization is challenging. This paper proposes a novel anchor-oriented coverage (AOC) approach to generate dynamic localized Voronoi partitions based around a common anchor position. We further propose a consensus-based coordination algorithm that achieves agreement on the coverage workspace around the anchor in the robots' relative frames of reference. Through extensive simulations and real-world experiments, we demonstrate that the proposed anchor-oriented approach using localized Voronoi partitioning performs as well as the state-of-the-art coverage controller using GPS.
Abstract:Classification of different object surface material types can play a significant role in the decision-making algorithms for mobile robots and autonomous vehicles. RGB-based scene-level semantic segmentation has been well-addressed in the literature. However, improving material recognition using the depth modality and its integration with SLAM algorithms for 3D semantic mapping could unlock new potential benefits in the robotics perception pipeline. To this end, we propose a complementarity-aware deep learning approach for RGB-D-based material classification built on top of an object-oriented pipeline. The approach further integrates the ORB-SLAM2 method for 3D scene mapping with multiscale clustering of the detected material semantics in the point cloud map generated by the visual SLAM algorithm. Extensive experimental results with existing public datasets and newly contributed real-world robot datasets demonstrate a significant improvement in material classification and 3D clustering accuracy compared to state-of-the-art approaches for 3D semantic scene mapping.
Abstract:Robot systems in education can leverage Large language models' (LLMs) natural language understanding capabilities to provide assistance and facilitate learning. This paper proposes a multimodal interactive robot (PhysicsAssistant) built on YOLOv8 object detection, cameras, speech recognition, and chatbot using LLM to provide assistance to students' physics labs. We conduct a user study on ten 8th-grade students to empirically evaluate the performance of PhysicsAssistant with a human expert. The Expert rates the assistants' responses to student queries on a 0-4 scale based on Bloom's taxonomy to provide educational support. We have compared the performance of PhysicsAssistant (YOLOv8+GPT-3.5-turbo) with GPT-4 and found that the human expert rating of both systems for factual understanding is the same. However, the rating of GPT-4 for conceptual and procedural knowledge (3 and 3.2 vs 2.2 and 2.6, respectively) is significantly higher than PhysicsAssistant (p < 0.05). However, the response time of GPT-4 is significantly higher than PhysicsAssistant (3.54 vs 1.64 sec, p < 0.05). Hence, despite the relatively lower response quality of PhysicsAssistant than GPT-4, it has shown potential for being used as a real-time lab assistant to provide timely responses and can offload teachers' labor to assist with repetitive tasks. To the best of our knowledge, this is the first attempt to build such an interactive multimodal robotic assistant for K-12 science (physics) education.
Abstract:To tackle the "reality gap" encountered in Sim-to-Real transfer, this study proposes a diffusion-based framework that minimizes inconsistencies in grasping actions between the simulation settings and realistic environments. The process begins by training an adversarial supervision layout-to-image diffusion model(ALDM). Then, leverage the ALDM approach to enhance the simulation environment, rendering it with photorealistic fidelity, thereby optimizing robotic grasp task training. Experimental results indicate this framework outperforms existing models in both success rates and adaptability to new environments through improvements in the accuracy and reliability of visual grasping actions under a variety of conditions. Specifically, it achieves a 75\% success rate in grasping tasks under plain backgrounds and maintains a 65\% success rate in more complex scenarios. This performance demonstrates this framework excels at generating controlled image content based on text descriptions, identifying object grasp points, and demonstrating zero-shot learning in complex, unseen scenarios.