Abstract: We propose a new method, named LoD-Loc, for aerial visual localization. Unlike existing localization algorithms, LoD-Loc does not rely on complex 3D representations; instead, it estimates the pose of an Unmanned Aerial Vehicle (UAV) using a Level-of-Detail (LoD) 3D map. LoD-Loc achieves this mainly by aligning the wireframe derived from the projected LoD model with the wireframe predicted by a neural network. Specifically, given a coarse pose provided by the UAV's sensors, LoD-Loc hierarchically builds a cost volume over uniformly sampled pose hypotheses to describe the pose probability distribution and selects the pose with maximum probability. Each cost within this volume measures the degree of line alignment between the projected and predicted wireframes. LoD-Loc also devises a 6-DoF pose optimization algorithm that refines this result with a differentiable Gauss-Newton method. As no public dataset exists for the studied problem, we collect two datasets with map levels of LoD3.0 and LoD2.0, along with real RGB queries and ground-truth pose annotations. We benchmark our method and demonstrate that LoD-Loc achieves excellent performance, even surpassing current state-of-the-art methods that use textured 3D models for localization. The code and dataset are available at https://victorzoo.github.io/LoD-Loc.github.io/.
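The scoring step can be pictured with a minimal NumPy sketch (all names, dimensions, and the pose sampling are illustrative, not the released implementation): each sampled pose projects 3D wireframe sample points into the image, and its cost-volume entry is the mean value of the network-predicted wireframe probability map at those pixels.

```python
import numpy as np

def project_points(points_3d, pose, K):
    """Project 3D wireframe sample points with a hypothesised camera pose.
    pose: 4x4 world-to-camera transform, K: 3x3 intrinsics (illustrative)."""
    pts_h = np.c_[points_3d, np.ones(len(points_3d))]   # homogeneous coordinates
    cam = (pose @ pts_h.T)[:3]                           # camera-frame points
    uv = (K @ cam)[:2] / cam[2]                           # perspective divide
    return uv.T, cam[2]

def alignment_cost(wireframe_prob, points_3d, pose, K):
    """Mean predicted wireframe probability at the projected samples;
    higher means better line alignment (one cell of the cost volume)."""
    h, w = wireframe_prob.shape
    uv, depth = project_points(points_3d, pose, K)
    uv = np.round(uv).astype(int)
    valid = (depth > 0) & (uv[:, 0] >= 0) & (uv[:, 0] < w) \
            & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    if not valid.any():
        return 0.0
    return wireframe_prob[uv[valid, 1], uv[valid, 0]].mean()

def select_best_pose(wireframe_prob, points_3d, pose_hypotheses, K):
    """Pick the hypothesis with maximum alignment probability."""
    costs = [alignment_cost(wireframe_prob, points_3d, T, K) for T in pose_hypotheses]
    return pose_hypotheses[int(np.argmax(costs))], costs
```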
Abstract: The proliferation of sensors brings an immense volume of spatio-temporal (ST) data in many domains for various purposes, including monitoring, diagnostics, and prognostics applications. Data curation is a time-consuming process for such large volumes of data, making it challenging and expensive to deploy data analytics platforms in new environments. Transfer learning (TL) mechanisms promise to mitigate data sparsity and model complexity by utilizing pre-trained models for a new task. Despite the triumph of TL in fields such as computer vision and natural language processing, efforts on complex ST models for anomaly detection (AD) applications remain limited. In this study, we present the potential of TL within the context of AD for the Hadron Calorimeter of the Compact Muon Solenoid experiment at CERN. We have transferred ST AD models trained on data collected from one part of the calorimeter to another. We have investigated different TL configurations on the semi-supervised autoencoders of the ST AD models, transferring the convolutional, graph, and recurrent neural networks of both the encoder and decoder networks. The experimental results demonstrate that TL effectively enhances model learning accuracy on a target subdetector. TL achieves promising data reconstruction and AD performance while substantially reducing the trainable parameters of the AD models, and it also improves robustness against anomaly contamination in the training data sets of the semi-supervised AD models.
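A minimal PyTorch sketch of one such TL configuration, assuming a generic encoder-decoder stand-in rather than the actual CNN/GNN/RNN architecture: source-subdetector weights are loaded and the encoder is optionally frozen so that only the remaining layers are fine-tuned on the target subdetector, which is what reduces the trainable parameter count.

```python
import torch
import torch.nn as nn

class STAutoencoder(nn.Module):
    """Illustrative autoencoder stand-in; the actual models combine
    convolutional, graph, and recurrent layers."""
    def __init__(self, in_dim=64, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(),
                                     nn.Linear(32, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(),
                                     nn.Linear(32, in_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def transfer(source_weights_path, freeze_encoder=True):
    """One TL configuration: reuse source-subdetector weights, optionally
    freeze the encoder, and fine-tune the rest on the target subdetector."""
    model = STAutoencoder()
    model.load_state_dict(torch.load(source_weights_path))
    if freeze_encoder:
        for p in model.encoder.parameters():
            p.requires_grad = False          # fewer trainable parameters
    return model
```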
Abstract: This paper presents the development and assessment of a teleoperation framework for robot-assisted minimally invasive surgery (MIS). The framework leverages our novel integration of an adaptive remote center of motion (RCM) using admittance control, and it operates within a redundancy resolution method specifically designed for the RCM constraint. We introduce a compact, low-cost, and modular custom-designed instrument module (IM) that ensures integration with the manipulator, featuring a force-torque sensor, a surgical instrument, and an actuation unit for driving the surgical instrument. The paper details the complete teleoperation framework, including the telemanipulation trajectory mapping, kinematic modelling, control strategy, and the integrated admittance controller. Finally, the system's capability to perform various surgical tasks is demonstrated, including passing a thread through rings, picking and placing objects, and trajectory tracking.
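As an illustration of how an RCM constraint can be honored inside a redundancy resolution scheme, the sketch below uses a generic task-priority formulation (not the paper's specific controller, and all symbols are assumptions): the RCM motion task is satisfied first, and the commanded tool twist is tracked in its null space.

```python
import numpy as np

def task_priority_rcm(J_rcm, xdot_rcm, J_tool, xdot_tool, damping=1e-3):
    """Joint velocities that keep the RCM constraint (highest priority) while
    tracking the commanded tool twist in the remaining null space."""
    def dls_pinv(J):
        # damped least-squares pseudoinverse for robustness near singularities
        return J.T @ np.linalg.inv(J @ J.T + damping * np.eye(J.shape[0]))

    J1_pinv = dls_pinv(J_rcm)
    qdot_1 = J1_pinv @ xdot_rcm                      # satisfy RCM motion (usually zero)
    N1 = np.eye(J_rcm.shape[1]) - J1_pinv @ J_rcm    # null-space projector of the RCM task
    J2_bar = J_tool @ N1
    qdot_2 = dls_pinv(J2_bar) @ (xdot_tool - J_tool @ qdot_1)
    return qdot_1 + N1 @ qdot_2
```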
Abstract: This paper proposes a hybrid force-motion framework that utilizes real-time surface normal updates. The surface normal is estimated via a novel method that leverages force-sensing measurements and velocity commands to compensate for the friction bias. This approach is critical for the robust execution of precision force-controlled tasks in manufacturing, such as thermoplastic tape replacement, which traces surfaces or paths on a workpiece subject to uncertainties that deviate from the model. We formulated the proposed method and implemented the framework in a ROS 2 environment. The approach was validated using kinematic simulations and a hardware platform; specifically, we demonstrated it on a 7-DoF manipulator equipped with a force/torque sensor at the end-effector.
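A hedged sketch of the idea, under an assumed Coulomb friction model with illustrative parameters (not the paper's estimator): the friction component is taken to act opposite to the commanded sliding velocity, removed from the measured contact force, and the remainder is normalized to give the surface normal.

```python
import numpy as np

def estimate_surface_normal(f_measured, v_command, mu=0.2, eps=1e-9):
    """Estimate the surface normal from a force reading, removing the friction
    component assumed to oppose the commanded sliding velocity.
    mu is an assumed Coulomb friction coefficient (illustrative)."""
    v_norm = np.linalg.norm(v_command)
    if v_norm < eps:
        # no sliding: the measured contact force is taken as purely normal
        return f_measured / (np.linalg.norm(f_measured) + eps)
    v_dir = v_command / v_norm
    # split |f| into normal and tangential parts via |f_t| = mu * |f_n|
    f_normal_mag = np.linalg.norm(f_measured) / np.sqrt(1.0 + mu**2)
    f_friction = -mu * f_normal_mag * v_dir        # friction opposes sliding
    f_normal = f_measured - f_friction             # remove the friction bias
    return f_normal / (np.linalg.norm(f_normal) + eps)
```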
Abstract: In laparoscopic robot-assisted minimally invasive surgery, the kinematic control of the robot is subject to the remote center of motion (RCM) constraint at the port of entry (e.g., trocar) into the patient's body. During surgery, after the instrument is inserted through the trocar, intrinsic physiological movements such as the patient's heartbeat, breathing, and/or other purposeful body repositioning may displace the port of entry. This can cause a conflict between the registered RCM and the moved port of entry. To mitigate this conflict, we seek to utilize the interaction forces at the RCM. We develop a novel framework that integrates admittance control into a redundancy resolution method for the RCM kinematic constraint. Using the force/torque sensory feedback at the base of the instrument driving mechanism (IDM), the proposed framework estimates the forces at the RCM, rejects forces applied at other locations along the instrument, and uses the estimates in the admittance controller. In this paper, we report analyses from kinematic simulations to validate the proposed framework. In addition, a hardware platform has been completed, and future work is planned for experimental validation.
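The admittance part can be sketched as a simple mass-damper law that converts the estimated RCM interaction force into a small commanded displacement of the registered RCM point; gains, time step, and interfaces below are assumptions, not the implemented controller.

```python
import numpy as np

class RCMAdmittance:
    """Minimal mass-damper admittance: the estimated RCM force is turned into a
    commanded offset of the registered RCM, relieving the conflict with the
    moved port of entry (illustrative gains)."""
    def __init__(self, mass=1.0, damping=20.0, dt=0.002):
        self.m, self.d, self.dt = mass, damping, dt
        self.v = np.zeros(3)                      # virtual RCM velocity

    def update(self, f_rcm):
        # M * v_dot + D * v = f_rcm, discretised with explicit Euler
        v_dot = (f_rcm - self.d * self.v) / self.m
        self.v += v_dot * self.dt
        return self.v * self.dt                   # RCM position offset this cycle
```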
Abstract: In an effort to lower the barrier to entry in underwater manipulation, this paper presents an open-source, user-friendly framework for bimanual teleoperation of a light-duty underwater vehicle-manipulator system (UVMS). This framework allows for the control of the vehicle along with two manipulators and their end-effectors using two low-cost haptic devices. The UVMS kinematics are derived in order to create an independent resolved motion rate controller for each manipulator, which optimally controls the joint positions to achieve a desired end-effector pose. This desired pose is computed in real time using a teleoperation controller developed to process the dual haptic device input from the user. A physics-based simulation environment is used to implement this framework for two example tasks as well as to provide data for error analysis of user commands. The first task illustrates the functionality of the framework through motion control of the vehicle and manipulators using only the haptic devices. The second task is to grasp an object using both manipulators simultaneously, demonstrating the precision and coordination achievable with the framework. The framework code is available at https://github.com/stevens-armlab/uvms_bimanual_sim.
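A minimal sketch of one resolved motion rate step, assuming a damped least-squares inverse of the manipulator Jacobian (the repository may use a different formulation): it would be called independently for each manipulator with the pose error produced by the teleoperation controller.

```python
import numpy as np

def resolved_rate_step(J, pose_error, gain=1.0, damping=1e-2):
    """Map a 6D end-effector pose error to joint velocities via a damped
    least-squares inverse of the Jacobian J (one control cycle)."""
    JT = J.T
    qdot = JT @ np.linalg.solve(J @ JT + damping * np.eye(J.shape[0]),
                                gain * pose_error)
    return qdot
```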
Abstract: Recently, there has been growing interest in rescue robots due to their vital role in addressing emergency scenarios and providing crucial support in challenging or hazardous situations where human intervention is difficult. However, very few of these robots are capable of actively engaging with humans and undertaking physical manipulation tasks. This limitation is largely attributed to the absence of tools that can realistically simulate physical interactions, especially the contact mechanisms between a robotic gripper and a human body. In this letter, we aim to address key limitations in current developments towards robotic casualty manipulation. Firstly, we present an integrative simulation framework for casualty manipulation: we adapt a finite element method (FEM) tool to the grasping and manipulation scenario, and the developed framework provides accurate biomechanical reactions resulting from manipulation. Secondly, we conduct a detailed assessment of grasping stability during casualty grasping and manipulation simulations. To validate the necessity and superior performance of the proposed high-fidelity simulation framework, we conducted a qualitative and quantitative comparison of grasping stability analyses between the proposed framework and state-of-the-art multi-body physics simulations. Through these efforts, we have taken the first step towards a feasible solution for robotic casualty manipulation.
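As one concrete way to quantify grasping stability from simulated contact data, the sketch below computes the smallest singular value of a frictionless grasp matrix; this is a generic grasp-quality proxy under assumed point-contact conditions, not the metric used in the paper.

```python
import numpy as np

def grasp_matrix(contact_points, contact_normals, obj_center):
    """Stack each contact wrench (unit inward normal plus its torque about the
    object centre) as a column of G (frictionless point-contact model)."""
    cols = []
    for p, n in zip(contact_points, contact_normals):
        n = np.asarray(n) / np.linalg.norm(n)
        r = np.asarray(p) - np.asarray(obj_center)
        cols.append(np.concatenate([n, np.cross(r, n)]))
    return np.stack(cols, axis=1)                 # shape (6, n_contacts)

def stability_score(contact_points, contact_normals, obj_center):
    """Smallest singular value of G: a simple proxy for how well the grasp
    resists disturbances (0 indicates a degenerate grasp)."""
    G = grasp_matrix(contact_points, contact_normals, obj_center)
    return np.linalg.svd(G, compute_uv=False).min()
```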
Abstract: Evolutionary Game Theory (EGT) and Artificial Intelligence (AI) are two fields that, at first glance, might seem distinct, but they have notable connections and intersections. The former focuses on the evolution of behaviors (or strategies) in a population, where individuals interact with others and update their strategies through imitation (or social learning); the more successful a strategy is, the more prevalent it becomes over time. The latter, meanwhile, is centered on machine learning algorithms and (deep) neural networks. It often takes a single-agent perspective but increasingly involves multi-agent environments, in which intelligent agents adjust their strategies based on feedback and experience, somewhat akin to the evolutionary process yet distinct in their self-learning capacities. In light of the key components necessary to address real-world problems, including (i) learning and adaptation, (ii) cooperation and competition, (iii) robustness and stability, and altogether (iv) the population dynamics of individual agents whose strategies evolve, the cross-fertilization of ideas between the two fields will contribute to the advancement of the mathematics of multi-agent learning systems, in particular to the nascent domain of ``collective cooperative intelligence'' bridging evolutionary dynamics and multi-agent reinforcement learning.
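The imitation-based update can be made concrete with a small sketch of pairwise social learning under the Fermi rule; the payoff matrix, population size, and selection intensity below are illustrative, and the example game is a Prisoner's Dilemma.

```python
import numpy as np

def imitation_step(strategies, A, rng, beta=1.0):
    """One round of social learning: a random focal agent compares its average
    payoff with a random model agent and copies the model's strategy with a
    Fermi-rule probability, so more successful strategies spread."""
    n = len(strategies)
    counts = np.bincount(strategies, minlength=A.shape[0])

    def avg_payoff(i):
        s = strategies[i]
        return (A[s] @ counts - A[s, s]) / (n - 1)   # exclude self-interaction

    focal, model = rng.choice(n, size=2, replace=False)
    p_copy = 1.0 / (1.0 + np.exp(-beta * (avg_payoff(model) - avg_payoff(focal))))
    if rng.random() < p_copy:
        strategies[focal] = strategies[model]
    return strategies

# Example: Prisoner's Dilemma payoffs (rows/cols: 0 = cooperate, 1 = defect)
A = np.array([[3.0, 0.0],
              [5.0, 1.0]])
rng = np.random.default_rng(0)
pop = rng.integers(0, 2, size=100)
for _ in range(10_000):
    pop = imitation_step(pop, A, rng)
```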
Abstract: Over the past years, we have been dedicated to automating the user acceptance testing (UAT) process of WeChat Pay, one of the most influential mobile payment applications in China. A system titled XUAT has been developed for this purpose. However, there remains a human-labor-intensive stage in the current system, namely test script generation. Therefore, in this paper, we concentrate on methods of raising the automation level of the current system, particularly in the test script generation stage. With their recent notable successes, large language models (LLMs) demonstrate significant potential for attaining human-like intelligence, and a growing body of research employs LLMs as autonomous agents to obtain human-like decision-making capabilities. Inspired by these works, we propose an LLM-powered multi-agent collaborative system, named XUAT-Copilot, for automated UAT. The proposed system mainly consists of three LLM-based agents responsible for action planning, state checking, and parameter selection, respectively, and two additional modules for state sensing and case rewriting. The agents interact with the testing device, make human-like decisions, and generate action commands in a collaborative way. The proposed multi-agent system achieves effectiveness close to that of human testers in our experimental studies and gains a significant improvement in Pass@1 accuracy compared with a single-agent architecture. More importantly, the proposed system has been launched in the formal testing environment of the WeChat Pay mobile app, where it saves a considerable amount of manpower in daily development work.
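A hypothetical sketch of such an agent collaboration loop; call_llm, sense_state, and execute are placeholder names introduced here for illustration, not the actual XUAT-Copilot interfaces or prompts.

```python
def call_llm(prompt: str) -> str:
    """Stand-in for an LLM API call (assumption; replace with a real client)."""
    raise NotImplementedError

def run_test_case(case_description: str, device, max_steps: int = 20) -> bool:
    """Planner, parameter selector, and state checker cooperate on one case."""
    for _ in range(max_steps):
        state = device.sense_state()                                  # state-sensing module
        plan = call_llm(f"Plan next action for: {case_description}\nState: {state}")
        params = call_llm(f"Select parameters for action: {plan}\nState: {state}")
        device.execute(plan, params)                                  # issue action command
        check = call_llm(f"Has '{case_description}' passed?\nState: {device.sense_state()}")
        if "PASS" in check:
            return True
    return False
```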
Abstract: The Compact Muon Solenoid (CMS) experiment is a general-purpose detector for high-energy collisions at the Large Hadron Collider (LHC) at CERN. It employs an online data quality monitoring (DQM) system to promptly spot and diagnose particle data acquisition problems and avoid data quality loss. In this study, we present semi-supervised spatio-temporal anomaly detection (AD) monitoring for the physics particle readout channels of the Hadron Calorimeter (HCAL) of the CMS, using three-dimensional digi-occupancy map data from the DQM. We propose the GraphSTAD system, which employs convolutional and graph neural networks to learn, respectively, the local spatial characteristics induced by particles traversing the detector and the global behavior owing to shared backend circuit connections and housing boxes of the channels. Recurrent neural networks capture the temporal evolution of the extracted spatial features. We have validated the accuracy of the proposed AD system in capturing diverse channel fault types using the LHC Run-2 collision data sets. The GraphSTAD system has achieved production-level accuracy and is being integrated into the CMS core production system for real-time monitoring of the HCAL. We have also provided a quantitative performance comparison with alternative benchmark models to demonstrate the advantages of the presented system.
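A skeletal PyTorch sketch of this kind of architecture (dimensions, pooling, and the single message-passing step are assumptions, not the production GraphSTAD model): convolutions summarize local spatial patterns per node, a normalized-adjacency step mixes information across connected nodes, and an LSTM tracks the temporal evolution; the reconstruction error of the output would serve as the anomaly score in a semi-supervised setting.

```python
import torch
import torch.nn as nn

class GraphSTADSketch(nn.Module):
    """Illustrative CNN + graph message passing + RNN skeleton."""
    def __init__(self, n_nodes, node_feat=8, hidden=32):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(1, 4, 3, padding=1), nn.ReLU(),
                                  nn.AdaptiveAvgPool2d(1))    # per-node spatial summary
        self.graph_lin = nn.Linear(4, node_feat)
        self.rnn = nn.LSTM(n_nodes * node_feat, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_nodes * node_feat)    # reconstruction head

    def forward(self, x, adj_norm):
        # x: (batch, time, n_nodes, H, W) occupancy patches; adj_norm: (n_nodes, n_nodes)
        b, t, n, h, w = x.shape
        feats = self.conv(x.reshape(b * t * n, 1, h, w)).reshape(b, t, n, 4)
        feats = torch.relu(adj_norm @ self.graph_lin(feats))   # one message-passing step
        out, _ = self.rnn(feats.reshape(b, t, -1))              # temporal evolution
        return self.head(out).reshape(b, t, n, -1)              # reconstructed node features
```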