Abstract:Neural implicit representations have recently shown promising progress in dense Simultaneous Localization And Mapping (SLAM). However, existing works have shortcomings in terms of reconstruction quality and real-time performance, mainly due to inflexible scene representation strategy without leveraging any prior information. In this paper, we introduce SP-SLAM, a novel neural RGB-D SLAM system that performs tracking and mapping in real-time. SP-SLAM computes depth images and establishes sparse voxel-encoded scene priors near the surfaces to achieve rapid convergence of the model. Subsequently, the encoding voxels computed from single-frame depth image are fused into a global volume, which facilitates high-fidelity surface reconstruction. Simultaneously, we employ tri-planes to store scene appearance information, striking a balance between achieving high-quality geometric texture mapping and minimizing memory consumption. Furthermore, in SP-SLAM, we introduce an effective optimization strategy for mapping, allowing the system to continuously optimize the poses of all historical input frames during runtime without increasing computational overhead. We conduct extensive evaluations on five benchmark datasets (Replica, ScanNet, TUM RGB-D, Synthetic RGB-D, 7-Scenes). The results demonstrate that, compared to existing methods, we achieve superior tracking accuracy and reconstruction quality, while running at a significantly faster speed.
Abstract:Face anonymization aims to conceal the visual identity of a face to safeguard the individual's privacy. Traditional methods like blurring and pixelation can largely remove identifying features, but these techniques significantly degrade image quality and are vulnerable to deep reconstruction attacks. Generative models have emerged as a promising solution for anonymizing faces while preserving a natural appearance.However, many still face limitations in visual quality and often overlook the potential to recover the original face from the anonymized version, which can be valuable in specific contexts such as image forensics. This paper proposes a novel framework named iFADIT, an acronym for Invertible Face Anonymization via Disentangled Identity Transform.The framework features a disentanglement architecture coupled with a secure flow-based model: the former decouples identity information from non-identifying attributes, while the latter transforms the decoupled identity into an anonymized version in an invertible manner controlled by a secret key. The anonymized face can then be reconstructed based on a pre-trained StyleGAN that ensures high image quality and realistic facial details. Recovery of the original face (aka de-anonymization) is possible upon the availability of the matching secret, by inverting the anonymization process based on the same set of model parameters. Furthermore, a dedicated secret-key mechanism along with a dual-phase training strategy is devised to ensure the desired properties of face anonymization. Qualitative and quantitative experiments demonstrate the superiority of the proposed approach in anonymity, reversibility, security, diversity, and interpretability over competing methods.
Abstract:This article propose a whole-body impedance coordinative control framework for a wheel-legged humanoid robot to achieve adaptability on complex terrains while maintaining robot upper body stability. The framework contains a bi-level control strategy. The outer level is a variable damping impedance controller, which optimizes the damping parameters to ensure the stability of the upper body while holding an object. The inner level employs Whole-Body Control (WBC) optimization that integrates real-time terrain estimation based on wheel-foot position and force data. It generates motor torques while accounting for dynamic constraints, joint limits,friction cones, real-time terrain updates, and a model-free friction compensation strategy. The proposed whole-body coordinative control method has been tested on a recently developed quadruped humanoid robot. The results demonstrate that the proposed algorithm effectively controls the robot, maintaining upper body stability to successfully complete a water-carrying task while adapting to varying terrains.
Abstract:This article explores human-like movement from a fresh perspective on motion planning. We analyze the coordinated and compliant movement mechanisms of the human body from the perspective of biomechanics. Based on these mechanisms, we propose an optimal control framework that integrates compliant control dynamics, optimizing robotic arm motion through a response time matrix. This matrix sets the timing parameters for joint movements, turning the system into a time-parameterized optimal control problem. The model focuses on the interaction between active and passive joints under external disturbances, improving adaptability and compliance. This method achieves optimal trajectory generation and balances precision and compliance. Experimental results on both a manipulator and a humanoid robot validate the approach.
Abstract:This paper proposes a fair control framework for multi-robot systems, which integrates the newly introduced Alternative Authority Control (AAC) and Flexible Control Barrier Function (F-CBF). Control authority refers to a single robot which can plan its trajectory while considering others as moving obstacles, meaning the other robots do not have authority to plan their own paths. The AAC method dynamically distributes the control authority, enabling fair and coordinated movement across the system. This approach significantly improves computational efficiency, scalability, and robustness in complex environments. The proposed F-CBF extends traditional CBFs by incorporating obstacle shape, velocity, and orientation. F-CBF enhances safety by accurate dynamic obstacle avoidance. The framework is validated through simulations in multi-robot scenarios, demonstrating its safety, robustness and computational efficiency.
Abstract:Summarizing knowledge from animals and human beings inspires robotic innovations. In this work, we propose a framework for driving legged robots act like real animals with lifelike agility and strategy in complex environments. Inspired by large pre-trained models witnessed with impressive performance in language and image understanding, we introduce the power of advanced deep generative models to produce motor control signals stimulating legged robots to act like real animals. Unlike conventional controllers and end-to-end RL methods that are task-specific, we propose to pre-train generative models over animal motion datasets to preserve expressive knowledge of animal behavior. The pre-trained model holds sufficient primitive-level knowledge yet is environment-agnostic. It is then reused for a successive stage of learning to align with the environments by traversing a number of challenging obstacles that are rarely considered in previous approaches, including creeping through narrow spaces, jumping over hurdles, freerunning over scattered blocks, etc. Finally, a task-specific controller is trained to solve complex downstream tasks by reusing the knowledge from previous stages. Enriching the knowledge regarding each stage does not affect the usage of other levels of knowledge. This flexible framework offers the possibility of continual knowledge accumulation at different levels. We successfully apply the trained multi-level controllers to the MAX robot, a quadrupedal robot developed in-house, to mimic animals, traverse complex obstacles, and play in a designed challenging multi-agent Chase Tag Game, where lifelike agility and strategy emerge on the robots. The present research pushes the frontier of robot control with new insights on reusing multi-level pre-trained knowledge and solving highly complex downstream tasks in the real world.
Abstract:Wireless sensor networks (WSNs) are self-organizing monitoring networks with a large number of randomly deployed microsensor nodes to collect various physical information to realize tasks such as intelligent perception, efficient control, and decision-making. However, WSN nodes are powered by batteries, so they will run out of energy after a certain time. This energy limitation will greatly constrain the network performance like network lifetime and energy efficiency. In this study, to prolong the network lifetime, we proposed a multi-hop routing protocol based on game theory and coverage optimization (MRP-GTCO). Briefly, in the stage of setup, two innovational strategies including a clustering game with penalty function and cluster head coverage set were designed to realize the uniformity of cluster head distribution and improve the rationality of cluster head election. In the data transmission stage, we first derived the applicable conditions theorem of inter-cluster multi-hop routing. Based on this, a novel multi-hop path selection algorithm related to residual energy and node degree was proposed to provide an energy-efficient data transmission path. The simulation results showed that the MRP-GTCO protocol can effectively reduce the network energy consumption and extend the network lifetime by 159.22%, 50.76%, and 16.46% compared with LGCA, RLEACH, and ECAGT protocols.
Abstract:Wireless sensor networks (WSNs), one of the fundamental technologies of the Internet of Things (IoT), can provide sensing and communication services efficiently for IoT-based applications, especially energy-limited applications. Clustering routing protocol plays an important role in reducing energy consumption and prolonging network lifetime. The cluster formation and cluster head selection are the key to improving the performance of the clustering routing protocol. An energy-efficient routing protocol based on multi-threshold segmentation (EERPMS) was proposed in this paper to improve the rationality of cluster formation and cluster head selection. In the stage of cluster formation, inspired by multi-threshold image segmentation, an innovative node clustering algorithm was developed. In the stage of cluster head selection, aiming at minimizing the network energy consumption, a calculation theory of the optimal number and location of cluster heads was established. Furthermore, a novel cluster head selection algorithm was constructed based on the residual energy and optimal location of cluster heads. Simulation results show that EERPMS can improve the distribution uniformity of cluster heads, prolong the network lifetime and save up to 64.50%, 58.60%, and 56.15% network energy as compared to RLEACH, CRPFCM, and FIGWO protocols respectively.
Abstract:The flying ad hoc network (FANET) will play a crucial role in the B5G/6G era since it provides wide coverage and on-demand deployment services in a distributed manner. The detection of Sybil attacks is essential to ensure trusted communication in FANET. Nevertheless, the conventional methods only utilize the untrusted information that UAV nodes passively ``heard'' from the ``auditory" domain (AD), resulting in severe communication disruptions and even collision accidents. In this paper, we present a novel VA-matching solution that matches the neighbors observed from both the AD and the ``visual'' domain (VD), which is the first solution that enables UAVs to accurately correlate what they ``see'' from VD and ``hear'' from AD to detect the Sybil attacks. Relative entropy is utilized to describe the similarity of observed characteristics from dual domains. The dynamic weight algorithm is proposed to distinguish neighbors according to the characteristics' popularity. The matching model of neighbors observed from AD and VD is established and solved by the vampire bat optimizer. Experiment results show that the proposed VA-matching solution removes the unreliability of individual characteristics and single domains. It significantly outperforms the conventional RSSI-based method in detecting Sybil attacks. Furthermore, it has strong robustness and achieves high precision and recall rates.
Abstract:Automatic segmentation of curvilinear objects in medical images plays an important role in the diagnosis and evaluation of human diseases, yet it is a challenging uncertainty for the complex segmentation task due to different issues like various image appearance, low contrast between curvilinear objects and their surrounding backgrounds, thin and uneven curvilinear structures, and improper background illumination. To overcome these challenges, we present a unique curvilinear structure segmentation framework based on oriented derivative of stick (ODoS) filter and deep learning network for curvilinear object segmentation in medical images. Currently, a large number of deep learning models emphasis on developing deep architectures and ignore capturing the structural features of curvature objects, which may lead to unsatisfactory results. In consequence, a new approach that incorporates the ODoS filter as part of a deep learning network is presented to improve the spatial attention of curvilinear objects. In which, the original image is considered as principal part to describe various image appearance and complex background illumination, the multi-step strategy is used to enhance contrast between curvilinear objects and their surrounding backgrounds, and the vector field is applied to discriminate thin and uneven curvilinear structures. Subsequently, a deep learning framework is employed to extract varvious structural features for curvilinear object segmentation in medical images. The performance of the computational model was validated in experiments with publicly available DRIVE, STARE and CHASEDB1 datasets. Experimental results indicate that the presented model has yielded surprising results compared with some state-of-the-art methods.