Abstract:From prehistoric encirclement for hunting to GPS orbiting the earth for positioning, target encirclement has numerous real world applications. However, encircling multiple non-cooperative targets in GPS-denied environments remains challenging. In this work, multiple targets encirclement by using a minimum of two tasking agents, is considered where the relative distance measurements between the agents and the targets can be obtained by using onboard sensors. Based on the measurements, the center of all the targets is estimated directly by a fuzzy wavelet neural network (FWNN) and the least squares fit method. Then, a new distributed anti-synchronization controller (DASC) is designed so that the two tasking agents are able to encircle all targets while staying opposite to each other. In particular, the radius of the desired encirclement trajectory can be dynamically determined to avoid potential collisions between the two agents and all targets. Based on the Lyapunov stability analysis method, the convergence proofs of the neural network prediction error, the target-center position estimation error, and the controller error are addressed respectively. Finally, both numerical simulations and UAV flight experiments are conducted to demonstrate the validity of the encirclement algorithms. The flight tests recorded video and other simulation results can be found in https://youtu.be/B8uTorBNrl4.
Abstract:Recent advancements in 3D reconstruction and neural rendering have enhanced the creation of high-quality digital assets, yet existing methods struggle to generalize across varying object shapes, textures, and occlusions. While Next Best View (NBV) planning and Learning-based approaches offer solutions, they are often limited by predefined criteria and fail to manage occlusions with human-like common sense. To address these problems, we present AIR-Embodied, a novel framework that integrates embodied AI agents with large-scale pretrained multi-modal language models to improve active 3DGS reconstruction. AIR-Embodied utilizes a three-stage process: understanding the current reconstruction state via multi-modal prompts, planning tasks with viewpoint selection and interactive actions, and employing closed-loop reasoning to ensure accurate execution. The agent dynamically refines its actions based on discrepancies between the planned and actual outcomes. Experimental evaluations across virtual and real-world environments demonstrate that AIR-Embodied significantly enhances reconstruction efficiency and quality, providing a robust solution to challenges in active 3D reconstruction.
Abstract:This paper proposes a comprehensive strategy for complex multi-target-multi-drone encirclement in an obstacle-rich and GPS-denied environment, motivated by practical scenarios such as pursuing vehicles or humans in urban canyons. The drones have omnidirectional range sensors that can robustly detect ground targets and obtain noisy relative distances. After each drone task is assigned, a novel distance-based target state estimator (DTSE) is proposed by estimating the measurement output noise variance and utilizing the Kalman filter. By integrating anti-synchronization techniques and pseudo-force functions, an acceleration controller enables two tasking drones to cooperatively encircle a target from opposing positions while navigating obstacles. The algorithms effectiveness for the discrete-time double-integrator system is established theoretically, particularly regarding observability. Moreover, the versatility of the algorithm is showcased in aerial-to-ground scenarios, supported by compelling simulation results. Experimental validation demonstrates the effectiveness of the proposed approach.
Abstract:In response to the evolving challenges posed by small unmanned aerial vehicles (UAVs), which possess the potential to transport harmful payloads or independently cause damage, we introduce MMAUD: a comprehensive Multi-Modal Anti-UAV Dataset. MMAUD addresses a critical gap in contemporary threat detection methodologies by focusing on drone detection, UAV-type classification, and trajectory estimation. MMAUD stands out by combining diverse sensory inputs, including stereo vision, various Lidars, Radars, and audio arrays. It offers a unique overhead aerial detection vital for addressing real-world scenarios with higher fidelity than datasets captured on specific vantage points using thermal and RGB. Additionally, MMAUD provides accurate Leica-generated ground truth data, enhancing credibility and enabling confident refinement of algorithms and models, which has never been seen in other datasets. Most existing works do not disclose their datasets, making MMAUD an invaluable resource for developing accurate and efficient solutions. Our proposed modalities are cost-effective and highly adaptable, allowing users to experiment and implement new UAV threat detection tools. Our dataset closely simulates real-world scenarios by incorporating ambient heavy machinery sounds. This approach enhances the dataset's applicability, capturing the exact challenges faced during proximate vehicular operations. It is expected that MMAUD can play a pivotal role in advancing UAV threat detection, classification, trajectory estimation capabilities, and beyond. Our dataset, codes, and designs will be available in https://github.com/ntu-aris/MMAUD.