Abstract:Although the manipulating of the unmanned aerial manipulator (UAM) has been widely studied, vision-based UAM approaching, which is crucial to the subsequent manipulating, generally lacks effective design. The key to the visual UAM approaching lies in object tracking, while current UAM tracking typically relies on costly model-based methods. Besides, UAM approaching often confronts more severe object scale variation issues, which makes it inappropriate to directly employ state-of-the-art model-free Siamese-based methods from the object tracking field. To address the above problems, this work proposes a novel Siamese network with pairwise scale-channel attention (SiamSA) for vision-based UAM approaching. Specifically, SiamSA consists of a pairwise scale-channel attention network (PSAN) and a scale-aware anchor proposal network (SA-APN). PSAN acquires valuable scale information for feature processing, while SA-APN mainly attaches scale awareness to anchor proposing. Moreover, a new tracking benchmark for UAM approaching, namely UAMT100, is recorded with 35K frames on a flying UAM platform for evaluation. Exhaustive experiments on the benchmarks and real-world tests validate the efficiency and practicality of SiamSA with a promising speed. Both the code and UAMT100 benchmark are now available at https://github.com/vision4robotics/SiamSA.
Abstract:Aerial tracking, which has exhibited its omnipresent dedication and splendid performance, is one of the most active applications in the remote sensing field. Especially, unmanned aerial vehicle (UAV)-based remote sensing system, equipped with a visual tracking approach, has been widely used in aviation, navigation, agriculture, transportation, and public security, etc. As is mentioned above, the UAV-based aerial tracking platform has been gradually developed from research to practical application stage, reaching one of the main aerial remote sensing technologies in the future. However, due to real-world challenging situations, the vibration of the UAV's mechanical structure (especially under strong wind conditions), and limited computation resources, accuracy, robustness, and high efficiency are all crucial for the onboard tracking methods. Recently, the discriminative correlation filter (DCF)-based trackers have stood out for their high computational efficiency and appealing robustness on a single CPU, and have flourished in the UAV visual tracking community. In this work, the basic framework of the DCF-based trackers is firstly generalized, based on which, 20 state-of-the-art DCF-based trackers are orderly summarized according to their innovations for soloving various issues. Besides, exhaustive and quantitative experiments have been extended on various prevailing UAV tracking benchmarks, i.e., UAV123, UAV123_10fps, UAV20L, UAVDT, DTB70, and VisDrone2019-SOT, which contain 371,625 frames in total. The experiments show the performance, verify the feasibility, and demonstrate the current challenges of DCF-based trackers onboard UAV tracking. Finally, comprehensive conclusions on open challenges and directions for future research is presented.
Abstract:Most existing trackers based on discriminative correlation filters (DCF) try to introduce predefined regularization term to improve the learning of target objects, e.g., by suppressing background learning or by restricting change rate of correlation filters. However, predefined parameters introduce much effort in tuning them and they still fail to adapt to new situations that the designer did not think of. In this work, a novel approach is proposed to online automatically and adaptively learn spatio-temporal regularization term. Spatially local response map variation is introduced as spatial regularization to make DCF focus on the learning of trust-worthy parts of the object, and global response map variation determines the updating rate of the filter. Extensive experiments on four UAV benchmarks have proven the superiority of our method compared to the state-of-the-art CPU- and GPU-based trackers, with a speed of ~60 frames per second running on a single CPU. Our tracker is additionally proposed to be applied in UAV localization. Considerable tests in the indoor practical scenarios have proven the effectiveness and versatility of our localization method. The code is available at https://github.com/vision4robotics/AutoTrack.