Abstract: Deep learning has shown promising results on multiple 3D point cloud registration datasets. However, in the underwater domain, most registration of multibeam echo-sounder (MBES) point cloud data is still performed using classical methods in the iterative closest point (ICP) family. In this work, we curate and release the DotsonEast Dataset, a semi-synthetic MBES registration dataset constructed from data collected by an autonomous underwater vehicle (AUV) in West Antarctica. Using this dataset, we systematically benchmark the performance of two classical and four learning-based methods. The experimental results show that the learning-based methods work well for coarse alignment and are better at consistently recovering rough transforms at high overlap (20-50%). In comparison, GICP (a variant of ICP) performs well for fine alignment and is better across all metrics at extremely low overlap (10%). To the best of our knowledge, this is the first work to benchmark both learning-based and classical registration methods on an AUV-based MBES dataset. To facilitate future research, both the code and data are made available online.
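Methods in the ICP family alternate between nearest-neighbour matching and a closed-form rigid fit. The following is a minimal 2D point-to-point sketch in pure Python, for illustration only. It is not GICP and not the benchmark code released with the paper, and the function names (`best_rigid_2d`, `transform_2d`, `icp_2d`) are hypothetical.

```python
import math


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src points onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    s_dot = 0.0
    s_cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy  # centred source point
        bx, by = qx - dx, qy - dy  # centred target point
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated source centroid onto the target centroid.
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return theta, tx, ty


def transform_2d(points, theta, tx, ty):
    """Rotate points by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(src, dst, iterations=20):
    """Plain point-to-point ICP with brute-force nearest-neighbour matching."""
    current = list(src)
    for _ in range(iterations):
        matched = [
            min(dst, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
            for p in current
        ]
        theta, tx, ty = best_rigid_2d(current, matched)
        current = transform_2d(current, theta, tx, ty)
    return current
```

Because the matching step only finds the nearest point, a sketch like this relies on a reasonable initial alignment, which is consistent with the abstract's distinction between coarse and fine alignment.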
Abstract: Side-scan sonar (SSS) is a lightweight acoustic sensor that is commonly deployed on autonomous underwater vehicles (AUVs) to provide high-resolution seafloor images. However, leveraging side-scan images for simultaneous localization and mapping (SLAM) presents a notable challenge, primarily due to the difficulty of establishing a sufficient number of accurate correspondences between these images. To address this, we introduce a novel subframe-based dense SLAM framework utilizing side-scan sonar data, enabling effective dense matching in overlapping regions of paired side-scan images. With each image evenly divided into subframes, we propose a robust pipeline that estimates the relative pose between each pair of subframes using a good inlier set identified from dense correspondences. These relative poses are then integrated as edge constraints in a factor graph to optimize the AUV pose trajectory. The proposed framework is evaluated on three real datasets collected by a Hugin AUV. One of them includes manually annotated keypoint correspondences as ground truth and is used to evaluate the pose trajectory. We also present a feasible way of evaluating mapping quality against multi-beam echosounder (MBES) data without the influence of pose. Experimental results demonstrate that our approach effectively mitigates drift from the dead-reckoning (DR) system and enables quasi-dense bathymetry reconstruction. An open-source implementation of this work is available.
Abstract: Acoustic sensors play an important role in autonomous underwater vehicles (AUVs). Sidescan sonar (SSS) covers a wide range and provides photo-realistic images in high resolution. However, SSS projects the 3D seafloor onto 2D images, which are distorted by the AUV's altitude, the target's range, and the sensor's resolution. As a result, the same physical area can show significant visual differences in SSS images from different survey lines, causing difficulties in tasks such as pixel correspondence and template matching. In this paper, a canonical transformation method consisting of intensity correction and slant range correction is proposed to reduce these distortions. The intensity correction includes beam pattern correction and incident angle correction using three different Lambertian laws (cos, cos², cot), whereas the slant range correction removes the nadir zone and projects the position of SSS elements into equally horizontally spaced, viewpoint-independent bins. The proposed method is evaluated on real data collected by a HUGIN AUV, with manually annotated pixel correspondences as the ground truth reference. Experiments on patch pairs compare similarity measures and keypoint descriptor matching. The results show that the canonical transformation can improve patch similarity, as well as SIFT descriptor matching accuracy, across different images in which the same physical area was ensonified.
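The two corrections named in the abstract can be illustrated for a single ping. The following is a minimal sketch under a flat-seafloor assumption, where the incidence angle satisfies cos θ = altitude / slant range. It omits beam pattern correction, and the function names, parameters, and binning scheme are hypothetical rather than taken from the paper.

```python
import math


def ground_range(slant_range, altitude):
    """Horizontal distance to a sample, assuming a flat seafloor."""
    if slant_range < altitude:
        raise ValueError("sample lies in the nadir zone (water column)")
    return math.sqrt(slant_range ** 2 - altitude ** 2)


def lambert_factor(slant_range, altitude, law="cos"):
    """Incidence-angle term for the chosen law (cos, cos2, or cot)."""
    cos_t = altitude / slant_range
    if law == "cos":
        return cos_t
    if law == "cos2":
        return cos_t ** 2
    if law == "cot":
        # cot(theta) = altitude / ground range; undefined directly below the vehicle.
        return altitude / ground_range(slant_range, altitude)
    raise ValueError("unknown law: " + law)


def canonical_ping(intensities, slant_res, altitude, bin_size, n_bins, law="cos"):
    """Drop the nadir zone, undo the angle term, and average into ground-range bins."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i, value in enumerate(intensities):
        r = (i + 0.5) * slant_res  # slant range at the centre of sample i
        if r <= altitude:
            continue  # nadir zone: no seafloor return yet
        b = int(ground_range(r, altitude) / bin_size)
        if b >= n_bins:
            continue
        sums[b] += value / lambert_factor(r, altitude, law)
        counts[b] += 1
    # Bins that received no sample are returned as None.
    return [s / c if c else None for s, c in zip(sums, counts)]
```

Applying the same binning to pings from different survey lines puts them on a common horizontal grid, which is the property the abstract describes as viewpoint independence.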
Abstract: Side-scan sonar (SSS) is a lightweight acoustic sensor that is frequently deployed on autonomous underwater vehicles (AUVs) to provide high-resolution seafloor images. However, using side-scan images to perform simultaneous localization and mapping (SLAM) remains a challenge due to the lack of 3D bathymetric information and of discriminative features in the side-scan images. To tackle this, we propose a feature-based SLAM framework using side-scan sonar, which is able to automatically detect and robustly match keypoints between paired side-scan images. We then use the detected correspondences as constraints to optimize the AUV pose trajectory. The proposed method is evaluated on real data collected by a Hugin AUV, using both manually annotated keypoints and a 3D bathymetry mesh from a multibeam echosounder (MBES) as ground truth references. Experimental results demonstrate that our approach is able to reduce drift compared to the dead-reckoning system. The framework is made publicly available for the benefit of the community.
Abstract: Context-aware decision support in the operating room can foster surgical safety and efficiency by leveraging real-time feedback from surgical workflow analysis. Most existing works recognize surgical activities at a coarse-grained level, such as phases, steps or events, leaving out fine-grained interaction details about the surgical activity; yet those are needed for more helpful AI assistance in the operating room. Recognizing surgical actions as triplets of <instrument, verb, target> combinations delivers comprehensive details about the activities taking place in surgical videos. This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos. The challenge granted private access to the large-scale CholecT50 dataset, which is annotated with action triplet information. In this paper, we present the challenge setup and an assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge. A total of 4 baseline methods from the challenge organizers and 19 new deep learning algorithms by competing teams are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%. This study also analyzes the significance of the results obtained by the presented approaches, performs a thorough methodological comparison and an in-depth result analysis, and proposes a novel ensemble method for enhanced recognition. Our analysis shows that surgical workflow analysis is not yet solved, and also highlights interesting directions for future research on fine-grained surgical activity recognition, which is of utmost importance for the development of AI in surgery.
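The mAP figures quoted in the abstract average a per-class average precision over the triplet classes. The following is a minimal sketch of one common, non-interpolated definition. The challenge's exact evaluation protocol may differ, and the function names and input layout (one row of scores and one row of binary labels per frame) are hypothetical.

```python
def average_precision(scores, labels):
    """AP for one class: mean precision at the rank of each positive sample."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions = []
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            precisions.append(hits / rank)
    # A class with no positive sample has no defined AP.
    return sum(precisions) / len(precisions) if precisions else None


def mean_average_precision(score_rows, label_rows):
    """Mean of per-class AP, skipping classes that never occur in the labels."""
    n_classes = len(score_rows[0])
    aps = []
    for c in range(n_classes):
        ap = average_precision(
            [row[c] for row in score_rows],
            [row[c] for row in label_rows],
        )
        if ap is not None:
            aps.append(ap)
    return sum(aps) / len(aps) if aps else None
```

With many rare triplet classes, a few poorly ranked classes can pull the mean down sharply, which is one reason mAP values for fine-grained recognition tend to be low.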
Abstract: Banking Trojans and botnets are primary drivers of financially motivated cybercrime. In this paper, we first analyze how an APT-based banking botnet works, step by step, through its whole lifecycle. We then present a multi-stage system that detects malicious banking botnet activities that potentially target organizations. The system leverages a Cyber Data Lake as well as multiple artificial intelligence techniques at different stages. Evaluation results on public datasets show that the deep-learning-based detections were highly successful compared with baseline models.