Hongbin Ma

Cross-PCR: A Robust Cross-Source Point Cloud Registration Framework

Dec 25, 2024

Guiyu Zhao, Zhentao Guo, Zewen Du, Hongbin Ma

Abstract: Because cross-source point clouds differ in density and distribution, previous methods fail at cross-source point cloud registration. We propose a density-robust feature extraction and matching scheme to achieve robust and accurate cross-source registration. To address the density inconsistency between cross-source data, we introduce a density-robust encoder for extracting density-robust features. To tackle the difficulty of feature matching and the scarcity of correct correspondences, we adopt a loose-to-strict matching pipeline built on a "loose generation, strict selection" idea. Under it, we employ a one-to-many strategy to loosely generate initial correspondences. Subsequently, high-quality correspondences are strictly selected through sparse matching and dense matching to achieve robust registration. On the challenging Kinect-LiDAR scene in the cross-source 3DCSR dataset, our method improves feature matching recall by 63.5 percentage points (pp) and registration recall by 57.6 pp. It also achieves the best performance on 3DMatch, while maintaining robustness under diverse downsampling densities.

* Accepted by AAAI 2025
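
The "loose generation, strict selection" idea in the Cross-PCR abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the use of top-k nearest neighbours in feature space for loose generation, and the mutual-nearest-neighbour check standing in for strict selection are all assumptions made for illustration (the paper selects correspondences through sparse and dense matching stages).

```python
import math


def pairwise_dist(src_feats, tgt_feats):
    """Euclidean distance between every source and target feature vector."""
    return [[math.dist(s, t) for t in tgt_feats] for s in src_feats]


def loose_candidates(dist, k):
    """Loose generation (assumed form): each source point keeps its k
    nearest target points, i.e. a one-to-many candidate set."""
    return {
        i: sorted(range(len(row)), key=row.__getitem__)[:k]
        for i, row in enumerate(dist)
    }


def strict_select(dist, candidates):
    """Strict selection (assumed stand-in): keep (i, j) only when source i
    is also the nearest source point for target j."""
    n_src = len(dist)
    kept = []
    for i, targets in candidates.items():
        for j in targets:
            nearest_src = min(range(n_src), key=lambda s: dist[s][j])
            if nearest_src == i:
                kept.append((i, j))
    return kept


if __name__ == "__main__":
    src = [(0.0, 0.0), (1.0, 0.0)]
    tgt = [(0.1, 0.0), (0.9, 0.0), (5.0, 5.0)]
    d = pairwise_dist(src, tgt)
    cands = loose_candidates(d, k=2)
    print(cands)
    print(strict_select(d, cands))
```

The loose stage deliberately over-generates so that a correct match is unlikely to be missed; the strict stage then discards the candidates that fail the consistency check.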

LDA-AQU: Adaptive Query-guided Upsampling via Local Deformable Attention

Nov 29, 2024

Zewen Du, Zhenjiang Hu, Guiyu Zhao, Ying Jin, Hongbin Ma

Abstract: Feature upsampling is an essential operation in constructing deep convolutional neural networks. However, existing upsamplers either lack specific feature guidance or require high-resolution feature maps, resulting in a loss of performance and flexibility. In this paper, we find that local self-attention naturally has the feature guidance capability, and its computational paradigm aligns closely with the essence of feature upsampling (i.e., feature reassembly of neighboring points). Therefore, we introduce local self-attention into the upsampling task and demonstrate that the majority of existing upsamplers can be regarded as special cases of upsamplers based on local self-attention. Considering the potential semantic gap between upsampled points and their neighboring points, we further introduce the deformation mechanism into the upsampler based on local self-attention, thereby proposing LDA-AQU. As a novel dynamic kernel-based upsampler, LDA-AQU utilizes the feature of queries to guide the model in adaptively adjusting the position and aggregation weight of neighboring points, thereby meeting the upsampling requirements across various complex scenarios. In addition, LDA-AQU is lightweight and can be easily integrated into various model architectures. We evaluate the effectiveness of LDA-AQU across four dense prediction tasks: object detection, instance segmentation, panoptic segmentation, and semantic segmentation. LDA-AQU consistently outperforms previous state-of-the-art upsamplers, achieving performance enhancements of 1.7 AP, 1.5 AP, 2.0 PQ, and 2.5 mIoU compared to the baseline models in the aforementioned four tasks, respectively. Code is available at https://github.com/duzw9311/LDA-AQU.

* Accepted by ACM MM2024
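
To make "feature reassembly of neighboring points" concrete, here is a minimal 1D analogue of upsampling with local self-attention. It is a sketch under stated assumptions, not the LDA-AQU code: the function name, the 1D setting, taking the query from the nearest low-resolution feature, and the scaled dot-product scores are all illustrative choices, and the deformation mechanism from the abstract is omitted entirely.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def local_attention_upsample_1d(feats, scale=2, radius=1):
    """Upsample a sequence of feature vectors by `scale`.

    Each output position takes its query from the nearest low-resolution
    feature (an assumption), scores the low-resolution neighbours within
    `radius`, and returns their attention-weighted average.
    """
    n = len(feats)
    dim = len(feats[0])
    out = []
    for i in range(n * scale):
        src = i // scale  # nearest low-resolution index
        query = feats[src]
        nbrs = [j for j in range(src - radius, src + radius + 1) if 0 <= j < n]
        scores = [
            sum(q * k for q, k in zip(query, feats[j])) / math.sqrt(dim)
            for j in nbrs
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * feats[j][c] for w, j in zip(weights, nbrs)) for c in range(dim)]
        )
    return out


if __name__ == "__main__":
    low_res = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for vec in local_attention_upsample_1d(low_res):
        print([round(v, 3) for v in vec])
```

If the attention weights were fixed instead of computed from the query, the same loop would reduce to a static-kernel upsampler, which is the sense in which such upsamplers can be viewed as special cases.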

Cross-Layer Feature Pyramid Transformer for Small Object Detection in Aerial Images

Jul 29, 2024

Zewen Du, Zhenjiang Hu, Guiyu Zhao, Ying Jin, Hongbin Ma

Abstract: Object detection in aerial images has always been a challenging task due to the generally small size of the objects. Most current detectors prioritize novel detection frameworks, often overlooking research on fundamental components such as feature pyramid networks. In this paper, we introduce the Cross-Layer Feature Pyramid Transformer (CFPT), a novel upsampler-free feature pyramid network designed specifically for small object detection in aerial images. CFPT incorporates two meticulously designed attention blocks with linear computational complexity: the Cross-Layer Channel-Wise Attention (CCA) and the Cross-Layer Spatial-Wise Attention (CSA). CCA achieves cross-layer interaction by dividing channel-wise token groups to perceive cross-layer global information along the spatial dimension, while CSA achieves cross-layer interaction by dividing spatial-wise token groups to perceive cross-layer global information along the channel dimension. By integrating these modules, CFPT enables cross-layer interaction in one step, thereby avoiding the semantic gap and information loss associated with element-wise summation and layer-by-layer transmission. Furthermore, CFPT incorporates global contextual information, which enhances detection performance for small objects. To further enhance location awareness during cross-layer interaction, we propose the Cross-Layer Consistent Relative Positional Encoding (CCPE) based on inter-layer mutual receptive fields. We evaluate CFPT on two challenging object detection datasets in aerial images, namely VisDrone2019-DET and TinyPerson. Extensive experiments demonstrate the effectiveness of CFPT, which outperforms state-of-the-art feature pyramid networks while incurring lower computational costs. The code will be released at https://github.com/duzw9311/CFPT.

SGOR: Outlier Removal by Leveraging Semantic and Geometric Information for Robust Point Cloud Registration

Jul 08, 2024

Guiyu Zhao, Zhentao Guo, Hongbin Ma

Abstract: In this paper, we introduce a new outlier removal method that fully leverages geometric and semantic information to achieve robust registration. Current semantic-based registration methods use semantics only for point-to-point or instance-level correspondence generation, which has two problems. First, these methods depend heavily on the correctness of the semantics and perform poorly when the semantics are incorrect or sparse. Second, the use of semantics is limited to correspondence generation, resulting in poor performance in weak-geometry scenes. To solve these problems, we first propose secondary ground segmentation and loose semantic consistency based on regional voting, which improves robustness to semantic errors by reducing the dependence on single-point semantics. We then propose semantic-geometric consistency for outlier removal, which makes full use of semantic information and significantly improves the quality of correspondences. In addition, we propose a two-stage hypothesis verification that solves the problem of incorrect transformation selection in weak-geometry scenes. On the outdoor dataset, our method demonstrates superior performance, improving registration recall by 22.5 percentage points and achieving better robustness under various conditions. Our code is available.

* Accepted by IROS 2024
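
One way to picture "loose semantic consistency based on regional voting" is a majority vote over nearby labels. The sketch below is an illustrative reading of that phrase, not the SGOR implementation: the function names, the fixed-radius neighbourhood, and the simple majority rule are assumptions.

```python
import math
from collections import Counter


def voted_labels(points, labels, radius):
    """Replace each point's label with the majority label of all points
    within `radius` (the point itself included)."""
    voted = []
    for p in points:
        nearby = [lab for q, lab in zip(points, labels) if math.dist(p, q) <= radius]
        voted.append(Counter(nearby).most_common(1)[0][0])
    return voted


def filter_by_loose_semantics(corrs, src_voted, tgt_voted):
    """Keep a correspondence (i, j) only if the voted labels agree."""
    return [(i, j) for i, j in corrs if src_voted[i] == tgt_voted[j]]


if __name__ == "__main__":
    src_pts = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0), (5.0, 0.0, 0.0)]
    src_lab = ["car", "car", "road", "road"]  # third label is a likely error
    src_voted = voted_labels(src_pts, src_lab, radius=0.5)
    print(src_voted)
    print(filter_by_loose_semantics([(2, 0), (3, 0), (3, 1)], src_voted, ["car", "road"]))
```

In the example, the third source point carries the label "road" but sits among "car" points; a strict single-point check would reject its match to a "car" target, while the voted label keeps it.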

VRHCF: Cross-Source Point Cloud Registration via Voxel Representation and Hierarchical Correspondence Filtering

Mar 15, 2024

Guiyu Zhao, Zewen Du, Zhentao Guo, Hongbin Ma

Abstract: The substantial gap between point cloud data collected by different sensors makes robust cross-source point cloud registration a formidable task. In response, we present a novel, broadly applicable framework for point cloud registration that suits both homologous and cross-source scenarios. To tackle the different densities and distributions in cross-source point cloud data, we introduce a feature representation based on spherical voxels. Furthermore, to handle the numerous outliers and mismatches in cross-source registration, we propose a hierarchical correspondence filtering approach that progressively filters out mismatches, yielding a set of high-quality correspondences. Our method excels in both traditional homologous registration and challenging cross-source registration. Specifically, in homologous registration on the 3DMatch dataset, we achieve the highest registration recall (RR) of 95.1% and an inlier ratio of 87.8%. In cross-source point cloud registration, our method attains the best RR on the 3DCSR dataset, an improvement of 9.3 percentage points. The code is available at https://github.com/GuiyuZhao/VRHCF.

* Accepted by IEEE International Conference on Multimedia and Expo (ICME), 2024
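
The inlier ratio quoted in the VRHCF abstract is commonly computed as the fraction of putative correspondences that land within a distance threshold once the ground-truth transform is applied. The sketch below shows that calculation; the function names and the 0.1 threshold used in the example are assumptions, and the paper's exact evaluation protocol may differ.

```python
import math


def apply_rigid(rotation, translation, point):
    """Apply a 3x3 rotation (row-major nested lists) and a translation."""
    return tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )


def inlier_ratio(correspondences, rotation, translation, tau):
    """Fraction of (src, tgt) pairs whose residual under the given
    transform is below the threshold tau."""
    if not correspondences:
        return 0.0
    inliers = sum(
        1
        for src, tgt in correspondences
        if math.dist(apply_rigid(rotation, translation, src), tgt) < tau
    )
    return inliers / len(correspondences)


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    shift = (1, 0, 0)
    pairs = [
        ((0, 0, 0), (1, 0, 0)),      # exact match
        ((1, 1, 1), (2, 1, 1.05)),   # 0.05 off, still an inlier
        ((0, 0, 0), (3, 0, 0)),      # outlier
        ((2, 0, 0), (3, 0.5, 0)),    # outlier
    ]
    print(inlier_ratio(pairs, identity, shift, tau=0.1))
```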

Real-Time Object Detection in Occluded Environment with Background Cluttering Effects Using Deep Learning

Jan 02, 2024

Syed Muhammad Aamir, Hongbin Ma, Malak Abid Ali Khan, Muhammad Aaqib

Abstract: Detecting small, undetermined moving objects, or objects in an occluded environment with a cluttered background, is a major problem in computer vision, and it greatly reduces the detection accuracy of deep learning models. To overcome these problems, we focus on deep learning models for real-time detection of cars and tanks in an occluded environment with a cluttered background, employing the SSD and YOLO algorithms to improve detection precision and reduce the problems these models face. The developed method builds a custom dataset and applies a preprocessing technique to clean the noisy data. To train the model, we apply data augmentation to balance and diversify the data. We fine-tuned, trained, and evaluated the models on this dataset with these techniques applied, and the results are more accurate than those obtained without them. The accuracy and frames per second of the SSD-Mobilenet v2 model are higher than those of YOLO V3 and YOLO V4. Furthermore, by employing techniques such as data enhancement, noise reduction, parameter optimization, and model fusion, we improve the effectiveness of detection and recognition. We further added a counting algorithm and an experimental comparison of target attributes, and built a graphical user interface for the developed model with object counting, alerts, status, resolution, and frames-per-second display. Finally, to justify the importance of the developed method, an analysis of YOLO V3, YOLO V4, and SSD is included, which completes the proposed method.

SphereNet: Learning a Noise-Robust and General Descriptor for Point Cloud Registration

Jul 18, 2023

Guiyu Zhao, Zhentao Guo, Xin Wang, Hongbin Ma

Abstract: Point cloud registration estimates a transformation that aligns point clouds collected from different perspectives. In learning-based point cloud registration, a robust descriptor is vital for high-accuracy registration. However, most methods are susceptible to noise and generalize poorly to unseen datasets. Motivated by this, we introduce SphereNet to learn a noise-robust descriptor for point cloud registration that generalizes to unseen data. In our method, the spheroid generator first builds a geometric domain based on spherical voxelization to encode initial features. Then, spherical interpolation is introduced to achieve robustness against noise. Finally, a new spherical convolutional neural network with spherical integrity padding completes the extraction of descriptors, which reduces the loss of features and fully captures the geometric features. To evaluate our method, we introduce a new benchmark with strong noise, 3DMatch-noise. Extensive experiments are carried out on both indoor and outdoor datasets. Under high-intensity noise, SphereNet increases the feature matching recall by more than 25 percentage points on 3DMatch-noise. In addition, it sets a new state of the art on the 3DMatch and 3DLoMatch benchmarks with 93.5% and 75.6% registration recall, respectively, and has the best generalization ability on unseen datasets.

* 15 pages, under review for IEEE Transactions on Circuits and Systems for Video Technology
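
The spherical voxelization mentioned in the SphereNet abstract can be pictured as binning the points of a local patch by radius, polar angle, and azimuth. The following is a minimal sketch of that binning only; the function names, the uniform bin edges, and the occupancy-count feature are assumptions, and the paper's spheroid generator, interpolation, and padding are not reproduced here.

```python
import math
from collections import Counter


def spherical_voxel_index(point, r_max, n_r, n_theta, n_phi):
    """Return the (radial, polar, azimuth) bin of a point given relative to
    the patch centre, or None if it lies outside radius r_max."""
    x, y, z = point
    r = math.sqrt(x * x + y * y + z * z)
    if r >= r_max:
        return None
    # Polar angle from +z in [0, pi]; clamp guards against rounding error.
    theta = math.acos(max(-1.0, min(1.0, z / r))) if r > 0 else 0.0
    # Azimuth mapped to [0, 2*pi).
    phi = math.atan2(y, x) % (2 * math.pi)
    i_r = min(int(r / r_max * n_r), n_r - 1)
    i_t = min(int(theta / math.pi * n_theta), n_theta - 1)
    i_p = min(int(phi / (2 * math.pi) * n_phi), n_phi - 1)
    return (i_r, i_t, i_p)


def occupancy(points, r_max, n_r, n_theta, n_phi):
    """Count how many points fall in each spherical voxel."""
    counts = Counter()
    for p in points:
        idx = spherical_voxel_index(p, r_max, n_r, n_theta, n_phi)
        if idx is not None:
            counts[idx] += 1
    return counts


if __name__ == "__main__":
    patch = [(1.0, 1.0, 1.0), (-0.5, -0.5, -0.5), (3.0, 0.0, 0.0)]
    print(occupancy(patch, r_max=2.0, n_r=2, n_theta=2, n_phi=4))
```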

Experimental Comparison of SNR and RSSI for LoRa-ESL Based on Machine Clustering and Arithmetic Distribution

Oct 27, 2022

Malak Abid Ali Khan, Hongbin Ma, Syed Muhammad Aamir, Cekderi Anil Baris

Abstract: LoRa lacks the ability to sense channel status. The received signal strength indicator (RSSI) decreases due to collision, interference, and the near-far effect, while, for the signal-to-noise ratio (SNR), packets are rejected when the transmission power (TP) is decreased at a higher spreading factor (SF). To overcome these challenges for electric shelf labels (ESL) and to minimize the dependency on retransmission and acknowledgment, the end devices (EDs) are allocated around gateways (GWs) based on machine clustering, with dynamic SF for the SNR approach and dynamic TP for the RSSI approach. The experimental results show that the RSSI approach outperforms the SNR approach because it determines the exact locality of each ED, which diminishes the capture effect. Arithmetic distribution of EDs among the GWs in different clusters helps to reduce the near-far effect. The resultant received power (RP) at each cluster is higher than the threshold RP for most of the connected EDs.
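
As a rough illustration of "dynamic SF for SNR" and "dynamic TP for RSSI", the sketch below picks the lowest spreading factor whose demodulation SNR limit is met and adjusts transmission power toward a target RSSI. It is not the paper's clustering method. The SNR limits are the typical per-SF values listed in Semtech SX127x datasheets; the function names, the optional margin, and the 2 to 14 dBm power range are assumptions.

```python
# Typical LoRa demodulator SNR limits in dB per spreading factor
# (values as listed in Semtech SX127x datasheets).
SNR_LIMIT_DB = {7: -7.5, 8: -10.0, 9: -12.5, 10: -15.0, 11: -17.5, 12: -20.0}


def choose_sf(snr_db, margin_db=0.0):
    """Lowest SF whose SNR limit is satisfied with the given margin,
    or None if even SF12 is insufficient."""
    for sf in sorted(SNR_LIMIT_DB):
        if snr_db - margin_db >= SNR_LIMIT_DB[sf]:
            return sf
    return None


def choose_tp(rssi_dbm, target_rssi_dbm, current_tp_dbm, tp_min=2, tp_max=14):
    """Shift transmission power by the RSSI shortfall, clamped to an
    assumed allowed range [tp_min, tp_max] in dBm."""
    needed = current_tp_dbm + (target_rssi_dbm - rssi_dbm)
    return max(tp_min, min(tp_max, needed))


if __name__ == "__main__":
    print(choose_sf(-9.0))              # SF8 under the table above
    print(choose_tp(-103, -100, 8))     # raise power by the 3 dB shortfall
```

A lower SF shortens time on air, so choosing the smallest SF that still decodes reduces collisions, while the power adjustment addresses the near-far imbalance between devices at different distances.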

Embedded Design of Automatic Pesticide Spraying Robot Control System

Oct 25, 2022

Ahamed Mustak, Hongbin Ma, Lepeng Song, Ying Jin

Abstract: In agriculture, pesticide spraying flow must be controlled precisely to reduce costs, protect the environment, and increase yield. Although several variable control methods for spraying flow exist, their indirect flow control and slow response can cause inaccuracy and mismanagement, and they also suffer from complicated design and debugging. In this paper, an embedded design of the fuzzy PID variable spraying control method is adopted. The experimental results show that the overshoot of Proportional Integral Derivative (PID) control is 10.76%, while the overshoot of fuzzy PID control is 7.17%, which can meet the requirements of an advanced spray flow control system. For further investigation, a novel spray flow control method based on Programmable Logic Control (PLC) is proposed in this paper.

* arXiv admin note: text overlap with arXiv:1806.06762 by other authors

Analysis of OODA Loop based on Adversarial for Complex Game Environments

Mar 25, 2022

Xiangri Lu, Hongbin Ma, Zhanqing Wang

Abstract: To address the imperfect confrontation strategies caused by a lack of information about the game environment when modeling dynamic countermeasures with incomplete information for intelligent games, we propose a hierarchical game-strategy analysis of the confrontation model based on OODA loop (Observation, Orientation, Decision, Action) theory. Taking into account the trend toward unmanned future warfare, NetLogo is used to simulate the dynamic evolution of a confrontation between two tanks. In the validation process, OODA loop theory is used to describe the operation of the complex system formed by the red and blue sides: the four-step cycle of observation, judgment, decision, and execution is carried out according to the amount of armor on both sides, and the OODA loop system then adjusts the judgment and decision time coefficients for the next confrontation cycle according to the results of the first cycle. Compared with traditional simulation methods that consider objective factors such as loss rate and support rate, the OODA-loop-based hierarchical game analysis can analyze the confrontation situation more comprehensively.
