Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ping Zhong

Nearshore Underwater Target Detection Meets UAV-borne Hyperspectral Remote Sensing: A Novel Hybrid-level Contrastive Learning Framework and Benchmark Dataset

Feb 20, 2025

Jiahao Qi, Chuanhong Zhou, Xingyue Liu, Chen Chen, Dehui Zhu, Kangcheng Bin, Ping Zhong

Abstract:UAV-borne hyperspectral remote sensing has emerged as a promising approach for underwater target detection (UTD). However, its effectiveness is hindered by spectral distortions in nearshore environments, which compromise the accuracy of traditional hyperspectral UTD (HUTD) methods that rely on bathymetric model. These distortions lead to significant uncertainty in target and background spectra, challenging the detection process. To address this, we propose the Hyperspectral Underwater Contrastive Learning Network (HUCLNet), a novel framework that integrates contrastive learning with a self-paced learning paradigm for robust HUTD in nearshore regions. HUCLNet extracts discriminative features from distorted hyperspectral data through contrastive learning, while the self-paced learning strategy selectively prioritizes the most informative samples. Additionally, a reliability-guided clustering strategy enhances the robustness of learned representations.To evaluate the method effectiveness, we conduct a novel nearshore HUTD benchmark dataset, ATR2-HUTD, covering three diverse scenarios with varying water types and turbidity, and target types. Extensive experiments demonstrate that HUCLNet significantly outperforms state-of-the-art methods. The dataset and code will be publicly available at: https://github.com/qjh1996/HUTD

* 18pages,13figures

Via

Access Paper or Ask Questions

Environment-Driven Online LiDAR-Camera Extrinsic Calibration

Feb 02, 2025

Zhiwei Huang, Jiaqi Li, Ping Zhong, Rui Fan

Abstract:LiDAR-camera extrinsic calibration (LCEC) is the core for data fusion in computer vision. Existing methods typically rely on customized calibration targets or fixed scene types, lacking the flexibility to handle variations in sensor data and environmental contexts. This paper introduces EdO-LCEC, the first environment-driven, online calibration approach that achieves human-like adaptability. Inspired by the human perceptual system, EdO-LCEC incorporates a generalizable scene discriminator to actively interpret environmental conditions, creating multiple virtual cameras that capture detailed spatial and textural information. To overcome cross-modal feature matching challenges between LiDAR and camera, we propose dual-path correspondence matching (DPCM), which leverages both structural and textural consistency to achieve reliable 3D-2D correspondences. Our approach formulates the calibration process as a spatial-temporal joint optimization problem, utilizing global constraints from multiple views and scenes to improve accuracy, particularly in sparse or partially overlapping sensor views. Extensive experiments on real-world datasets demonstrate that EdO-LCEC achieves state-of-the-art performance, providing reliable and precise calibration across diverse, challenging environments.

Via

Access Paper or Ask Questions

Lighten CARAFE: Dynamic Lightweight Upsampling with Guided Reassemble Kernels

Oct 29, 2024

Ruigang Fu, Qingyong Hu, Xiaohu Dong, Yinghui Gao, Biao Li, Ping Zhong

Abstract:As a fundamental operation in modern machine vision models, feature upsampling has been widely used and investigated in the literatures. An ideal upsampling operation should be lightweight, with low computational complexity. That is, it can not only improve the overall performance but also not affect the model complexity. Content-aware Reassembly of Features (CARAFE) is a well-designed learnable operation to achieve feature upsampling. Albeit encouraging performance achieved, this method requires generating large-scale kernels, which brings a mass of extra redundant parameters, and inherently has limited scalability. To this end, we propose a lightweight upsampling operation, termed Dynamic Lightweight Upsampling (DLU) in this paper. In particular, it first constructs a small-scale source kernel space, and then samples the large-scale kernels from the kernel space by introducing learnable guidance offsets, hence avoiding introducing a large collection of trainable parameters in upsampling. Experiments on several mainstream vision tasks show that our DLU achieves comparable and even better performance to the original CARAFE, but with much lower complexity, e.g., DLU requires 91% fewer parameters and at least 63% fewer FLOPs (Floating Point Operations) than CARAFE in the case of 16x upsampling, but outperforms the CARAFE by 0.3% mAP in object detection. Code is available at https://github.com/Fu0511/Dynamic-Lightweight-Upsampling.

* Accepted at ICPR 2024

Via

Access Paper or Ask Questions

HyperDID: Hyperspectral Intrinsic Image Decomposition with Deep Feature Embedding

Nov 25, 2023

Zhiqiang Gong, Xian Zhou, Wen Yao, Xiaohu Zheng, Ping Zhong

Abstract:The dissection of hyperspectral images into intrinsic components through hyperspectral intrinsic image decomposition (HIID) enhances the interpretability of hyperspectral data, providing a foundation for more accurate classification outcomes. However, the classification performance of HIID is constrained by the model's representational ability. To address this limitation, this study rethinks hyperspectral intrinsic image decomposition for classification tasks by introducing deep feature embedding. The proposed framework, HyperDID, incorporates the Environmental Feature Module (EFM) and Categorical Feature Module (CFM) to extract intrinsic features. Additionally, a Feature Discrimination Module (FDM) is introduced to separate environment-related and category-related features. Experimental results across three commonly used datasets validate the effectiveness of HyperDID in improving hyperspectral image classification performance. This novel approach holds promise for advancing the capabilities of hyperspectral image analysis by leveraging deep feature embedding principles. The implementation of the proposed method could be accessed soon at https://github.com/shendu-sw/HyperDID for the sake of reproducibility.

* Submitted to IEEE TGRS

Via

Access Paper or Ask Questions

Beyond Sharing Weights in Decoupling Feature Learning Network for UAV RGB-Infrared Vehicle Re-Identification

Oct 12, 2023

Xingyue Liu, Jiahao Qi, Chen Chen, Kangcheng Bin, Ping Zhong

Abstract:Owing to the capacity of performing full-time target search, cross-modality vehicle re-identification (Re-ID) based on unmanned aerial vehicle (UAV) is gaining more attention in both video surveillance and public security. However, this promising and innovative research has not been studied sufficiently due to the data inadequacy issue. Meanwhile, the cross-modality discrepancy and orientation discrepancy challenges further aggravate the difficulty of this task. To this end, we pioneer a cross-modality vehicle Re-ID benchmark named UAV Cross-Modality Vehicle Re-ID (UCM-VeID), containing 753 identities with 16015 RGB and 13913 infrared images. Moreover, to meet cross-modality discrepancy and orientation discrepancy challenges, we present a hybrid weights decoupling network (HWDNet) to learn the shared discriminative orientation-invariant features. For the first challenge, we proposed a hybrid weights siamese network with a well-designed weight restrainer and its corresponding objective function to learn both modality-specific and modality shared information. In terms of the second challenge, three effective decoupling structures with two pretext tasks are investigated to learn orientation-invariant feature. Comprehensive experiments are carried out to validate the effectiveness of the proposed method. The dataset and codes will be released at https://github.com/moonstarL/UAV-CM-VeID.

* 13 pages, 10 figures, 64 citations, submitted to TMM

Via

Access Paper or Ask Questions

Masked Spatial-Spectral Autoencoders Are Excellent Hyperspectral Defenders

Jul 16, 2022

Jiahao Qi, Zhiqiang Gong, Xingyue Liu, Kangcheng Bin, Chen Chen, Yongqian Li, Wei Xue, Yu Zhang, Ping Zhong

Figure 1 for Masked Spatial-Spectral Autoencoders Are Excellent Hyperspectral Defenders

Figure 2 for Masked Spatial-Spectral Autoencoders Are Excellent Hyperspectral Defenders

Figure 3 for Masked Spatial-Spectral Autoencoders Are Excellent Hyperspectral Defenders

Figure 4 for Masked Spatial-Spectral Autoencoders Are Excellent Hyperspectral Defenders

Abstract:Deep learning methodology contributes a lot to the development of hyperspectral image (HSI) analysis community. However, it also makes HSI analysis systems vulnerable to adversarial attacks. To this end, we propose a masked spatial-spectral autoencoder (MSSA) in this paper under self-supervised learning theory, for enhancing the robustness of HSI analysis systems. First, a masked sequence attention learning module is conducted to promote the inherent robustness of HSI analysis systems along spectral channel. Then, we develop a graph convolutional network with learnable graph structure to establish global pixel-wise combinations.In this way, the attack effect would be dispersed by all the related pixels among each combination, and a better defense performance is achievable in spatial aspect.Finally, to improve the defense transferability and address the problem of limited labelled samples, MSSA employs spectra reconstruction as a pretext task and fits the datasets in a self-supervised manner.Comprehensive experiments over three benchmarks verify the effectiveness of MSSA in comparison with the state-of-the-art hyperspectral classification methods and representative adversarial defense strategies.

* 14 pages, 9 figures

Via

Access Paper or Ask Questions

A CNN with Noise Inclined Module and Denoise Framework for Hyperspectral Image Classification

May 25, 2022

Zhiqiang Gong, Ping Zhong, Jiahao Qi, Panhe Hu

Figure 1 for A CNN with Noise Inclined Module and Denoise Framework for Hyperspectral Image Classification

Figure 2 for A CNN with Noise Inclined Module and Denoise Framework for Hyperspectral Image Classification

Figure 3 for A CNN with Noise Inclined Module and Denoise Framework for Hyperspectral Image Classification

Figure 4 for A CNN with Noise Inclined Module and Denoise Framework for Hyperspectral Image Classification

Abstract:Deep Neural Networks have been successfully applied in hyperspectral image classification. However, most of prior works adopt general deep architectures while ignore the intrinsic structure of the hyperspectral image, such as the physical noise generation. This would make these deep models unable to generate discriminative features and provide impressive classification performance. To leverage such intrinsic information, this work develops a novel deep learning framework with the noise inclined module and denoise framework for hyperspectral image classification. First, we model the spectral signature of hyperspectral image with the physical noise model to describe the high intraclass variance of each class and great overlapping between different classes in the image. Then, a noise inclined module is developed to capture the physical noise within each object and a denoise framework is then followed to remove such noise from the object. Finally, the CNN with noise inclined module and the denoise framework is developed to obtain discriminative features and provides good classification performance of hyperspectral image. Experiments are conducted over two commonly used real-world datasets and the experimental results show the effectiveness of the proposed method. The implementation of the proposed method and other compared methods could be accessed at https://github.com/shendu-sw/noise-physical-framework.

Via

Access Paper or Ask Questions

Transferable Physical Attack against Object Detection with Separable Attention

May 19, 2022

Yu Zhang, Zhiqiang Gong, Yichuang Zhang, YongQian Li, Kangcheng Bin, Jiahao Qi, Wei Xue, Ping Zhong

Figure 1 for Transferable Physical Attack against Object Detection with Separable Attention

Figure 2 for Transferable Physical Attack against Object Detection with Separable Attention

Figure 3 for Transferable Physical Attack against Object Detection with Separable Attention

Figure 4 for Transferable Physical Attack against Object Detection with Separable Attention

Abstract:Transferable adversarial attack is always in the spotlight since deep learning models have been demonstrated to be vulnerable to adversarial samples. However, existing physical attack methods do not pay enough attention on transferability to unseen models, thus leading to the poor performance of black-box attack.In this paper, we put forward a novel method of generating physically realizable adversarial camouflage to achieve transferable attack against detection models. More specifically, we first introduce multi-scale attention maps based on detection models to capture features of objects with various resolutions. Meanwhile, we adopt a sequence of composite transformations to obtain the averaged attention maps, which could curb model-specific noise in the attention and thus further boost transferability. Unlike the general visualization interpretation methods where model attention should be put on the foreground object as much as possible, we carry out attack on separable attention from the opposite perspective, i.e. suppressing attention of the foreground and enhancing that of the background. Consequently, transferable adversarial camouflage could be yielded efficiently with our novel attention-based loss function. Extensive comparison experiments verify the superiority of our method to state-of-the-art methods.

Via

Access Paper or Ask Questions

Self-aligned Spatial Feature Extraction Network for UAV Vehicle Re-identification

Jan 08, 2022

Aihuan Yao, Jiahao Qi, Ping Zhong

Figure 1 for Self-aligned Spatial Feature Extraction Network for UAV Vehicle Re-identification

Figure 2 for Self-aligned Spatial Feature Extraction Network for UAV Vehicle Re-identification

Figure 3 for Self-aligned Spatial Feature Extraction Network for UAV Vehicle Re-identification

Figure 4 for Self-aligned Spatial Feature Extraction Network for UAV Vehicle Re-identification

Abstract:Compared with existing vehicle re-identification (ReID) tasks conducted with datasets collected by fixed surveillance cameras, vehicle ReID for unmanned aerial vehicle (UAV) is still under-explored and could be more challenging. Vehicles with the same color and type show extremely similar appearance from the UAV's perspective so that mining fine-grained characteristics becomes necessary. Recent works tend to extract distinguishing information by regional features and component features. The former requires input images to be aligned and the latter entails detailed annotations, both of which are difficult to meet in UAV application. In order to extract efficient fine-grained features and avoid tedious annotating work, this letter develops an unsupervised self-aligned network consisting of three branches. The network introduced a self-alignment module to convert the input images with variable orientations to a uniform orientation, which is implemented under the constraint of triple loss function designed with spatial features. On this basis, spatial features, obtained by vertical and horizontal segmentation methods, and global features are integrated to improve the representation ability in embedded space. Extensive experiments are conducted on UAV-VeID dataset, and our method achieves the best performance compared with recent ReID works.

Via

Access Paper or Ask Questions

Few-shot Object Detection with Self-adaptive Attention Network for Remote Sensing Images

Sep 26, 2020

Zixuan Xiao, Wei Xue, Ping Zhong

Figure 1 for Few-shot Object Detection with Self-adaptive Attention Network for Remote Sensing Images

Figure 2 for Few-shot Object Detection with Self-adaptive Attention Network for Remote Sensing Images

Figure 3 for Few-shot Object Detection with Self-adaptive Attention Network for Remote Sensing Images

Figure 4 for Few-shot Object Detection with Self-adaptive Attention Network for Remote Sensing Images

Abstract:In remote sensing field, there are many applications of object detection in recent years, which demands a great number of labeled data. However, we may be faced with some cases where only limited data are available. In this paper, we proposed a few-shot object detector which is designed for detecting novel objects provided with only a few examples. Particularly, in order to fit the object detection settings, our proposed few-shot detector concentrates on the relations that lie in the level of objects instead of the full image with the assistance of Self-Adaptive Attention Network (SAAN). The SAAN can fully leverage the object-level relations through a relation GRU unit and simultaneously attach attention on object features in a self-adaptive way according to the object-level relations to avoid some situations where the additional attention is useless or even detrimental. Eventually, the detection results are produced from the features that are added with attention and thus are able to be detected simply. The experiments demonstrate the effectiveness of the proposed method in few-shot scenes.

* arXiv admin note: text overlap with arXiv:2009.01616

Via

Access Paper or Ask Questions