Abstract:Achieving a balance between computational efficiency and detection accuracy in the realm of rotated bounding box object detection within aerial imagery is a significant challenge. While prior research has aimed at creating lightweight models that enhance computational performance and feature extraction, there remains a gap in the performance of these networks when it comes to the detection of small and multi-scale objects in remote sensing (RS) imagery. To address these challenges, we present a novel enhancement to the YOLOv8 model, tailored for oriented object detection tasks and optimized for environments with limited computational resources. Our model features a wavelet transform-based C2f module for capturing associative features and an Adaptive Scale Feature Pyramid (ASFP) module that leverages P2 layer details. Additionally, the incorporation of GhostDynamicConv significantly contributes to the model's lightweight nature, ensuring high efficiency in aerial imagery analysis. Featuring a parameter count of 21.6M, our approach provides a more efficient architectural design than DecoupleNet, which has 23.3M parameters, all while maintaining detection accuracy. On the DOTAv1.0 dataset, our model demonstrates a mean Average Precision (mAP) that is competitive with leading methods such as DecoupleNet. The model's efficiency, combined with its reduced parameter count, makes it a strong candidate for aerial object detection, particularly in resource-constrained environments.
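As a brief illustration of what a rotated bounding box is, the sketch below converts an oriented box into its four corner points. The `(cx, cy, w, h, theta)` parameterization and the function name `obb_corners` are assumptions chosen for illustration; the abstract does not state which box encoding the model uses.

```python
import math

def obb_corners(cx, cy, w, h, theta):
    """Return the four corners of a w-by-h box centred at (cx, cy),
    rotated by theta radians about its centre.

    Hypothetical helper: the encoding is an assumption, not taken from the abstract.
    """
    c, s = math.cos(theta), math.sin(theta)
    # Corner offsets of the axis-aligned box, relative to its centre.
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Apply a 2D rotation to each offset, then translate to the centre.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]

if __name__ == "__main__":
    for corner in obb_corners(10.0, 5.0, 4.0, 2.0, math.pi / 6):
        print(corner)
```

With `theta = 0` the corners reduce to those of an ordinary axis-aligned box, which is a convenient sanity check.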
Abstract:By integrating the local voltage-controlled magnetic anisotropy (VCMA) effect, the Dzyaloshinskii-Moriya interaction (DMI) effect, and the spin-orbit torque (SOT) effect, we propose a novel device structure for a field-free magnetic tunnel junction (MTJ). Micromagnetic simulation shows that the device utilizes the chiral symmetry breaking caused by the DMI effect to induce a non-collinear spin texture under the influence of the SOT current. This, combined with the perpendicular magnetic anisotropy (PMA) gradient generated by the local VCMA effect, enables deterministic switching of the MTJ state without an external field. The impact of variations in DMI strength and PMA gradient on the magnetization dynamics is analyzed.
Abstract:Realistic hyperspectral image (HSI) super-resolution (SR) techniques aim to generate a high-resolution (HR) HSI with higher spectral and spatial fidelity from its low-resolution (LR) counterpart. The generative adversarial network (GAN) has proven to be an effective deep learning framework for image super-resolution. However, the optimisation process of existing GAN-based models frequently suffers from mode collapse, which limits their capacity for spectral-spatial invariant reconstruction. This may cause spectral-spatial distortion in the generated HSI, especially with a large upscaling factor. To alleviate mode collapse, this work proposes a novel GAN model coupled with a latent encoder (LE-GAN), which maps the generated spectral-spatial features from the image space to the latent space and produces a coupling component to regularise the generated samples. Essentially, we treat an HSI as a high-dimensional manifold embedded in a latent space. Thus, the optimisation of GAN models is converted to the problem of learning the distributions of high-resolution HSI samples in the latent space, making the distributions of the generated super-resolution HSIs closer to those of their original high-resolution counterparts. We have conducted experimental evaluations of the model's super-resolution performance and its capability to alleviate mode collapse. The proposed approach has been tested and validated on two real HSI datasets from different sensors (i.e. AVIRIS and UHD-185) for various upscaling factors and added noise levels, and compared with state-of-the-art super-resolution models (i.e. HyCoNet, LTTR, BAGAN, SRGAN, WGAN).
Abstract:Late blight disease is one of the most destructive diseases of potato crops, leading to serious yield losses globally. Accurate diagnosis of the disease at an early stage is critical for precision disease control and management. Current farm practices in crop disease diagnosis are based on manual visual inspection, which is costly, time-consuming, and subject to individual bias. Recent advances in imaging sensors (e.g. RGB, multispectral and hyperspectral cameras), remote sensing and machine learning offer the opportunity to address this challenge. In particular, hyperspectral imagery (HSI) combined with machine learning/deep learning approaches is preferable for accurately identifying specific plant diseases, because HSI contains a wide range of high-quality reflectance information beyond human vision and captures both spectral and spatial information. The proposed method considers the potential disease-specific reflectance radiation variance caused by canopy structural diversity and introduces multiple capsule layers to model the hierarchical structure of the spectral-spatial disease attributes, with the encapsulated features representing the various classes and the rotation invariance of the disease attributes in the feature space. We have evaluated the proposed method with real UAV-based HSI data under controlled field conditions. The effectiveness of the hierarchical features has been quantitatively assessed and compared with existing representative machine learning/deep learning methods. The experimental results show that the proposed model significantly improves accuracy when the hierarchical structure of spectral-spatial features is considered, compared with existing methods that use only spectral, spatial, or spectral-spatial features without considering that hierarchical structure.
Abstract:Spectral-spatial based deep learning models have recently proven to be effective in hyperspectral image (HSI) classification for various earth monitoring applications such as land cover classification and agricultural monitoring. However, due to the "black-box" nature of the model representation, how to explain and interpret the learning process and the model decision remains an open problem. This study proposes an interpretable deep learning model -- a biologically interpretable two-stage deep neural network (BIT-DNN) -- that integrates biochemical and biophysical associated information into the proposed framework and is capable of achieving both high accuracy and interpretability on HSI-based classification tasks. The proposed model introduces a two-stage feature learning process. In the first stage, an enhanced interpretable feature block extracts low-level spectral features associated with the biophysical and biochemical attributes of the target entities. In the second stage, an interpretable capsule block extracts and encapsulates the high-level joint spectral-spatial features into feature tensors representing the hierarchical structure of the biophysical and biochemical attributes of the target ground entities, which gives the model improved classification performance and intrinsic interpretability. We have tested and evaluated the model using two real HSI datasets for crop type recognition and crop disease recognition tasks and compared it with six state-of-the-art machine learning models. The results demonstrate that the proposed model has competitive advantages in terms of both classification accuracy and model interpretability.
Abstract:With the development of research on memristors, memristive neural networks (MNNs) have recently become a hot research topic. Because memristors can mimic spike-timing-dependent plasticity (STDP), research on STDP-based MNNs is rapidly increasing. However, although state-of-the-art STDP-based MNNs have many applications such as pattern recognition, the STDP mechanism brings a relatively complex hardware framework and low processing speed, which hinder the hardware realization of MNNs. A non-STDP-based unsupervised MNN is constructed in this paper. A comparison with the STDP method on two common structures, feedforward and crossbar, shows that non-STDP-based MNNs not only retain the advantages of STDP-based MNNs, including high accuracy and convergence speed in pattern recognition, but also offer better hardware performance, with fewer hardware resources and higher processing speed. By combining memristive characteristics with a simple mechanism, non-STDP-based MNNs have better hardware compatibility, which may offer a new viewpoint for engineering applications of memristive neural networks.
Abstract:Machine learning algorithms have been effectively applied to various real-world tasks. However, it is difficult to provide high-quality machine learning solutions that accommodate an unknown distribution of input datasets; this difficulty is called the uncertainty prediction problem. In this paper, a margin-based Pareto deep ensemble pruning (MBPEP) model is proposed. It achieves high-quality uncertainty estimation with a small mean prediction interval width (MPIW) and a high prediction interval coverage probability (PICP) by using deep ensemble networks. In addition to these networks, unique loss functions are proposed, and these functions make the sub-learners available for standard gradient descent learning. Furthermore, a margin criterion fine-tuning-based Pareto pruning method is introduced to optimize the ensembles. Several experiments, including predicting uncertainties in classification and regression, are conducted to analyze the performance of MBPEP. The experimental results show that MBPEP achieves a small interval width and a low learning error with an optimal number of ensembles. For real-world problems, MBPEP performs well on incoming datasets with unknown distributions and improves learning performance on a multi-task problem compared with each single model.
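The two interval metrics named in this abstract can be sketched with their commonly used definitions: PICP as the fraction of targets that fall inside their predicted interval, and MPIW as the average interval width. The function names and the toy data are illustrative assumptions; this is not the MBPEP loss or pruning procedure, which the abstract does not specify.

```python
def picp(y_true, lower, upper):
    """Prediction interval coverage probability:
    fraction of targets lying inside [lower, upper]."""
    covered = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return covered / len(y_true)

def mpiw(lower, upper):
    """Mean prediction interval width: average of (upper - lower)."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)

if __name__ == "__main__":
    # Toy example (made-up numbers, for illustration only).
    y = [1.0, 2.0, 3.0, 4.0]
    lo = [0.5, 1.5, 3.5, 3.0]
    hi = [1.5, 2.5, 4.5, 5.0]
    print(picp(y, lo, hi))  # 0.75: three of the four targets are covered
    print(mpiw(lo, hi))     # 1.25: widths are 1, 1, 1 and 2
```

The two metrics pull in opposite directions: widening every interval raises PICP but also raises MPIW, which is why a method is judged on both together.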
Abstract:Memristive neural networks (MNNs), which use memristors as neurons or synapses, have recently become a hot research topic. However, most memristors are not compatible with mainstream integrated circuit technology, and their stability at large scale is not yet satisfactory. In this paper, a hardware-friendly MNN circuit is introduced, in which the memristive characteristics are implemented by digital integrated circuits. With this method, spike-timing-dependent plasticity (STDP) and unsupervised learning are realized. A weight-sharing mechanism is proposed to bridge the gap between network scale and hardware resources. Experimental results show that hardware resources are significantly saved with this mechanism while good recognition accuracy and high speed are maintained. Moreover, resource usage grows more slowly than the network scale, which suggests the method's potential for realizing large-scale neuromorphic networks.
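For readers unfamiliar with STDP, the sketch below shows the textbook pair-based rule with exponential timing windows: a synapse is strengthened when the presynaptic spike precedes the postsynaptic spike and weakened otherwise. The amplitudes, time constants, weight bounds, and function names are illustrative assumptions; the abstract does not describe how the digital circuit implements its update.

```python
import math

def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Weight change for one spike pair, where dt = t_post - t_pre (ms).

    All default parameter values are hypothetical, chosen for illustration.
    """
    if dt > 0:
        # Pre fires before post: potentiation, decaying with the delay.
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:
        # Post fires before pre: depression, decaying with the delay.
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0

def update_weight(w, dt, w_min=0.0, w_max=1.0):
    """Apply the STDP change and clip the weight to [w_min, w_max]."""
    return min(w_max, max(w_min, w + stdp_dw(dt)))

if __name__ == "__main__":
    w = 0.5
    for dt in (5.0, -5.0, 40.0):
        w = update_weight(w, dt)
        print(dt, w)
```

Spike pairs that are close in time change the weight more than distant ones, which is the property an unsupervised learner exploits to associate correlated inputs.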