Abstract:Deep neural networks face several challenges in hyperspectral image classification, including insufficient utilization of joint spatial-spectral information, gradient vanishing with increasing depth, and overfitting. To enhance feature extraction efficiency while skipping redundant information, this paper proposes a dynamic attention convolution design based on an improved 3D-DenseNet model. The design employs multiple parallel convolutional kernels instead of a single kernel and assigns dynamic attention weights to these parallel convolutions. This dynamic attention mechanism achieves adaptive feature response based on spatial characteristics in the spatial dimension of hyperspectral images, focusing more on key spatial structures. In the spectral dimension, it enables dynamic discrimination of different bands, alleviating information redundancy and computational complexity caused by high spectral dimensionality. The DAC module enhances model representation capability by attention-based aggregation of multiple convolutional kernels without increasing network depth or width. The proposed method demonstrates superior performance in both inference speed and accuracy, outperforming mainstream hyperspectral image classification methods on the IN, UP, and KSC datasets.
Abstract:Deep neural networks have faced many problems in hyperspectral image classification, including the ineffective utilization of spectral-spatial joint information and the problems of gradient vanishing and overfitting that arise with increasing depth. In order to accelerate the deployment of models on edge devices with strict latency requirements and limited computing power, this paper proposes a learnable group convolution network (LGCNet) based on an improved 3D-DenseNet model and a lightweight model design. The LGCNet module improves the shortcomings of group convolution by introducing a dynamic learning method for the input channels and convolution kernel grouping, enabling flexible grouping structures and generating better representation ability. Through the overall loss and gradient of the backpropagation network, the 3D group convolution is dynamically determined and updated in an end-to-end manner. The learnable number of channels and corresponding grouping can capture different complementary visual features of input images, allowing the CNN to learn richer feature representations. When extracting high-dimensional and redundant hyperspectral data, the 3D convolution kernels also contain a large amount of redundant information. The LGC module allows the 3D-DenseNet to choose channel information with more semantic features, and is very efficient, making it suitable for embedding in any deep neural network for acceleration and efficiency improvements. LGC enables the 3D-CNN to achieve sufficient feature extraction while also meeting speed and computing requirements. Furthermore, LGCNet has achieved progress in inference speed and accuracy, and outperforms mainstream hyperspectral image classification methods on the Indian Pines, Pavia University, and KSC datasets.