Abstract:In this paper, we proposed large selective kernel and sparse attention network (LSKSANet) for remote sensing image semantic segmentation. The LSKSANet is a lightweight network that effectively combines convolution with sparse attention mechanisms. Specifically, we design large selective kernel module to decomposing the large kernel into a series of depth-wise convolutions with progressively increasing dilation rates, thereby expanding the receptive field without significantly increasing the computational burden. In addition, we introduce the sparse attention to keep the most useful self-attention values for better feature aggregation. Experimental results on the Vaihingen and Postdam datasets demonstrate the superior performance of the proposed LSKSANet over state-of-the-art methods.
Abstract:Arctic Sea Ice Concentration (SIC) is the ratio of ice-covered area to the total sea area of the Arctic Ocean, which is a key indicator for maritime activities. Nowadays, we often use passive microwave images to display SIC, but it has low spatial resolution, and most of the existing super-resolution methods of Arctic SIC don't take the integration of spatial and channel features into account and can't effectively integrate the multi-scale feature. To overcome the aforementioned issues, we propose MFM-Net for Arctic SIC super-resolution, which concurrently aggregates multi-scale information while integrating spatial and channel features. Extensive experiments on Arctic SIC dataset from the AMSR-E/AMSR-2 SIC DT-ASI products from Ocean University of China validate the effectiveness of porposed MFM-Net.
Abstract:Clustering is one of the fundamental tasks in computer vision and pattern recognition. Recently, deep clustering methods (algorithms based on deep learning) have attracted wide attention with their impressive performance. Most of these algorithms combine deep unsupervised representation learning and standard clustering together. However, the separation of representation learning and clustering will lead to suboptimal solutions because the two-stage strategy prevents representation learning from adapting to subsequent tasks (e.g., clustering according to specific cues). To overcome this issue, efforts have been made in the dynamic adaption of representation and cluster assignment, whereas current state-of-the-art methods suffer from heuristically constructed objectives with representation and cluster assignment alternatively optimized. To further standardize the clustering problem, we audaciously formulate the objective of clustering as finding a precise feature as the cue for cluster assignment. Based on this, we propose a general-purpose deep clustering framework which radically integrates representation learning and clustering into a single pipeline for the first time. The proposed framework exploits the powerful ability of recently developed generative models for learning intrinsic features, and imposes an entropy minimization on the distribution of the cluster assignment by a dedicated variational algorithm. Experimental results show that the performance of the proposed method is superior, or at least comparable to, the state-of-the-art methods on the handwritten digit recognition, fashion recognition, face recognition and object recognition benchmark datasets.
Abstract:This paper investigates a novel task of generating texture images from perceptual descriptions. Previous work on texture generation focused on either synthesis from examples or generation from procedural models. Generating textures from perceptual attributes have not been well studied yet. Meanwhile, perceptual attributes, such as directionality, regularity and roughness are important factors for human observers to describe a texture. In this paper, we propose a joint deep network model that combines adversarial training and perceptual feature regression for texture generation, while only random noise and user-defined perceptual attributes are required as input. In this model, a preliminary trained convolutional neural network is essentially integrated with the adversarial framework, which can drive the generated textures to possess given perceptual attributes. An important aspect of the proposed model is that, if we change one of the input perceptual features, the corresponding appearance of the generated textures will also be changed. We design several experiments to validate the effectiveness of the proposed method. The results show that the proposed method can produce high quality texture images with desired perceptual properties.
Abstract:In this paper, we propose a novel unsupervised deep learning model, called PCA-based Convolutional Network (PCN). The architecture of PCN is composed of several feature extraction stages and a nonlinear output stage. Particularly, each feature extraction stage includes two layers: a convolutional layer and a feature pooling layer. In the convolutional layer, the filter banks are simply learned by PCA. In the nonlinear output stage, binary hashing is applied. For the higher convolutional layers, the filter banks are learned from the feature maps that were obtained in the previous stage. To test PCN, we conducted extensive experiments on some challenging tasks, including handwritten digits recognition, face recognition and texture classification. The results show that PCN performs competitive with or even better than state-of-the-art deep learning models. More importantly, since there is no back propagation for supervised finetuning, PCN is much more efficient than existing deep networks.