Abstract:Crop yield production could be enhanced for agricultural growth if various plant nutrition deficiencies, and diseases are identified and detected at early stages. The deep learning methods have proven its superior performances in the automated detection of plant diseases and nutrition deficiencies from visual symptoms in leaves. This article proposes a new deep learning method for plant nutrition deficiencies and disease classification using a graph convolutional network (GNN), added upon a base convolutional neural network (CNN). Sometimes, a global feature descriptor might fail to capture the vital region of a diseased leaf, which causes inaccurate classification of disease. To address this issue, regional feature learning is crucial for a holistic feature aggregation. In this work, region-based feature summarization at multi-scales is explored using spatial pyramidal pooling for discriminative feature representation. A GCN is developed to capacitate learning of finer details for classifying plant diseases and insufficiency of nutrients. The proposed method, called Plant Nutrition Deficiency and Disease Network (PND-Net), is evaluated on two public datasets for nutrition deficiency, and two for disease classification using four CNNs. The best classification performances are: (a) 90.00% Banana and 90.54% Coffee nutrition deficiency; and (b) 96.18% Potato diseases and 84.30% on PlantDoc datasets using Xception backbone. Furthermore, additional experiments have been carried out for generalization, and the proposed method has achieved state-of-the-art performances on two public datasets, namely the Breast Cancer Histopathology Image Classification (BreakHis 40X: 95.50%, and BreakHis 100X: 96.79% accuracy) and Single cells in Pap smear images for cervical cancer classification (SIPaKMeD: 99.18% accuracy). Also, PND-Net achieves improved performances using five-fold cross validation.
Abstract:Deep Convolutional Neural Networks (CNNs) have facilitated remarkable success in recognizing various food items and agricultural stress. A decent performance boost has been witnessed in solving the agro-food challenges by mining and analyzing of region-based partial feature descriptors. Also, computationally expensive ensemble learning schemes using multiple CNNs have been studied in earlier works. This work proposes a region attention scheme for modelling long-range dependencies by building a correlation among different regions within an input image. The attention method enhances feature representation by learning the usefulness of context information from complementary regions. Spatial pyramidal pooling and average pooling pair aggregate partial descriptors into a holistic representation. Both pooling methods establish spatial and channel-wise relationships without incurring extra parameters. A context gating scheme is applied to refine the descriptiveness of weighted attentional features, which is relevant for classification. The proposed Region Attention network for Food items and Agricultural stress recognition method, dubbed RAFA-Net, has been experimented on three public food datasets, and has achieved state-of-the-art performances with distinct margins. The highest top-1 accuracies of RAFA-Net are 91.69%, 91.56%, and 96.97% on the UECFood-100, UECFood-256, and MAFood-121 datasets, respectively. In addition, better accuracies have been achieved on two benchmark agricultural stress datasets. The best top-1 accuracies on the Insect Pest (IP-102) and PlantDoc-27 plant disease datasets are 92.36%, and 85.54%, respectively; implying RAFA-Net's generalization capability.
Abstract:Human body-pose estimation is a complex problem in computer vision. Recent research interests have been widened specifically on the Sports, Yoga, and Dance (SYD) postures for maintaining health conditions. The SYD pose categories are regarded as a fine-grained image classification task due to the complex movement of body parts. Deep Convolutional Neural Networks (CNNs) have attained significantly improved performance in solving various human body-pose estimation problems. Though decent progress has been achieved in yoga postures recognition using deep learning techniques, fine-grained sports, and dance recognition necessitates ample research attention. However, no benchmark public image dataset with sufficient inter-class and intra-class variations is available yet to address sports and dance postures classification. To solve this limitation, we have proposed two image datasets, one for 102 sport categories and another for 12 dance styles. Two public datasets, Yoga-82 which contains 82 classes and Yoga-107 represents 107 classes are collected for yoga postures. These four SYD datasets are experimented with the proposed deep model, SYD-Net, which integrates a patch-based attention (PbA) mechanism on top of standard backbone CNNs. The PbA module leverages the self-attention mechanism that learns contextual information from a set of uniform and multi-scale patches and emphasizes discriminative features to understand the semantic correlation among patches. Moreover, random erasing data augmentation is applied to improve performance. The proposed SYD-Net has achieved state-of-the-art accuracy on Yoga-82 using five base CNNs. SYD-Net's accuracy on other datasets is remarkable, implying its efficiency. Our Sports-102 and Dance-12 datasets are publicly available at https://sites.google.com/view/syd-net/home.