Abstract:In this paper, we propose a method, named BRAU-Net, for the pubic symphysis-fetal head segmentation task. The method adopts a U-Net-like pure Transformer architecture with bi-level routing attention and skip connections, which effectively learns local-global semantic information. The proposed BRAU-Net was evaluated on the transperineal ultrasound image dataset from the pubic symphysis-fetal head segmentation and angle of progression (FH-PS-AOP) challenge. The results demonstrate that the proposed BRAU-Net achieves a comparable final score. The code will be available at https://github.com/Caipengzhou/BRAU-Net.
Abstract:Few-sample learning (FSL) is a significant and challenging problem in machine learning. The ability to learn and generalize successfully from very few samples is a notable demarcation between artificial intelligence and human intelligence: humans can readily establish their cognition of novel concepts from just a single example or a handful of examples, whereas machine learning algorithms typically require hundreds or thousands of supervised samples to guarantee generalization. Despite a long history dating back to the early 2000s and widespread attention in recent years with booming deep learning technologies, few surveys or reviews of FSL have been available until now. In this context, we extensively review 300+ FSL papers spanning from the 2000s to 2019 and provide a timely and comprehensive survey of FSL. In this survey, we review the evolution as well as the current progress of FSL, categorize FSL approaches in principle into generative model based and discriminative model based kinds, and place particular emphasis on meta learning based FSL approaches. We also summarize several recently emerging extensional topics of FSL and review the latest advances on these topics. Furthermore, we highlight important FSL applications covering many research hotspots in computer vision, natural language processing, audio and speech, reinforcement learning and robotics, data analysis, etc. Finally, we conclude the survey with a discussion of promising trends, in the hope of providing guidance and insights for follow-up research.
Abstract:Remarkable gains in deep learning usually rely on tremendous amounts of supervised data. Ensuring modality diversity for each object in the training set is critical for the generalization of cutting-edge deep models, but it burdens humans with heavy manual labor in data collection and annotation. In addition, some rare or unexpected modalities are new to the current model, which reduces performance under such emerging modalities. Inspired by achievements in speech recognition, psychology and behavioristics, we present a practical solution, self-reinforcing unsupervised matching (SUM), which annotates images with the 2D structure-preserving property in an emerging modality by cross-modality matching. This approach requires no supervision in the emerging modality and only one template in the seen modality, providing a possible route towards continual learning.
Abstract:Crop diseases are responsible for major production reductions and economic losses in the agricultural industry worldwide. Monitoring the health status of crops is critical for controlling the spread of diseases and implementing effective management. This paper presents an in-field automatic wheat disease diagnosis system based on a weakly supervised deep learning framework, i.e. deep multiple instance learning, which integrates identification of wheat diseases and localization of disease areas using only image-level annotations for training images in wild conditions. Furthermore, a new in-field image dataset for wheat disease, Wheat Disease Database 2017 (WDD2017), is collected to verify the effectiveness of our system. Under two different architectures, i.e. VGG-FCN-VD16 and VGG-FCN-S, our system achieves mean recognition accuracies of 97.95% and 95.12% respectively over 5-fold cross-validation on WDD2017, exceeding the results of 93.27% and 73.00% obtained by two conventional CNN frameworks, i.e. VGG-CNN-VD16 and VGG-CNN-S. Experimental results demonstrate that the proposed system outperforms conventional CNN architectures in recognition accuracy with the same number of parameters, while maintaining accurate localization of the corresponding disease areas. Moreover, the proposed system has been packaged into a real-time mobile app to provide support for agricultural disease diagnosis.
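As a rough illustration of how multiple instance learning can yield both an image-level label and a localization from the same per-location scores, the following is a minimal sketch in plain Python. It is not the paper's implementation: the max-pooling aggregation, the threshold, the function names, and the class names "healthy" and "rust" are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# One score per spatial location (instance) for a single class.
ScoreMap = List[List[float]]


def image_level_score(score_map: ScoreMap) -> float:
    """Max-pooling aggregation (an assumed choice): the image is scored
    by its most confident location."""
    return max(max(row) for row in score_map)


def classify(score_maps: Dict[str, ScoreMap]) -> str:
    """Pick the class whose aggregated image-level score is highest."""
    return max(score_maps, key=lambda name: image_level_score(score_maps[name]))


def localize(score_map: ScoreMap, threshold: float) -> List[Tuple[int, int]]:
    """Return (row, col) positions whose score reaches the threshold."""
    return [
        (r, c)
        for r, row in enumerate(score_map)
        for c, score in enumerate(row)
        if score >= threshold
    ]


# Hypothetical 2x2 score maps for two illustrative classes.
maps = {
    "healthy": [[0.1, 0.2], [0.1, 0.1]],
    "rust": [[0.1, 0.9], [0.2, 0.8]],
}
predicted = classify(maps)
regions = localize(maps[predicted], threshold=0.5)
print(predicted, regions)
```

Only the image-level label would be needed for training in such a setup; the location list falls out of the same score map at inference time.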
Abstract:Zero-shot learning (ZSL) is a challenging task that aims to recognize novel classes without any training instances. In this paper, we present a simple but high-performance ZSL approach based on generating pseudo feature representations (GPFR). Given a dataset of seen classes and side information about unseen classes (e.g. attributes), we synthesize feature-level pseudo representations for novel concepts, which allows us to formulate a predictor for unseen classes. First, we design a Joint Attribute Feature Extractor (JAFE) to acquire an understanding of attributes; then we construct a cognitive repository of attributes filtered by confidence margins; finally, we generate pseudo feature representations using a probability-based sampling strategy to facilitate the subsequent training of the class predictor. We demonstrate the effectiveness of our method in ZSL settings and its extensibility to the supervised recognition scenario on a synthetic colored MNIST dataset (C-MNIST). On several popular ZSL benchmark datasets, our approach also shows compelling results on the zero-shot recognition task, and in particular yields a substantial improvement over the state-of-the-art mAP on the zero-shot retrieval task.