Abstract:In addition to considering the recognition difficulty caused by human posture and occlusion, it is also necessary to solve the modal differences caused by different imaging systems in the Visible-Thermal cross-modal person re-identification (VT-ReID) task. In this paper,we propose the Cross-modal Local Shortest Path and Global Enhancement (CM-LSP-GE) modules,a two-stream network based on joint learning of local and global features. The core idea of our paper is to use local feature alignment to solve occlusion problem, and to solve modal difference by strengthening global feature. Firstly, Attention-based two-stream ResNet network is designed to extract dual-modality features and map to a unified feature space. Then, to solve the cross-modal person pose and occlusion problems, the image are cut horizontally into several equal parts to obtain local features and the shortest path in local features between two graphs is used to achieve the fine-grained local feature alignment. Thirdly, a batch normalization enhancement module applies global features to enhance strategy, resulting in difference enhancement between different classes. The multi granularity loss fusion strategy further improves the performance of the algorithm. Finally, joint learning mechanism of local and global features is used to improve cross-modal person re-identification accuracy. The experimental results on two typical datasets show that our model is obviously superior to the most state-of-the-art methods. Especially, on SYSU-MM01 datasets, our model can achieve a gain of 2.89%and 7.96% in all search term of Rank-1 and mAP. The source code will be released soon.
Abstract:Deep learning techniques have shown their superior performance in dermatologist clinical inspection. Nevertheless, melanoma diagnosis is still a challenging task due to the difficulty of incorporating the useful dermatologist clinical knowledge into the learning process. In this paper, we propose a novel knowledge-aware deep framework that incorporates some clinical knowledge into collaborative learning of two important melanoma diagnosis tasks, i.e., skin lesion segmentation and melanoma recognition. Specifically, to exploit the knowledge of morphological expressions of the lesion region and also the periphery region for melanoma identification, a lesion-based pooling and shape extraction (LPSE) scheme is designed, which transfers the structure information obtained from skin lesion segmentation into melanoma recognition. Meanwhile, to pass the skin lesion diagnosis knowledge from melanoma recognition to skin lesion segmentation, an effective diagnosis guided feature fusion (DGFF) strategy is designed. Moreover, we propose a recursive mutual learning mechanism that further promotes the inter-task cooperation, and thus iteratively improves the joint learning capability of the model for both skin lesion segmentation and melanoma recognition. Experimental results on two publicly available skin lesion datasets show the effectiveness of the proposed method for melanoma analysis.
Abstract:This paper proposes a novel U-Net variant using stacked dilated convolutions for medical image segmentation (SDU-Net). SDU-Net adopts the architecture of vanilla U-Net with modifications in the encoder and decoder operations (an operation indicates all the processing for feature maps of the same resolution). Unlike vanilla U-Net which incorporates two standard convolutions in each encoder/decoder operation, SDU-Net uses one standard convolution followed by multiple dilated convolutions and concatenates all dilated convolution outputs as input to the next operation. Experiments showed that SDU-Net outperformed vanilla U-Net, attention U-Net (AttU-Net), and recurrent residual U-Net (R2U-Net) in all four tested segmentation tasks while using parameters around 40% of vanilla U-Net's, 17% of AttU-Net's, and 15% of R2U-Net's.
Abstract:Accurate segmentation of skin lesion from dermoscopic images is a crucial part of computer-aided diagnosis of melanoma. It is challenging due to the fact that dermoscopic images from different patients have non-negligible lesion variation, which causes difficulties in anatomical structure learning and consistent skin lesion delineation. In this paper, we propose a novel bi-directional dermoscopic feature learning (biDFL) framework to model the complex correlation between skin lesions and their informative context. By controlling feature information passing through two complementary directions, a substantially rich and discriminative feature representation is achieved. Specifically, we place biDFL module on the top of a CNN network to enhance high-level parsing performance. Furthermore, we propose a multi-scale consistent decision fusion (mCDF) that is capable of selectively focusing on the informative decisions generated from multiple classification layers. By analysis of the consistency of the decision at each position, mCDF automatically adjusts the reliability of decisions and thus allows a more insightful skin lesion delineation. The comprehensive experimental results show the effectiveness of the proposed method on skin lesion segmentation, achieving state-of-the-art performance consistently on two publicly available dermoscopic image databases.
Abstract:Ancient Chinese texts present an area of enormous challenge and opportunity for humanities scholars interested in exploiting computational methods to assist in the development of new insights and interpretations of culturally significant materials. In this paper we describe a collaborative effort between Indiana University and Xi'an Jiaotong University to support exploration and interpretation of a digital corpus of over 18,000 ancient Chinese documents, which we refer to as the "Handian" ancient classics corpus (H\`an di\u{a}n g\u{u} j\'i, i.e, the "Han canon" or "Chinese classics"). It contains classics of ancient Chinese philosophy, documents of historical and biographical significance, and literary works. We begin by describing the Digital Humanities context of this joint project, and the advances in humanities computing that made this project feasible. We describe the corpus and introduce our application of probabilistic topic modeling to this corpus, with attention to the particular challenges posed by modeling ancient Chinese documents. We give a specific example of how the software we have developed can be used to aid discovery and interpretation of themes in the corpus. We outline more advanced forms of computer-aided interpretation that are also made possible by the programming interface provided by our system, and the general implications of these methods for understanding the nature of meaning in these texts.