Abstract: We investigate how to select the number of communities for weighted networks without full likelihood modeling. First, we propose a novel weighted degree-corrected stochastic block model (DCSBM), in which the mean adjacency matrix is modeled as in the standard DCSBM, while the variance profile matrix is assumed to be related to the mean adjacency matrix through a given variance function. Our method for selecting the number of communities is based on a sequential testing framework; in each step, the weighted DCSBM is fitted via a spectral clustering method. A key step is to carry out matrix scaling on the estimated variance profile matrix. The resulting scaling factors can be used to normalize the adjacency matrix, from which the test statistic is obtained. Under mild conditions on the weighted DCSBM, our proposed procedure is shown to be consistent in estimating the true number of communities. Numerical experiments on both simulated and real network data also demonstrate the desirable empirical properties of our method.
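The matrix scaling step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not specify the scaling algorithm or the exact normalization, so everything here is an assumption: the function names `symmetric_matrix_scaling` and `normalize_residual` are hypothetical, the iteration is a damped Sinkhorn-type scheme for symmetric positive matrices, and the normalization rescales the centered adjacency matrix so that its variance profile has unit row sums.

```python
import numpy as np

def symmetric_matrix_scaling(V, max_iter=1000, tol=1e-12):
    """Find d > 0 such that diag(d) @ V @ diag(d) has unit row sums.

    Damped Sinkhorn-type iteration (an illustrative choice, not
    necessarily the scheme used by the authors). V is assumed to be
    symmetric with strictly positive entries.
    """
    V = np.asarray(V, dtype=float)
    d = np.ones(V.shape[0])
    for _ in range(max_iter):
        # Fixed point satisfies d_i * (V d)_i = 1 for every i.
        d_new = np.sqrt(d / (V @ d))
        converged = np.max(np.abs(d_new - d)) < tol
        d = d_new
        if converged:
            break
    return d

def normalize_residual(A, M, d):
    """Rescale the centered adjacency matrix entrywise by sqrt(d_i * d_j).

    If Var(A_ij) = V_ij, the rescaled entries have variance d_i * V_ij * d_j,
    whose row sums equal 1 after scaling (assumed normalization).
    """
    s = np.sqrt(d)
    return s[:, None] * (A - M) * s[None, :]

# Toy variance profile: symmetric with positive entries.
rng = np.random.default_rng(0)
B = rng.uniform(0.5, 1.5, size=(6, 6))
V = (B + B.T) / 2
d = symmetric_matrix_scaling(V)
scaled = d[:, None] * V * d[None, :]
row_sums = scaled.sum(axis=1)
```

In this sketch, a test statistic could then be computed from the normalized matrix returned by `normalize_residual`, for example from its extreme eigenvalues; the specific statistic used in the paper is not stated in the abstract.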
Abstract: Relationship extraction and named entity recognition have traditionally been treated as two distinct tasks that require different input data, labels, and models. However, both are essential for structured sentiment analysis. We believe that both tasks can be combined into a single stacked model with the same input data. We performed different experiments to find the best model to extract multiple opinion tuples from a single sentence. The opinion tuples consist of holders, targets, and expressions. With the opinion tuples, we will be able to extract the relationships we need.
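To make the notion of an opinion tuple concrete, the following is a small illustrative sketch of how the holder, target, and expression of each tuple might be represented. The class name `OpinionTuple`, the helper `spans_in_sentence`, the optional holder field, and the example sentence are all assumptions made for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class OpinionTuple:
    """One opinion in a sentence (hypothetical representation)."""
    holder: Optional[str]  # who holds the opinion; assumed optional here
    target: str            # what the opinion is about
    expression: str        # the words that express the opinion

def spans_in_sentence(sentence: str, tuples: List[OpinionTuple]) -> bool:
    """Check that every extracted span literally occurs in the sentence."""
    for t in tuples:
        spans = [t.target, t.expression]
        if t.holder is not None:
            spans.append(t.holder)
        if not all(span in sentence for span in spans):
            return False
    return True

# A single sentence can carry several opinion tuples.
sentence = "Anna loves the camera but hates the battery."
tuples = [
    OpinionTuple(holder="Anna", target="the camera", expression="loves"),
    OpinionTuple(holder="Anna", target="the battery", expression="hates"),
]
all_spans_found = spans_in_sentence(sentence, tuples)
```

Each tuple links a holder and a target through an expression, which is the relationship the abstract refers to.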
Abstract: Creating large-scale, high-quality annotations is a known challenge in medical imaging. In this work, based on the CycleGAN algorithm, we propose leveraging annotations from one modality for use in other modalities. More specifically, the proposed algorithm creates highly realistic synthetic CT images (SynCT) from prostate MR images using unpaired data sets. By using SynCT images (without segmentation labels) and MR images (with segmentation labels available), we have trained a deep segmentation network for precise delineation of the prostate from real CT scans. For the generator in our CycleGAN, the cycle consistency term is used to guarantee that SynCT shares the same manually drawn, high-quality masks originally delineated on MR images. Further, we introduce a cost function based on the structural similarity index (SSIM) to improve the anatomical similarity between real and synthetic images. For the segmentation step that follows SynCT generation by CycleGAN, automatic delineation is achieved through a 2.5D Residual U-Net. Quantitative evaluation demonstrates comparable segmentation results between our SynCT and radiologist-drawn masks for real CT images, solving an important problem in the field of medical image segmentation when ground-truth annotations are not available for the modality of interest.
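The SSIM-based cost function mentioned in the abstract above can be sketched as follows. The abstract does not give the exact form of the loss, so this is an assumption: the sketch computes SSIM from global image statistics (standard SSIM implementations use local sliding windows) and defines the loss as one minus SSIM. The names `global_ssim` and `ssim_loss` are hypothetical, and the constants follow the commonly used defaults K1 = 0.01 and K2 = 0.03.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """SSIM computed from whole-image statistics (simplified, single window)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    c1 = (0.01 * data_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizes the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

def ssim_loss(real, synthetic, data_range=1.0):
    """Loss that decreases as the two images become structurally similar."""
    return 1.0 - global_ssim(real, synthetic, data_range)

# Toy images with intensities in [0, 1].
rng = np.random.default_rng(0)
real = rng.uniform(0.0, 1.0, size=(32, 32))
noisy = np.clip(real + rng.normal(0.0, 0.2, size=real.shape), 0.0, 1.0)
loss_same = ssim_loss(real, real)
loss_noisy = ssim_loss(real, noisy)
```

In a CycleGAN training loop, a term of this kind would typically be added to the adversarial and cycle consistency losses with a weighting factor; the weighting and the windowing used by the authors are not stated in the abstract.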