Abstract:Currently, cortical surface registration techniques based on classical methods have been well developed. However, a key issue with classical methods is that for each pair of images to be registered, it is necessary to search for the optimal transformation in the deformation space according to a specific optimization algorithm until the similarity measure function converges, which cannot meet the requirements of real-time and high-precision in medical image registration. Researching cortical surface registration based on deep learning models has become a new direction. But so far, there are still only a few studies on cortical surface image registration based on deep learning. Moreover, although deep learning methods theoretically have stronger representation capabilities, surpassing the most advanced classical methods in registration accuracy and distortion control remains a challenge. Therefore, to address this challenge, this paper constructs a deep learning model to study the technology of cortical surface image registration. The specific work is as follows: (1) An unsupervised cortical surface registration network based on a multi-scale cascaded structure is designed, and a convolution method based on spherical harmonic transformation is introduced to register cortical surface data. This solves the problem of scale-inflexibility of spherical feature transformation and optimizes the multi-scale registration process. (2)By integrating the attention mechanism, a graph-enhenced module is introduced into the registration network, using the graph attention module to help the network learn global features of cortical surface data, enhancing the learning ability of the network. The results show that the graph attention module effectively enhances the network's ability to extract global features, and its registration results have significant advantages over other methods.
Abstract:Gland instance segmentation is an essential but challenging task in the diagnosis and treatment of adenocarcinoma. The existing models usually achieve gland instance segmentation through multi-task learning and boundary loss constraint. However, how to deal with the problems of gland adhesion and inaccurate boundary in segmenting the complex samples remains a challenge. In this work, we propose a displacement-field assisted graph energy transmitter (DFGET) framework to solve these problems. Specifically, a novel message passing manner based on anisotropic diffusion is developed to update the node features, which can distinguish the isomorphic graphs and improve the expressivity of graph nodes for complex samples. Using such graph framework, the gland semantic segmentation map and the displacement field (DF) of the graph nodes are estimated with two graph network branches. With the constraint of DF, a graph cluster module based on diffusion theory is presented to improve the intra-class feature consistency and inter-class feature discrepancy, as well as to separate the adherent glands from the semantic segmentation maps. Extensive comparison and ablation experiments on the GlaS dataset demonstrate the superiority of DFGET and effectiveness of the proposed anisotropic message passing manner and clustering method. Compared to the best comparative model, DFGET increases the object-Dice and object-F1 score by 2.5% and 3.4% respectively, while decreases the object-HD by 32.4%, achieving state-of-the-art performance.
Abstract:Multi-modal medical image fusion is essential for the precise clinical diagnosis and surgical navigation since it can merge the complementary information in multi-modalities into a single image. The quality of the fused image depends on the extracted single modality features as well as the fusion rules for multi-modal information. Existing deep learning-based fusion methods can fully exploit the semantic features of each modality, they cannot distinguish the effective low and high frequency information of each modality and fuse them adaptively. To address this issue, we propose AdaFuse, in which multimodal image information is fused adaptively through frequency-guided attention mechanism based on Fourier transform. Specifically, we propose the cross-attention fusion (CAF) block, which adaptively fuses features of two modalities in the spatial and frequency domains by exchanging key and query values, and then calculates the cross-attention scores between the spatial and frequency features to further guide the spatial-frequential information fusion. The CAF block enhances the high-frequency features of the different modalities so that the details in the fused images can be retained. Moreover, we design a novel loss function composed of structure loss and content loss to preserve both low and high frequency information. Extensive comparison experiments on several datasets demonstrate that the proposed method outperforms state-of-the-art methods in terms of both visual quality and quantitative metrics. The ablation experiments also validate the effectiveness of the proposed loss and fusion strategy.
Abstract:Automatic segmentation of breast tumors from the ultrasound images is essential for the subsequent clinical diagnosis and treatment plan. Although the existing deep learning-based methods have achieved significant progress in automatic segmentation of breast tumor, their performance on tumors with similar intensity to the normal tissues is still not pleasant, especially for the tumor boundaries. To address this issue, we propose a PBNet composed by a multilevel global perception module (MGPM) and a boundary guided module (BGM) to segment breast tumors from ultrasound images. Specifically, in MGPM, the long-range spatial dependence between the voxels in a single level feature maps are modeled, and then the multilevel semantic information is fused to promote the recognition ability of the model for non-enhanced tumors. In BGM, the tumor boundaries are extracted from the high-level semantic maps using the dilation and erosion effects of max pooling, such boundaries are then used to guide the fusion of low and high-level features. Moreover, to improve the segmentation performance for tumor boundaries, a multi-level boundary-enhanced segmentation (BS) loss is proposed. The extensive comparison experiments on both publicly available dataset and in-house dataset demonstrate that the proposed PBNet outperforms the state-of-the-art methods in terms of both qualitative visualization results and quantitative evaluation metrics, with the Dice score, Jaccard coefficient, Specificity and HD95 improved by 0.70%, 1.1%, 0.1% and 2.5% respectively. In addition, the ablation experiments validate that the proposed MGPM is indeed beneficial for distinguishing the non-enhanced tumors and the BGM as well as the BS loss are also helpful for refining the segmentation contours of the tumor.
Abstract:To promote the generalization ability of breast tumor segmentation models, as well as to improve the segmentation performance for breast tumors with smaller size, low-contrast amd irregular shape, we propose a progressive dual priori network (PDPNet) to segment breast tumors from dynamic enhanced magnetic resonance images (DCE-MRI) acquired at different sites. The PDPNet first cropped tumor regions with a coarse-segmentation based localization module, then the breast tumor mask was progressively refined by using the weak semantic priori and cross-scale correlation prior knowledge. To validate the effectiveness of PDPNet, we compared it with several state-of-the-art methods on multi-center datasets. The results showed that, comparing against the suboptimal method, the DSC, SEN, KAPPA and HD95 of PDPNet were improved 3.63\%, 8.19\%, 5.52\%, and 3.66\% respectively. In addition, through ablations, we demonstrated that the proposed localization module can decrease the influence of normal tissues and therefore improve the generalization ability of the model. The weak semantic priors allow focusing on tumor regions to avoid missing small tumors and low-contrast tumors. The cross-scale correlation priors are beneficial for promoting the shape-aware ability for irregual tumors. Thus integrating them in a unified framework improved the multi-center breast tumor segmentation performance.
Abstract:We consider a broadband over-the-air computation empowered model aggregation approach for wireless federated learning (FL) systems and propose to leverage an intelligent reflecting surface (IRS) to combat wireless fading and noise. We first investigate the conventional node-selection based framework, where a few edge nodes are dropped in model aggregation to control the aggregation error. We analyze the performance of this node-selection based framework and derive an upper bound on its performance loss, which is shown to be related to the selected edge nodes. Then, we seek to minimize the mean-squared error (MSE) between the desired global gradient parameters and the actually received ones by optimizing the selected edge nodes, their transmit equalization coefficients, the IRS phase shifts, and the receive factors of the cloud server. By resorting to the matrix lifting technique and difference-of-convex programming, we successfully transform the formulated optimization problem into a convex one and solve it using off-the-shelf solvers. To improve learning performance, we further propose a weight-selection based FL framework. In such a framework, we assign each edge node a proper weight coefficient in model aggregation instead of discarding any of them to reduce the aggregation error, i.e., amplitude alignment of the received local gradient parameters from different edge nodes is not required. We also analyze the performance of this weight-selection based framework and derive an upper bound on its performance loss, followed by minimizing the MSE via optimizing the weight coefficients of the edge nodes, their transmit equalization coefficients, the IRS phase shifts, and the receive factors of the cloud server. Furthermore, we use the MNIST dataset for simulations to evaluate the performance of both node-selection and weight-selection based FL frameworks.
Abstract:To investigate whether the pleurae, airways and vessels surrounding a nodule on non-contrast computed tomography (CT) can discriminate benign and malignant pulmonary nodules. The LIDC-IDRI dataset, one of the largest publicly available CT database, was exploited for study. A total of 1556 nodules from 694 patients were involved in statistical analysis, where nodules with average scorings <3 and >3 were respectively denoted as benign and malignant. Besides, 339 nodules from 113 patients with diagnosis ground-truth were independently evaluated. Computer algorithms were developed to segment pulmonary structures and quantify the distances to pleural surface, airways and vessels, as well as the counting number and normalized volume of airways and vessels near a nodule. Odds ratio (OR) and Chi-square (\chi^2) testing were performed to demonstrate the correlation between features of surrounding structures and nodule malignancy. A non-parametric receiver operating characteristic (ROC) analysis was conducted in logistic regression to evaluate discrimination ability of each structure. For benign and malignant groups, the average distances from nodules to pleural surface, airways and vessels are respectively (6.56, 5.19), (37.08, 26.43) and (1.42, 1.07) mm. The correlation between nodules and the counting number of airways and vessels that contact or project towards nodules are respectively (OR=22.96, \chi^2=105.04) and (OR=7.06, \chi^2=290.11). The correlation between nodules and the volume of airways and vessels are (OR=9.19, \chi^2=159.02) and (OR=2.29, \chi^2=55.89). The areas-under-curves (AUCs) for pleurae, airways and vessels are respectively 0.5202, 0.6943 and 0.6529. Our results show that malignant nodules are often surrounded by more pulmonary structures compared with benign ones, suggesting that features of these structures could be viewed as lung cancer biomarkers.
Abstract:Training convolutional neural networks (CNNs) for segmentation of pulmonary airway, artery, and vein is challenging due to sparse supervisory signals caused by the severe class imbalance between tubular targets and background. We present a CNNs-based method for accurate airway and artery-vein segmentation in non-contrast computed tomography. It enjoys superior sensitivity to tenuous peripheral bronchioles, arterioles, and venules. The method first uses a feature recalibration module to make the best use of features learned from the neural networks. Spatial information of features is properly integrated to retain relative priority of activated regions, which benefits the subsequent channel-wise recalibration. Then, attention distillation module is introduced to reinforce representation learning of tubular objects. Fine-grained details in high-resolution attention maps are passing down from one layer to its previous layer recursively to enrich context. Anatomy prior of lung context map and distance transform map is designed and incorporated for better artery-vein differentiation capacity. Extensive experiments demonstrated considerable performance gains brought by these components. Compared with state-of-the-art methods, our method extracted much more branches while maintaining competitive overall segmentation performance. Codes and models will be available later at http://www.pami.sjtu.edu.cn.
Abstract:Scalar coupling constant (SCC) plays a key role in the analysis of three-dimensional structure of organic matter, however, the traditional SCC prediction using quantum mechanical calculations is very time-consuming. To calculate SCC efficiently and accurately, we proposed a graph embedding local self-attention encoder (GELAE) model, in which, a novel invariant structure representation of the coupling system in terms of bond length, bond angle and dihedral angle was presented firstly, and then a local self-attention module embedded with the adjacent matrix of a graph was designed to extract effectively the features of coupling systems, finally, with a modified classification loss function, the SCC was predicted. To validate the superiority of the proposed method, we conducted a series of comparison experiments using different structure representations, different attention modules, and different losses. The experimental results demonstrate that, compared to the traditional chemical bond structure representations, the rotation and translation invariant structure representations proposed in this work can improve the SCC prediction accuracy; with the graph embedded local self-attention, the mean absolute error (MAE) of the prediction model in the validation set decreases from 0.1603 Hz to 0.1067 Hz; using the classification based loss function instead of the scaled regression loss, the MAE of the predicted SCC can be decreased to 0.0963 HZ, which is close to the quantum chemistry standard on CHAMPS dataset.
Abstract:A deep-learning based hybrid strategy for short-term load forecasting is presented. The strategy proposes a novel tree-based ensemble method Warm-start Gradient Tree Boosting (WGTB). Current strategies either ensemble submodels of a single type, which fail to take advantage of statistical strengths of different inference models. Or they simply sum the outputs from completely different inference models, which doesn't maximize the potential of ensemble. WGTB is thus proposed and tailored to the great disparity among different inference models in accuracy, volatility and linearity. The complete strategy integrates four different inference models (i.e., auto-regressive integrated moving average, nu support vector regression, extreme learning machine and long short-term memory neural network), both linear and nonlinear models. WGTB then ensembles their outputs by hybridizing linear estimator ElasticNet and nonlinear estimator ExtraTree via boosting algorithm. It is validated on the real historical data of a grid from State Grid Corporation of China of hourly resolution. The result demonstrates the effectiveness of the proposed strategy that hybridizes statistical strengths of both linear and nonlinear inference models.