Abstract:Directly predicting human epidermal growth factor receptor 2 (HER2) status from widely available hematoxylin and eosin (HE)-stained whole slide images (WSIs) can reduce technical costs and expedite treatment selection. Accurately predicting HER2 requires large collections of multi-site WSIs. Federated learning enables collaborative training of these WSIs without gigabyte-size WSIs transportation and data privacy concerns. However, federated learning encounters challenges in addressing label imbalance in multi-site WSIs from the real world. Moreover, existing WSI classification methods cannot simultaneously exploit local context information and long-range dependencies in the site-end feature representation of federated learning. To address these issues, we present a point transformer with federated learning for multi-site HER2 status prediction from HE-stained WSIs. Our approach incorporates two novel designs. We propose a dynamic label distribution strategy and an auxiliary classifier, which helps to establish a well-initialized model and mitigate label distribution variations across sites. Additionally, we propose a farthest cosine sampling based on cosine distance. It can sample the most distinctive features and capture the long-range dependencies. Extensive experiments and analysis show that our method achieves state-of-the-art performance at four sites with a total of 2687 WSIs. Furthermore, we demonstrate that our model can generalize to two unseen sites with 229 WSIs.
Abstract:Eye-tracking research has proven valuable in understanding numerous cognitive functions. Recently, Frey et al. provided an exciting deep learning method for learning eye movements from fMRI data. However, it needed to co-register fMRI into standard space to obtain eyeballs masks, and thus required additional templates and was time consuming. To resolve this issue, in this paper, we propose a framework named MRGazer for predicting eye gaze points from fMRI in individual space. The MRGazer consisted of eyeballs extraction module and a residual network-based eye gaze prediction. Compared to the previous method, the proposed framework skips the fMRI co-registration step, simplifies the processing protocol and achieves end-to-end eye gaze regression. The proposed method achieved superior performance in a variety of eye movement tasks than the co-registration-based method, and delivered objective results within a shorter time (~ 0.02 Seconds for each volume) than prior method (~0.3 Seconds for each volume).
Abstract:Decoding brain cognitive states from neuroimaging signals is an important topic in neuroscience. In recent years, deep neural networks (DNNs) have been recruited for multiple brain state decoding and achieved good performance. However, the open question of how to interpret the DNN black box remains unanswered. Capitalizing on advances in machine learning, we integrated attention modules into brain decoders to facilitate an in-depth interpretation of DNN channels. A 4D convolution operation was also included to extract temporo-spatial interaction within the fMRI signal. The experiments showed that the proposed model obtains a very high accuracy (97.4%) and outperforms previous researches on the 7 different task benchmarks from the Human Connectome Project (HCP) dataset. The visualization analysis further illustrated the hierarchical emergence of task-specific masks with depth. Finally, the model was retrained to regress individual traits within the HCP and to classify viewing images from the BOLD5000 dataset, respectively. Transfer learning also achieves good performance. A further visualization analysis shows that, after transfer learning, low-level attention masks remained similar to the source domain, whereas high-level attention masks changed adaptively. In conclusion, the proposed 4D model with attention module performed well and facilitated interpretation of DNNs, which is helpful for subsequent research.
Abstract:Relative location prediction in computed tomography (CT) scan images is a challenging problem. In this paper, a regression model based on one-dimensional convolutional neural networks is proposed to determine the relative location of a CT scan image both robustly and precisely. A public dataset is employed to validate the performance of the study's proposed method using a 5-fold cross validation. Experimental results demonstrate an excellent performance of the proposed model when compared with the state-of-the-art techniques, achieving a median absolute error of 1.04 cm and mean absolute error of 1.69 cm.