Computer Vision and Robotics Institute, University of Girona, Spain
Abstract:The correct interpretation of breast density is important in the assessment of breast cancer risk. AI has been shown capable of accurately predicting breast density, however, due to the differences in imaging characteristics across mammography systems, models built using data from one system do not generalize well to other systems. Though federated learning (FL) has emerged as a way to improve the generalizability of AI without the need to share data, the best way to preserve features from all training data during FL is an active area of research. To explore FL methodology, the breast density classification FL challenge was hosted in partnership with the American College of Radiology, Harvard Medical School's Mass General Brigham, University of Colorado, NVIDIA, and the National Institutes of Health National Cancer Institute. Challenge participants were able to submit docker containers capable of implementing FL on three simulated medical facilities, each containing a unique large mammography dataset. The breast density FL challenge ran from June 15 to September 5, 2022, attracting seven finalists from around the world. The winning FL submission reached a linear kappa score of 0.653 on the challenge test data and 0.413 on an external testing dataset, scoring comparably to a model trained on the same data in a central location.
Abstract:Medical image synthesis has gained a great focus recently, especially after the introduction of Generative Adversarial Networks (GANs). GANs have been used widely to provide anatomically-plausible and diverse samples for augmentation and other applications, including segmentation and super resolution. In our previous work, Deep Convolutional GANs were used to generate synthetic mammogram lesions, masses mainly, that could enhance the classification performance in imbalanced datasets. In this new work, a deeper investigation was carried out to explore other aspects of the generated images evaluation, i.e., realism, feature space distribution, and observers studies. t-Stochastic Neighbor Embedding (t-SNE) was used to reduce the dimensionality of real and fake images to enable 2D visualisations. Additionally, two expert radiologists performed a realism-evaluation study. Visualisations showed that the generated images have a similar feature distribution of the real ones, avoiding outliers. Moreover, Receiver Operating Characteristic (ROC) curve showed that the radiologists could not, in many cases, distinguish between synthetic and real lesions, giving 48% and 61% accuracies in a balanced sample set.
Abstract:Early detection of breast cancer has a major contribution to curability, and using mammographic images, this can be achieved non-invasively. Supervised deep learning, the dominant CADe tool currently, has played a great role in object detection in computer vision, but it suffers from a limiting property: the need of a large amount of labelled data. This becomes stricter when it comes to medical datasets which require high-cost and time-consuming annotations. Furthermore, medical datasets are usually imbalanced, a condition that often hinders classifiers performance. The aim of this paper is to learn the distribution of the minority class to synthesise new samples in order to improve lesion detection in mammography. Deep Convolutional Generative Adversarial Networks (DCGANs) can efficiently generate breast masses. They are trained on increasing-size subsets of one mammographic dataset and used to generate diverse and realistic breast masses. The effect of including the generated images and/or applying horizontal and vertical flipping is tested in an environment where a 1:10 imbalanced dataset of masses and normal tissue patches is classified by a fully-convolutional network. A maximum of ~ 0:09 improvement of F1 score is reported by using DCGANs along with flipping augmentation over using the original images. We show that DCGANs can be used for synthesising photo-realistic breast mass patches with considerable diversity. It is demonstrated that appending synthetic images in this environment, along with flipping, outperforms the traditional augmentation method of flipping solely, offering faster improvements as a function of the training set size.