Abstract:Distributed training can facilitate the processing of large medical image datasets, and improve the accuracy and efficiency of disease diagnosis while protecting patient privacy, which is crucial for achieving efficient medical image analysis and accelerating medical research progress. This paper presents an innovative approach to medical image classification, leveraging Federated Learning (FL) to address the dual challenges of data privacy and efficient disease diagnosis. Traditional Centralized Machine Learning models, despite their widespread use in medical imaging for tasks such as disease diagnosis, raise significant privacy concerns due to the sensitive nature of patient data. As an alternative, FL emerges as a promising solution by allowing the training of a collective global model across local clients without centralizing the data, thus preserving privacy. Focusing on the application of FL in Magnetic Resonance Imaging (MRI) brain tumor detection, this study demonstrates the effectiveness of the Federated Learning framework coupled with EfficientNet-B0 and the FedAvg algorithm in enhancing both privacy and diagnostic accuracy. Through a meticulous selection of preprocessing methods, algorithms, and hyperparameters, and a comparative analysis of various Convolutional Neural Network (CNN) architectures, the research uncovers optimal strategies for image classification. The experimental results reveal that EfficientNet-B0 outperforms other models like ResNet in handling data heterogeneity and achieving higher accuracy and lower loss, highlighting the potential of FL in overcoming the limitations of traditional models. The study underscores the significance of addressing data heterogeneity and proposes further research directions for broadening the applicability of FL in medical image analysis.
Abstract:The fine-grained action analysis of the existing action datasets is challenged by insufficient action categories, low fine granularities, limited modalities, and tasks. In this paper, we propose a Multi-modality and Multi-task dataset of Figure Skating (MMFS) which was collected from the World Figure Skating Championships. MMFS, which possesses action recognition and action quality assessment, captures RGB, skeleton, and is collected the score of actions from 11671 clips with 256 categories including spatial and temporal labels. The key contributions of our dataset fall into three aspects as follows. (1) Independently spatial and temporal categories are first proposed to further explore fine-grained action recognition and quality assessment. (2) MMFS first introduces the skeleton modality for complex fine-grained action quality assessment. (3) Our multi-modality and multi-task dataset encourage more action analysis models. To benchmark our dataset, we adopt RGB-based and skeleton-based baseline methods for action recognition and action quality assessment.
Abstract:To leverage deep learning for image aesthetics assessment, one critical but unsolved issue is how to seamlessly incorporate the information of image aspect ratios to learn more robust models. In this paper, an adaptive fractional dilated convolution (AFDC), which is aspect-ratio-embedded, composition-preserving and parameter-free, is developed to tackle this issue natively in convolutional kernel level. Specifically, the fractional dilated kernel is adaptively constructed according to the image aspect ratios, where the interpolation of nearest two integers dilated kernels is used to cope with the misalignment of fractional sampling. Moreover, we provide a concise formulation for mini-batch training and utilize a grouping strategy to reduce computational overhead. As a result, it can be easily implemented by common deep learning libraries and plugged into popular CNN architectures in a computation-efficient manner. Our experimental results demonstrate that our proposed method achieves state-of-the-art performance on image aesthetics assessment over the AVA dataset.
Abstract:Recommendation algorithms are widely adopted in marketplaces to help users find the items they are looking for. The sparsity of the items by user matrix and the cold-start issue in marketplaces pose challenges for the off-the-shelf matrix factorization based recommender systems. To understand user intent and tailor recommendations to their needs, we use deep learning to explore various heterogeneous data available in marketplaces. This paper summarizes five lessons we learned from experimenting with state-of-the-art deep learning recommenders at the leading Norwegian marketplace FINN.no. We design a hybrid recommender system that takes the user-generated contents of a marketplace (including text, images and meta attributes) and combines them with user behavior data such as page views and messages to provide recommendations for marketplace items. Among various tactics we experimented with, the following five show the best impact: staged training instead of end-to-end training, leveraging rich user behaviors beyond page views, using user behaviors as noisy labels to train embeddings, using transfer learning to solve the unbalanced data problem, and using attention mechanisms in the hybrid model. This system is currently running with around 20% click-through-rate in production at FINN.no and serves over one million visitors everyday.
Abstract:Recommendations are broadly used in marketplaces to match users with items relevant to their interests and needs. To understand user intent and tailor recommendations to their needs, we use deep learning to explore various heterogeneous data available in marketplaces. This paper focuses on the challenge of measuring recommender performance and summarizes the online experiment results with several promising types of deep neural network recommenders - hybrid item representation models combining features from user engagement and content, sequence-based models, and multi-armed bandit models that optimize user engagement by re-ranking proposals from multiple submodels. The recommenders are currently running in production at the leading Norwegian marketplace FINN.no and serves over one million visitors everyday.
Abstract:As renewable distributed energy resources (DERs) penetrate the power grid at an accelerating speed, it is essential for operators to have accurate solar photovoltaic (PV) energy forecasting for efficient operations and planning. Generally, observed weather data are applied in the solar PV generation forecasting model while in practice the energy forecasting is based on forecasted weather data. In this paper, a study on the uncertainty in weather forecasting for the most commonly used weather variables is presented. The forecasted weather data for six days ahead is compared with the observed data and the results of analysis are quantified by statistical metrics. In addition, the most influential weather predictors in energy forecasting model are selected. The performance of historical and observed weather data errors is assessed using a solar PV generation forecasting model. Finally, a sensitivity test is performed to identify the influential weather variables whose accurate values can significantly improve the results of energy forecasting.
Abstract:Nowadays, with the unprecedented penetration of renewable distributed energy resources (DERs), the necessity of an efficient energy forecasting model is more demanding than before. Generally, forecasting models are trained using observed weather data while the trained models are applied for energy forecasting using forecasted weather data. In this study, the performance of several commonly used forecasting methods in the presence of weather predictors with uncertainty is assessed and compared. Accordingly, both observed and forecasted weather data are collected, then the influential predictors for solar PV generation forecasting model are selected using several measures. Using observed and forecasted weather data, an analysis on the uncertainty of weather variables is represented by MAE and bootstrapping. The energy forecasting model is trained using observed weather data, and finally, the performance of several commonly used forecasting methods in solar energy forecasting is simulated and compared for a real case study.
Abstract:In this paper, a level-wise mixture model (LMM) is developed by embedding visual hierarchy with deep networks to support large-scale visual recognition (i.e., recognizing thousands or even tens of thousands of object classes), and a Bayesian approach is used to adapt a pre-trained visual hierarchy automatically to the improvements of deep features (that are used for image and object class representation) when more representative deep networks are learned along the time. Our LMM model can provide an end-to-end approach for jointly learning: (a) the deep networks to extract more discriminative deep features for image and object class representation; (b) the tree classifier for recognizing large numbers of object classes hierarchically; and (c) the visual hierarchy adaptation for achieving more accurate indexing of large numbers of object classes hierarchically. By supporting joint learning of the tree classifier, the deep networks and the visual hierarchy adaptation, our LMM algorithm can provide an effective approach for controlling inter-level error propagation effectively, thus it can achieve better accuracy rates on large-scale visual recognition. Our experiments are carried on ImageNet1K and ImageNet10K image sets, and our LMM algorithm can achieve very competitive results on both the accuracy rates and the computation efficiency as compared with the baseline methods.