Abstract:Federated learning (FL) offers a privacy-preserving framework for distributed machine learning, enabling collaborative model training across diverse clients without centralizing sensitive data. However, statistical heterogeneity, characterized by non-independent and identically distributed (non-IID) client data, poses significant challenges, leading to model drift and poor generalization. This paper proposes a novel algorithm, pFedKD-WCL (Personalized Federated Knowledge Distillation with Weighted Combination Loss), which integrates knowledge distillation with bi-level optimization to address non-IID challenges. pFedKD-WCL leverages the current global model as a teacher to guide local models, optimizing both global convergence and local personalization efficiently. We evaluate pFedKD-WCL on the MNIST dataset and a synthetic dataset with non-IID partitioning, using multinomial logistic regression and multilayer perceptron models. Experimental results demonstrate that pFedKD-WCL outperforms state-of-the-art algorithms, including FedAvg, FedProx, Per-FedAvg, and pFedMe, in terms of accuracy and convergence speed.
Abstract:Machine Learning (ML) research has increased substantially in recent years, due to the success of predictive modeling across diverse application domains. However, well-known barriers exist when attempting to deploy ML models in high-stakes, clinical settings, including lack of model transparency (or the inability to audit the inference process), large training data requirements with siloed data sources, and complicated metrics for measuring model utility. In this work, we show empirically that including stronger baseline models in healthcare ML evaluations has important downstream effects that aid practitioners in addressing these challenges. Through a series of case studies, we find that the common practice of omitting baselines or comparing against a weak baseline model (e.g. a linear model with no optimization) obscures the value of ML methods proposed in the research literature. Using these insights, we propose some best practices that will enable practitioners to more effectively study and deploy ML models in clinical settings.