Gustav A. Baumgart

Not All Federated Learning Algorithms Are Created Equal: A Performance Evaluation Study

Mar 26, 2024

Gustav A. Baumgart, Jaemin Shin, Ali Payani, Myungjin Lee, Ramana Rao Kompella

Abstract: Federated Learning (FL) emerged as a practical approach to training a model from decentralized data. The proliferation of FL led to the development of numerous FL algorithms and mechanisms. Many prior efforts have focused primarily on the accuracy of those approaches, but there is little understanding of other aspects such as computational overheads, performance, and training stability. To bridge this gap, we conduct an extensive performance evaluation of several canonical FL algorithms (FedAvg, FedProx, FedYogi, FedAdam, SCAFFOLD, and FedDyn) by leveraging an open-source federated learning framework called Flame. Our comprehensive measurement study reveals that no single algorithm works best across different performance metrics. A few key observations are: (1) While some state-of-the-art algorithms achieve higher accuracy than others, they incur either higher computation overheads (FedDyn) or higher communication overheads (SCAFFOLD). (2) Recent algorithms show a smaller standard deviation in accuracy across clients than FedAvg, indicating that the advanced algorithms' performance is stable. (3) However, algorithms such as FedDyn and SCAFFOLD are more prone to catastrophic failures without the support of additional techniques such as gradient clipping. We hope that our empirical study can help the community build best practices for evaluating FL algorithms.
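
To make the quantities named in the abstract concrete, the following is a minimal sketch in plain Python of three ideas it mentions: the FedAvg aggregation step (a data-size-weighted average of client parameters), the standard deviation of accuracy across clients, and norm-based gradient clipping. It is not the paper's code and does not use the Flame API; the function names `fedavg`, `accuracy_spread`, and `clip_by_norm` and all input values are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def fedavg(client_weights, client_sizes):
    """FedAvg aggregation: average client parameter vectors,
    weighting each client by its number of local samples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def accuracy_spread(client_accuracies):
    """Mean and population standard deviation of per-client accuracy.
    A smaller deviation means more uniform performance across clients."""
    return mean(client_accuracies), pstdev(client_accuracies)


def clip_by_norm(update, max_norm):
    """Scale an update vector down so its L2 norm is at most max_norm."""
    norm = sum(x * x for x in update) ** 0.5
    if norm == 0 or norm <= max_norm:
        return list(update)
    scale = max_norm / norm
    return [x * scale for x in update]


if __name__ == "__main__":
    # Hypothetical values: two clients with 1 and 3 local samples.
    print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
    print(accuracy_spread([0.8, 0.9, 1.0]))
    print(clip_by_norm([3.0, 4.0], 1.0))
```

The other algorithms in the study (FedProx, FedYogi, FedAdam, SCAFFOLD, FedDyn) change the local objective, the server optimizer, or the state exchanged between client and server; those variations are not shown here.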