Abstract: Federated learning (FL), which enables distributed devices to cooperatively train a common machine learning (ML) model for intelligent inference without sharing raw data, has recently attracted increasing attention. However, the raw data held by the participating devices are typically non-independent and identically distributed (non-i.i.d.), which slows the convergence of the FL training process. To address this issue, we propose a new FL method that significantly mitigates statistical heterogeneity through a depersonalized mechanism. Specifically, we decouple the global and local objectives and optimize them by performing stochastic gradient descent alternately, which reduces the variance accumulated on the global model during the local update phases and thereby accelerates FL convergence. We then provide a detailed convergence analysis showing that the proposed method converges at a sublinear rate in the general non-convex setting. Finally, extensive experiments on public datasets verify the effectiveness of the proposed method.
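To make the alternating-update idea in the abstract concrete, the sketch below simulates one possible form of interleaving local and global SGD steps in an FL round. It is not the authors' exact algorithm: the synthetic least-squares data, the step sizes `ETA_LOCAL` and `ETA_GLOBAL`, the number of local steps, the proximal "depersonalization" term, and the FedAvg-style averaging are all assumptions made purely for illustration.

```python
# Illustrative sketch of alternating global/local SGD in federated learning.
# NOT the paper's algorithm: the objective split, step sizes, and aggregation
# rule below are assumptions used only to show how the two kinds of updates
# can be interleaved within each communication round.
import numpy as np

rng = np.random.default_rng(0)
DIM, CLIENTS, ROUNDS, LOCAL_STEPS = 5, 4, 50, 5
ETA_LOCAL, ETA_GLOBAL = 0.05, 0.05   # hypothetical step sizes

# Synthetic non-i.i.d. least-squares data: each client has its own optimum.
client_data = []
for c in range(CLIENTS):
    A = rng.normal(size=(20, DIM))
    x_star = rng.normal(loc=c, size=DIM)          # heterogeneous targets
    b = A @ x_star + 0.1 * rng.normal(size=20)
    client_data.append((A, b))

def local_grad(w, A, b):
    """Gradient of a client's personal least-squares objective."""
    return A.T @ (A @ w - b) / len(b)

w_global = np.zeros(DIM)
w_local = [np.zeros(DIM) for _ in range(CLIENTS)]

for r in range(ROUNDS):
    updates = []
    for c, (A, b) in enumerate(client_data):
        w_g = w_global.copy()
        for _ in range(LOCAL_STEPS):
            # (1) Local step: descend the client's personal objective.
            w_local[c] -= ETA_LOCAL * local_grad(w_local[c], A, b)
            # (2) Global step: descend a global surrogate, here the client's
            # gradient at the global iterate plus a proximal pull toward the
            # local model (an assumed form of "depersonalization").
            g = local_grad(w_g, A, b) + (w_g - w_local[c])
            w_g -= ETA_GLOBAL * g
        updates.append(w_g)
    # Server averages the clients' global-model iterates (FedAvg-style).
    w_global = np.mean(updates, axis=0)

print("final global model:", np.round(w_global, 3))
```

Under this (assumed) scheme, the global iterate is updated with gradients evaluated at the global point rather than at drifted local points, which is one way the variance accumulated during local updates could be kept off the aggregated model.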