Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Nikita Fedin

Communication Compression for Byzantine Robust Learning: New Efficient Algorithms and Improved Rates

Oct 15, 2023

Ahmad Rammal, Kaja Gruntkowska, Nikita Fedin, Eduard Gorbunov, Peter Richtárik

Figure 1 for Communication Compression for Byzantine Robust Learning: New Efficient Algorithms and Improved Rates

Figure 2 for Communication Compression for Byzantine Robust Learning: New Efficient Algorithms and Improved Rates

Figure 3 for Communication Compression for Byzantine Robust Learning: New Efficient Algorithms and Improved Rates

Figure 4 for Communication Compression for Byzantine Robust Learning: New Efficient Algorithms and Improved Rates

Abstract:Byzantine robustness is an essential feature of algorithms for certain distributed optimization problems, typically encountered in collaborative/federated learning. These problems are usually huge-scale, implying that communication compression is also imperative for their resolution. These factors have spurred recent algorithmic and theoretical developments in the literature of Byzantine-robust learning with compression. In this paper, we contribute to this research area in two main directions. First, we propose a new Byzantine-robust method with compression -- Byz-DASHA-PAGE -- and prove that the new method has better convergence rate (for non-convex and Polyak-Lojasiewicz smooth optimization problems), smaller neighborhood size in the heterogeneous case, and tolerates more Byzantine workers under over-parametrization than the previous method with SOTA theoretical convergence guarantees (Byz-VR-MARINA). Secondly, we develop the first Byzantine-robust method with communication compression and error feedback -- Byz-EF21 -- along with its bidirectional compression version -- Byz-EF21-BC -- and derive the convergence rates for these methods for non-convex and Polyak-Lojasiewicz smooth case. We test the proposed methods and illustrate our theoretical findings in the numerical experiments.

* 44 pages, 8 figures

Via

Access Paper or Ask Questions

Byzantine-Robust Loopless Stochastic Variance-Reduced Gradient

Mar 08, 2023

Nikita Fedin, Eduard Gorbunov

Abstract:Distributed optimization with open collaboration is a popular field since it provides an opportunity for small groups/companies/universities, and individuals to jointly solve huge-scale problems. However, standard optimization algorithms are fragile in such settings due to the possible presence of so-called Byzantine workers -- participants that can send (intentionally or not) incorrect information instead of the one prescribed by the protocol (e.g., send anti-gradient instead of stochastic gradients). Thus, the problem of designing distributed methods with provable robustness to Byzantine workers has been receiving a lot of attention recently. In particular, several works consider a very promising way to achieve Byzantine tolerance via exploiting variance reduction and robust aggregation. The existing approaches use SAGA- and SARAH-type variance-reduced estimators, while another popular estimator -- SVRG -- is not studied in the context of Byzantine-robustness. In this work, we close this gap in the literature and propose a new method -- Byzantine-Robust Loopless Stochastic Variance Reduced Gradient (BR-LSVRG). We derive non-asymptotic convergence guarantees for the new method in the strongly convex case and compare its performance with existing approaches in numerical experiments.

* 15 pages, 2 figures. Code: https://github.com/Nikosimus/BR-LSVRG

Via

Access Paper or Ask Questions