Abstract:Wireless federated learning (FL) relies on efficient uplink communications to aggregate model updates across distributed edge devices. Over-the-air computation (a.k.a. AirComp) has emerged as a promising approach for addressing the scalability challenge of FL over wireless links with limited communication resources. Unlike conventional methods, AirComp allows multiple edge devices to transmit uplink signals simultaneously, enabling the parameter server to directly decode the average global model. However, existing AirComp solutions are intrinsically analog, while modern wireless systems predominantly adopt digital modulations. Consequently, careful constellation designs are necessary to accurately decode the sum model updates without ambiguity. In this paper, we propose an end-to-end communication system supporting AirComp with digital modulation, aiming to overcome the challenges associated with accurate decoding of the sum signal with constellation designs. We leverage autoencoder network structures and explore the joint optimization of transmitter and receiver components. Our approach fills an important gap in the context of accurately decoding the sum signal in digital modulation-based AirComp, which can advance the deployment of FL in contemporary wireless systems.
Abstract:We propose a novel privacy-preserving uplink over-the-air computation (AirComp) method, termed FLORAS, for single-input single-output (SISO) wireless federated learning (FL) systems. From the communication design perspective, FLORAS eliminates the requirement of channel state information at the transmitters (CSIT) by leveraging the properties of orthogonal sequences. From the privacy perspective, we prove that FLORAS can offer both item-level and client-level differential privacy (DP) guarantees. Moreover, by adjusting the system parameters, FLORAS can flexibly achieve different DP levels at no additional cost. A novel FL convergence bound is derived which, combined with the privacy guarantees, allows for a smooth tradeoff between convergence rate and differential privacy levels. Numerical results demonstrate the advantages of FLORAS compared with the baseline AirComp method, and validate that our analytical results can guide the design of privacy-preserving FL with different tradeoff requirements on the model convergence and privacy levels.
Abstract:We propose a novel communication design, termed random orthogonalization, for federated learning (FL) in a massive multiple-input and multiple-output (MIMO) wireless system. The key novelty of random orthogonalization comes from the tight coupling of FL and two unique characteristics of massive MIMO -- channel hardening and favorable propagation. As a result, random orthogonalization can achieve natural over-the-air model aggregation without requiring transmitter side channel state information (CSI) for the uplink phase of FL, while significantly reducing the channel estimation overhead at the receiver. We extend this principle to the downlink communication phase and develop a simple but highly effective model broadcast method for FL. We also relax the massive MIMO assumption by proposing an enhanced random orthogonalization design for both uplink and downlink FL communications, that does not rely on channel hardening or favorable propagation. Theoretical analyses with respect to both communication and machine learning performance are carried out. In particular, an explicit relationship among the convergence rate, the number of clients, and the number of antennas is established. Experimental results validate the effectiveness and efficiency of random orthogonalization for FL in massive MIMO.
Abstract:Does Federated Learning (FL) work when both uplink and downlink communications have errors? How much communication noise can FL handle and what is its impact to the learning performance? This work is devoted to answering these practically important questions by explicitly incorporating both uplink and downlink noisy channels in the FL pipeline. We present several novel convergence analyses of FL over simultaneous uplink and downlink noisy communication channels, which encompass full and partial clients participation, direct model and model differential transmissions, and non-independent and identically distributed (IID) local datasets. These analyses characterize the sufficient conditions for FL over noisy channels to have the same convergence behavior as the ideal case of no communication error. More specifically, in order to maintain the O(1/T) convergence rate of FedAvg with perfect communications, the uplink and downlink signal-to-noise ratio (SNR) for direct model transmissions should be controlled such that they scale as O(t^2) where t is the index of communication rounds, but can stay constant for model differential transmissions. The key insight of these theoretical results is a "flying under the radar" principle - stochastic gradient descent (SGD) is an inherent noisy process and uplink/downlink communication noises can be tolerated as long as they do not dominate the time-varying SGD noise. We exemplify these theoretical findings with two widely adopted communication techniques - transmit power control and diversity combining - and further validating their performance advantages over the standard methods via extensive numerical experiments using several real-world FL tasks.