Abstract: Wireless federated learning (FL) relies on efficient uplink communications to aggregate model updates across distributed edge devices. Over-the-air computation (a.k.a. AirComp) has emerged as a promising approach for addressing the scalability challenge of FL over wireless links with limited communication resources. Unlike conventional methods, AirComp allows multiple edge devices to transmit uplink signals simultaneously, enabling the parameter server to directly decode the average global model. However, existing AirComp solutions are intrinsically analog, while modern wireless systems predominantly adopt digital modulation schemes. Consequently, careful constellation designs are necessary to decode the sum of the model updates accurately and without ambiguity. In this paper, we propose an end-to-end communication system supporting AirComp with digital modulation, aiming to overcome the challenge of accurately decoding the sum signal through constellation design. We leverage autoencoder network structures and explore the joint optimization of the transmitter and receiver components. Our approach fills an important gap in accurately decoding the sum signal in digital modulation-based AirComp, which can advance the deployment of FL in contemporary wireless systems.
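To make the autoencoder-based idea concrete, below is a minimal, hypothetical sketch (module names such as AirCompEncoder and SumDecoder, the constellation size, and the toy training loop are illustrative assumptions, not the paper's implementation): each device maps a quantized update to a learnable digital constellation point, the channel superimposes the transmitted symbols with noise, and the receiver is trained end-to-end to decode the sum directly.

```python
# Hypothetical sketch of an autoencoder for digital-modulation AirComp.
# K devices transmit simultaneously; the receiver decodes the SUM of their
# quantized updates from the noisy superposition of constellation symbols.
import torch
import torch.nn as nn

K = 4            # number of edge devices transmitting simultaneously (assumption)
M = 16           # constellation size, i.e. quantization levels (assumption)
SNR_DB = 10.0    # toy channel SNR (assumption)

class AirCompEncoder(nn.Module):
    """Maps a quantized model-update index to a learnable 2-D (I/Q) constellation point."""
    def __init__(self, m):
        super().__init__()
        self.embed = nn.Embedding(m, 2)

    def forward(self, idx):
        x = self.embed(idx)
        power = x.pow(2).sum(-1).mean()            # average symbol power
        return x / power.sqrt().clamp_min(1e-8)    # normalize average power to 1

class SumDecoder(nn.Module):
    """Decodes the noisy superposition of K symbols into the sum of the indices."""
    def __init__(self, k, m):
        super().__init__()
        n_out = k * (m - 1) + 1                    # number of possible sum values
        self.net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, n_out))

    def forward(self, y):
        return self.net(y)

enc, dec = AirCompEncoder(M), SumDecoder(K, M)
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
noise_std = 10 ** (-SNR_DB / 20)

for step in range(200):                            # toy joint training loop
    idx = torch.randint(0, M, (256, K))            # each device's quantized update
    tx = enc(idx)                                  # (256, K, 2) constellation symbols
    rx = tx.sum(dim=1) + noise_std * torch.randn(256, 2)   # over-the-air superposition
    logits = dec(rx)
    loss = nn.functional.cross_entropy(logits, idx.sum(dim=1))
    opt.zero_grad(); loss.backward(); opt.step()
```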
Abstract: Federated learning (FL) is a popular distributed machine learning (ML) paradigm, but it is often limited by significant communication costs and the limited computation capabilities of edge devices. Federated Split Learning (FSL) preserves the parallel model training principle of FL while reducing the device computation requirement by splitting the ML model between the server and the clients. However, FSL still incurs very high communication overhead due to transmitting the smashed data and gradients between the clients and the server in each global round. Furthermore, the server has to maintain a separate model for every client, resulting in computation and storage requirements that grow linearly with the number of clients. This paper addresses these two issues by proposing a communication and storage efficient federated and split learning (CSE-FSL) strategy, which utilizes an auxiliary network to locally update the client models while keeping only a single model at the server, hence avoiding the communication of gradients from the server and greatly reducing the server resource requirement. Communication cost is further reduced by having the clients send the smashed data only in selected epochs. We provide a rigorous theoretical analysis of CSE-FSL that guarantees its convergence for non-convex loss functions. Extensive experimental results on several real-world FL tasks demonstrate that CSE-FSL achieves a significant communication reduction over existing FSL techniques while attaining state-of-the-art convergence and model accuracy.
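As a rough illustration of the mechanism described above, here is a minimal, hypothetical sketch (all module and variable names, the layer sizes, and the SEND_EVERY schedule are illustrative assumptions, not the CSE-FSL implementation): the client holds the front part of the model plus a small auxiliary head that yields a local loss, so client updates need no gradients from the server; the server keeps a single back-end model and receives smashed data only in selected epochs.

```python
# Hypothetical sketch: client-side auxiliary head for local updates,
# a single server-side model, and smashed data sent only in selected epochs.
import torch
import torch.nn as nn

class ClientNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.front = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU())
        self.aux_head = nn.Linear(128, 10)     # auxiliary network producing a local loss

    def forward(self, x):
        smashed = self.front(x)                # "smashed data" at the split layer
        return smashed, self.aux_head(smashed)

server_back = nn.Linear(128, 10)               # single server-side model for all clients
client = ClientNet()
opt_c = torch.optim.SGD(client.parameters(), lr=0.1)
opt_s = torch.optim.SGD(server_back.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()
SEND_EVERY = 5                                 # send smashed data only in selected epochs (assumption)

for epoch in range(20):
    x = torch.randn(32, 784)                   # toy batch standing in for local data
    y = torch.randint(0, 10, (32,))

    # Client update driven purely by the auxiliary head: no gradient from the server.
    smashed, aux_logits = client(x)
    loss_c = criterion(aux_logits, y)
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()

    if epoch % SEND_EVERY == 0:
        # Upload detached smashed data; the server updates its single back-end model.
        logits = server_back(smashed.detach())
        loss_s = criterion(logits, y)
        opt_s.zero_grad(); loss_s.backward(); opt_s.step()
```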