Abstract: This paper studies power-efficient uplink transmission design for federated learning (FL) that employs over-the-air analog aggregation and multi-antenna beamforming at the server. We jointly optimize the device transmit weights and the receive beamforming at each FL communication round to minimize the total device transmit power while ensuring convergence of FL training. Through our convergence analysis, we establish sufficient conditions on the aggregation error that guarantee FL training convergence. Using these conditions, we reformulate the power minimization problem into a unique bi-convex structure that contains a transmit beamforming optimization subproblem and a receive beamforming feasibility subproblem. Despite this unconventional structure, we propose a novel alternating optimization approach that guarantees a monotonic decrease of the objective value and thereby allows convergence to a partial optimum. We further consider imperfect channel state information (CSI), which requires accounting for channel estimation errors in both the power minimization problem and the FL convergence analysis. We propose a CSI-error-aware joint beamforming algorithm, which can substantially outperform one that does not account for channel estimation errors. Simulations with canonical classification datasets demonstrate that our proposed methods achieve significant power reduction compared to existing benchmarks across a wide range of parameter settings, while attaining the same target accuracy under the same convergence rate.
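As a rough illustration of the over-the-air analog aggregation with receive beamforming described above, the following minimal sketch simulates one aggregation of scalar device symbols. It is not the paper's algorithm: the function name `ota_aggregate`, the channel-inversion choice of transmit weights, and the complex Gaussian noise model are all assumptions made for illustration, and no power minimization or alternating optimization is performed here.

```python
import numpy as np

def ota_aggregate(signals, channels, w, noise_std=0.0, rng=None):
    """Simulate one over-the-air aggregation of K scalar device symbols.

    signals:  (K,) complex array, one symbol per device
    channels: (K, N) complex array, row k is device k's channel to N antennas
    w:        (N,) complex receive beamforming vector
    Returns (estimate of sum(signals), per-device transmit power).
    Hypothetical helper; names and model are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Effective scalar channel after receive beamforming: g_k = w^H h_k
    g = channels @ np.conj(w)
    # Assumed transmit weights: invert the effective channel so that
    # the post-beamforming contributions add up coherently.
    a = 1.0 / g
    n_ant = channels.shape[1]
    # Complex Gaussian receiver noise (assumed model), unit variance before scaling
    noise = noise_std * (rng.standard_normal(n_ant)
                         + 1j * rng.standard_normal(n_ant)) / np.sqrt(2)
    # Superposed signal at the antenna array: y = sum_k h_k a_k s_k + n
    y = channels.T @ (a * signals) + noise
    # Receive beamforming: w^H y = sum_k s_k + w^H n
    estimate = np.vdot(w, y)
    # Transmit power spent by each device on this symbol
    power = np.abs(a) ** 2 * np.abs(signals) ** 2
    return estimate, power
```

Under this assumed channel-inversion rule, devices with a weak effective channel `w^H h_k` need large transmit power, which is why the choice of `w` and of the transmit weights affects the total power.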
Abstract: Federated learning (FL) with over-the-air computation is susceptible to analog aggregation error due to channel conditions and noise. Excluding devices with weak channels can reduce the aggregation error, but it also decreases the amount of training data available to FL. In this work, we jointly design the uplink receiver beamforming and device selection in over-the-air FL to maximize the training convergence rate. We propose a new method termed JBFDS, which takes into account the impact of receiver beamforming and device selection on the global loss function at each training round. Our simulation results on real-world image classification demonstrate that the proposed method achieves faster convergence with significantly lower computational complexity than existing alternatives.
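The trade-off mentioned above, where dropping weak-channel devices lowers the aggregation error but shrinks the participating set, can be sketched numerically. This is not the JBFDS method: the helper `noise_mse_after_selection` and the error proxy it computes (noise power scaled by the weakest selected effective channel under a common per-device power budget) are assumptions chosen only to illustrate the effect.

```python
import numpy as np

def noise_mse_after_selection(channels, w, keep, power_budget=1.0, noise_var=1.0):
    """Proxy for the noise-induced aggregation error given a device selection.

    channels: (K, N) complex array of device channels
    w:        (N,) complex receive beamforming vector
    keep:     (K,) boolean mask of selected devices
    Assumed model: all selected devices are aligned to the weakest
    effective channel, so that channel sets the receive scaling.
    """
    # Effective channel gains |w^H h_k|^2 for every device
    gains = np.abs(channels @ np.conj(w)) ** 2
    # The weakest selected device limits the achievable signal scaling
    weakest = gains[keep].min()
    return noise_var * np.linalg.norm(w) ** 2 / (power_budget * weakest)

# Three single-antenna devices; the second has a weak channel.
channels = np.array([[1.0], [0.1], [2.0]], dtype=complex)
w = np.array([1.0], dtype=complex)
mse_all = noise_mse_after_selection(channels, w, np.array([True, True, True]))
mse_sel = noise_mse_after_selection(channels, w, np.array([True, False, True]))
```

In this toy setting, excluding the weak device reduces the error proxy, at the cost of aggregating updates from two devices instead of three.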