Abstract:Employing wireless systems with dual sensing and communication functionalities is becoming critical in the next generation of wireless networks. In this paper, we propose a robust design for over-the-air federated edge learning (OTA-FEEL) that leverages sensing capabilities at the parameter server (PS) to mitigate the impact of target echoes on the analog model aggregation. We first derive novel expressions for the Cramér-Rao bound of the target response and the mean squared error (MSE) of the estimated global model to measure radar sensing and model aggregation quality, respectively. Then, we develop a joint scheduling and beamforming framework that optimizes the OTA-FEEL performance while keeping the sensing and communication quality, determined respectively in terms of the Cramér-Rao bound and the achievable downlink rate, in a desired range. The resulting scheduling problem reduces to a combinatorial mixed-integer nonlinear programming (MINLP) problem. We develop a low-complexity hierarchical method based on the matching pursuit algorithm, which is widely used for sparse recovery in the compressed sensing literature. The proposed algorithm uses a step-wise strategy to omit the least effective devices in each iteration based on a metric that captures both the aggregation and sensing quality of the system. It further invokes an alternating optimization scheme to iteratively update the downlink beamforming and uplink post-processing by marginally optimizing them in each iteration. Convergence and complexity analyses of the proposed algorithm are presented. Numerical evaluations on the MNIST and CIFAR-10 datasets demonstrate the effectiveness of the proposed algorithm. The results show that, by leveraging accurate sensing, the target echoes on the uplink signal can be effectively suppressed, so that the quality of model aggregation remains intact despite the interference.
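The step-wise omission of devices described in this abstract can be illustrated with a generic greedy backward elimination. The sketch below is only a toy under stated assumptions: the function `stepwise_elimination`, the cost `toy_cost`, the channel gains, and the budget are all hypothetical, and the cost models aggregation error alone, whereas the metric in the abstract also accounts for sensing quality.

```python
from typing import Callable, Dict, List, Sequence


def stepwise_elimination(
    devices: Sequence[int],
    cost: Callable[[List[int]], float],
    budget: float,
) -> List[int]:
    """Drop one device per iteration until cost(active) <= budget.

    `cost` is a placeholder for a scheduling metric; it is re-evaluated
    on the remaining set in every iteration.
    """
    active = list(devices)
    while len(active) > 1 and cost(active) > budget:
        # Omit the device whose removal yields the lowest remaining cost.
        worst = min(active, key=lambda d: cost([a for a in active if a != d]))
        active.remove(worst)
    return active


# Hypothetical per-device channel gains (illustrative values only).
gains: Dict[int, float] = {0: 1.0, 1: 0.05, 2: 0.8, 3: 0.6}


def toy_cost(active: List[int]) -> float:
    # Toy aggregation-error proxy: the weakest channel limits the common
    # scaling, while averaging over more devices reduces the noise.
    return 1.0 / (len(active) ** 2 * min(gains[d] for d in active))


scheduled = stepwise_elimination(list(gains), toy_cost, budget=0.2)
print(scheduled)
```

In this toy setting the device with the weakest channel is the only one removed, since dropping it brings the cost below the budget.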
Abstract:In federated learning (FL), heterogeneity among the local dataset distributions of clients can result in unsatisfactory performance for some, leading to an unfair model. To address this challenge, we propose an over-the-air fair federated learning algorithm (OTA-FFL), which leverages over-the-air computation to train fair FL models. By formulating FL as a multi-objective minimization problem, we introduce a modified Chebyshev approach to compute adaptive weighting coefficients for gradient aggregation in each communication round. To enable efficient aggregation over the multiple access channel, we derive analytical solutions for the optimal transmit scalars at the clients and the de-noising scalar at the parameter server. Extensive experiments demonstrate the superiority of OTA-FFL in achieving fairness and robust performance compared to existing methods.
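To make the roles of the transmit scalars and the de-noising scalar concrete, the following sketch uses the standard channel-inversion design for a single-antenna multiple access channel. It is not the analytical solution derived in the paper, which the abstract does not state. The function `ota_aggregate`, the unit-variance assumption on gradient entries, and the positive aggregation weights are assumptions made for illustration.

```python
import numpy as np


def ota_aggregate(grads, weights, h, power, noise_std, rng):
    """Weighted over-the-air aggregation with channel-inversion scalars.

    grads:   (K, d) real array, one gradient per client (assumed normalized)
    weights: (K,) positive aggregation weights
    h:       (K,) complex channel coefficients
    power:   per-client transmit power budget P, i.e. |b_k|^2 <= P
    """
    grads = np.asarray(grads, dtype=float)
    weights = np.asarray(weights, dtype=float)
    h = np.asarray(h, dtype=complex)
    # Largest common scaling that keeps every client within its budget.
    eta = np.min(power * np.abs(h) ** 2 / weights ** 2)
    # Transmit scalars: invert the channel and apply the desired weight.
    b = np.sqrt(eta) * weights / h
    # Superposition over the multiple access channel plus receiver noise.
    d = grads.shape[1]
    noise = noise_std * (rng.standard_normal(d) + 1j * rng.standard_normal(d)) / np.sqrt(2)
    y = (h * b) @ grads + noise
    # De-noising scalar 1/sqrt(eta) rescales the received sum.
    return np.real(y) / np.sqrt(eta)


rng = np.random.default_rng(0)
K, d = 4, 5
grads = rng.standard_normal((K, d))
weights = np.array([0.1, 0.2, 0.3, 0.4])
h = (rng.standard_normal(K) + 1j * rng.standard_normal(K)) / np.sqrt(2)
estimate = ota_aggregate(grads, weights, h, power=1.0, noise_std=0.1, rng=rng)
print(np.linalg.norm(estimate - weights @ grads))
```

With this design the weakest effective channel fixes the common scaling `eta`, so a small `eta` amplifies the noise after de-noising. This is the trade-off that any choice of transmit and de-noising scalars has to balance.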
Abstract:Second-order methods are widely adopted to improve the convergence rate of learning algorithms. In federated learning (FL), these methods require the clients to share their local Hessian matrices with the parameter server (PS), which comes at a prohibitive communication cost. A classical solution to this issue is to approximate the global Hessian matrix from first-order information. Unlike in idealized networks, this solution does not perform effectively in over-the-air FL settings, where the PS receives noisy versions of the local gradients. This paper introduces a novel second-order FL framework tailored for wireless channels. The pivotal innovation lies in the PS's capability to directly estimate the global Hessian matrix from the received noisy local gradients via a non-parametric method: the PS models the unknown Hessian matrix as a Gaussian process, and then uses the temporal relation between the gradients and the Hessian, along with the channel model, to find a stochastic estimator for the global Hessian matrix. We refer to this method as Gaussian process-based Hessian modeling for wireless FL (GP-FL) and show that it exhibits a linear-quadratic convergence rate. Numerical experiments on various datasets demonstrate that GP-FL outperforms all classical first- and second-order FL baselines.
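As an illustration of approximating the Hessian from first-order information, the sketch below applies one BFGS update, a common quasi-Newton rule. The abstract does not name the classical method it refers to, so BFGS is an assumption here, as are the function name `bfgs_update` and the toy quadratic objective. The sketch does not implement GP-FL.

```python
import numpy as np


def bfgs_update(B, s, y):
    """One BFGS update of the Hessian approximation B.

    s: step between two iterates, s = x_new - x_old
    y: change in gradient, y = grad(x_new) - grad(x_old); requires y @ s > 0
    """
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)


# Toy quadratic f(x) = 0.5 * x^T A x, whose gradient is A x.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
s = np.array([1.0, -0.5])
y_clean = A @ s  # exact gradient difference

B_new = bfgs_update(np.eye(2), s, y_clean)
# The update satisfies the secant condition B_new @ s = y.
print(B_new @ s, y_clean)

# With noisy gradients the secant pair itself is corrupted, so the
# curvature encoded into B no longer matches the true Hessian.
rng = np.random.default_rng(0)
y_noisy = y_clean + 0.5 * rng.standard_normal(2)
print(np.linalg.norm(y_noisy - y_clean))
```

The second part hints at why such updates degrade in over-the-air settings: the update trusts the gradient difference exactly, so any noise in the received gradients is written directly into the Hessian estimate.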
Abstract:Over-the-air federated learning (OTA-FL) is an emerging technique to reduce the computation and communication overload at the parameter server (PS) caused by the orthogonal transmissions of the model updates in conventional federated learning (FL). This reduction is achieved at the expense of introducing aggregation error, which can be efficiently suppressed by means of receive beamforming via large antenna arrays. This paper studies OTA-FL in massive multiple-input multiple-output (MIMO) systems by considering a realistic scenario in which the edge server, despite its large antenna array, is restricted in the number of radio frequency (RF) chains. For this setting, the beamforming for over-the-air model aggregation needs to be addressed jointly with antenna selection. This leads to an NP-hard problem due to the combinatorial nature of the optimization. We tackle this problem via two different approaches. In the first approach, we use the penalty dual decomposition (PDD) technique to develop a two-tier algorithm for joint antenna selection and beamforming. The second approach interprets the antenna selection task as a sparse recovery problem and develops two iterative joint algorithms based on the Lasso and fast iterative soft-thresholding methods. Convergence and complexity analyses are presented for all the schemes. The numerical investigations show that the algorithms based on the sparse recovery techniques outperform the PDD-based algorithm when the number of RF chains at the edge server is much smaller than its array size. However, as the number of RF chains increases, the PDD approach becomes superior. Our simulations further show that the learning performance with all the antennas being active at the PS can be closely tracked by selecting fewer than 20% of the antennas at the PS.
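The sparse recovery view of antenna selection can be sketched with a generic fast iterative soft-thresholding (FISTA) solver for the Lasso. This is a minimal real-valued example, not the joint selection and beamforming algorithm of the paper. The names `soft_threshold` and `fista`, the random matrix `A`, and the rule of keeping the three largest entries as the "selected antennas" are assumptions for illustration.

```python
import numpy as np


def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def fista(A, y, lam, n_iter=500):
    """Solve min_x 0.5 * ||A x - y||^2 + lam * ||x||_1 with FISTA."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    z, t = x.copy(), 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ z - y)
        x_new = soft_threshold(z - grad / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum step
        x, t = x_new, t_new
    return x


# Toy instance: 30 candidate antennas, of which 3 carry nonzero weight.
rng = np.random.default_rng(0)
A = rng.standard_normal((60, 30)) / np.sqrt(60)
x_true = np.zeros(30)
x_true[[3, 11, 25]] = [2.0, -3.0, 2.5]
y = A @ x_true

x_hat = fista(A, y, lam=0.05)
# Keep the entries with the largest magnitude, e.g. one per available RF chain.
selected = sorted(np.argsort(-np.abs(x_hat))[:3].tolist())
print(selected)
```

The l1 penalty drives most entries to zero, so the indices of the surviving entries can be read as the antennas connected to the RF chains.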
Abstract:Classical antenna selection schemes require instantaneous channel state information (CSI), which leads to high signaling overhead in the system. This work proposes a novel joint receive antenna selection and precoding scheme for multiuser multiple-input multiple-output uplink transmission that relies only on the long-term statistics of the CSI. The proposed scheme designs the switching network and the uplink precoders such that the expected long-term throughput of the system is maximized. Invoking results from random matrix theory, we derive a closed-form expression for the expected throughput of the system. We then develop a tractable iterative algorithm to tackle the throughput maximization problem, capitalizing on the alternating optimization and majorization-maximization (MM) techniques. Numerical results substantiate the efficiency of the proposed approach and its superior performance as compared with the baseline.
Abstract:This work studies a low-complexity design for reconfigurable intelligent surface (RIS)-aided multiuser multiple-input multiple-output systems. The base station (BS) applies receive antenna selection to connect a subset of its antennas to the available radio frequency chains. For this setting, the BS switching network, uplink precoders, and RIS phase-shifts are jointly designed such that the uplink sum-rate is maximized. The principal design problem reduces to an NP-hard mixed-integer optimization. We hence invoke the weighted minimum mean squared error technique and the penalty dual decomposition method to develop a tractable iterative algorithm that approximates the optimal design effectively. Our numerical investigations verify the efficiency of the proposed algorithm and its superior performance as compared with the benchmark.
Abstract:This paper develops a class of low-complexity device scheduling algorithms for over-the-air federated learning via the method of matching pursuit. The proposed scheme closely tracks the near-optimal performance achieved by difference-of-convex programming, and significantly outperforms the well-known benchmark algorithms based on convex relaxation. Compared to the state of the art, the proposed scheme imposes a drastically lower computational load on the system: for $K$ devices and $N$ antennas at the parameter server, the benchmark complexity scales with $\left(N^2+K\right)^3 + N^6$, while the complexity of the proposed scheme scales with $K^p N^q$ for some $0 < p,q \leq 2$. The efficiency of the proposed scheme is confirmed via numerical experiments on the CIFAR-10 dataset.
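The gap between the two scaling laws is easy to check numerically. The sketch below evaluates both expressions for an illustrative system size. The values $K = 20$, $N = 8$ and the worst-case exponents $p = q = 2$ are assumptions chosen for the example, and the function names are hypothetical. Constant factors are ignored, so the ratio only indicates the order of the gap.

```python
def benchmark_cost(K: int, N: int) -> int:
    """Scaling of the benchmark: (N^2 + K)^3 + N^6."""
    return (N ** 2 + K) ** 3 + N ** 6


def proposed_cost(K: int, N: int, p: float = 2.0, q: float = 2.0) -> float:
    """Scaling of the proposed scheme: K^p * N^q with 0 < p, q <= 2."""
    return K ** p * N ** q


K, N = 20, 8  # illustrative system size
bench = benchmark_cost(K, N)   # (64 + 20)^3 + 8^6 = 592704 + 262144
prop = proposed_cost(K, N)     # 20^2 * 8^2 = 400 * 64
print(bench, prop, bench / prop)
```

Even with the largest admissible exponents, the proposed scaling is more than an order of magnitude below the benchmark at this size, and the gap widens as $N$ grows because of the $N^6$ term.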