KooLab, Kujiale.com, Hangzhou, China
Abstract:Classical federated learning (FL) enables training machine learning models without sharing data, which preserves privacy, but heterogeneous data characteristics degrade the performance of the localized model. Personalized FL (PFL) addresses this by synthesizing personalized models from a global model via training on local data. Such a global model may overlook the specific information of the clients that have been sampled. In this paper, we propose a novel scheme to inject personalized prior knowledge into the global model in each client, which attempts to mitigate the resulting incomplete information problem in PFL. At the heart of our approach is a framework, PFL with Bregman Divergence (pFedBreD), which decouples the personalized prior from the local objective function regularized by Bregman divergence for greater adaptability in personalized scenarios. We also relax mirror descent (RMD) to extract the prior explicitly and provide optional strategies. Additionally, pFedBreD is backed by a convergence analysis. Extensive experiments demonstrate that our method reaches state-of-the-art performance on 5 datasets and outperforms other methods by up to 3.5% across 8 benchmarks. Further analyses verify the robustness and necessity of the proposed designs.
Abstract:Code Large Language Models (Code LLMs) have gained significant attention in the industry due to their wide applications in the full lifecycle of software engineering. However, the effectiveness of existing models in understanding non-English inputs for multi-lingual code-related tasks is still far from well studied. This paper introduces CodeFuse-13B, an open-sourced pre-trained code LLM. It is specifically designed for code-related tasks with both English and Chinese prompts and supports over 40 programming languages. CodeFuse achieves its effectiveness by utilizing a high-quality pre-training dataset that is carefully filtered by program analyzers and optimized during the training process. Extensive experiments are conducted using real-world usage scenarios, the industry-standard benchmark HumanEval-x, and the specially designed CodeFuseEval for Chinese prompts. To assess the effectiveness of CodeFuse, we actively collected valuable human feedback from AntGroup's software development process, where CodeFuse has been successfully deployed. The results demonstrate that CodeFuse-13B achieves a HumanEval pass@1 score of 37.10%, positioning it as one of the top multi-lingual code LLMs with similar parameter sizes. In practical scenarios, such as code generation, code translation, code comments, and testcase generation, CodeFuse performs better than other models when confronted with Chinese prompts.
Abstract:Federated Learning (FL) presents an innovative approach to privacy-preserving distributed machine learning and enables efficient crowd intelligence on a large scale. However, a significant challenge arises when coordinating FL with crowd intelligence in which diverse client groups possess disparate objectives due to data heterogeneity or distinct tasks. To address this challenge, we propose the Federated cINN Clustering Algorithm (FCCA) to robustly cluster clients into different groups, avoiding mutual interference between clients with data heterogeneity, and thereby enhancing the performance of the global model. Specifically, FCCA utilizes a global encoder to transform each client's private data into multivariate Gaussian distributions. It then employs a generative model to learn encoded latent features through maximum likelihood estimation, which eases optimization and avoids mode collapse. Finally, the central server collects converged local models to approximate similarities between clients and thus partition them into distinct clusters. Extensive experimental results demonstrate FCCA's superiority over other state-of-the-art clustered federated learning algorithms, evaluated on various models and datasets. These results suggest that our approach has substantial potential to enhance the efficiency and accuracy of real-world federated learning tasks.
Abstract:In frequency division duplex (FDD) massive multiple-input multiple-output (mMIMO) systems, the reciprocity mismatch caused by receiver distortion seriously degrades the amplitude prediction performance of channel state information (CSI). To tackle this issue, from the perspective of distortion suppression and reciprocity calibration, a lightweight neural network-based amplitude prediction method is proposed in this paper. Specifically, with the receiver distortion at the base station (BS), conventional methods are employed to extract the amplitude feature of uplink CSI. Then, learning along the direction of the uplink wireless propagation channel, a dedicated and lightweight distortion-learning network (Dist-LeaNet) is designed to restrain the receiver distortion and calibrate the amplitude reciprocity between the uplink and downlink CSI. Subsequently, by cascading, a single hidden layer-based amplitude-prediction network (Amp-PreNet) is developed to accomplish amplitude prediction of downlink CSI based on the strong amplitude reciprocity. Simulation results show that, considering the receiver distortion in FDD systems, the proposed scheme effectively improves the amplitude prediction accuracy of downlink CSI while reducing the transmission and processing delay.
Abstract:Reducing communication overhead in federated learning (FL) is challenging but crucial for large-scale distributed privacy-preserving machine learning. While methods such as sparsification can largely lower the communication overhead, they also greatly compromise the convergence rate. In this paper, we propose a novel method, named single-step synthetic features compressor (3SFC), to achieve communication-efficient FL by directly constructing a tiny synthetic dataset based on raw gradients. Thus, 3SFC can achieve an extremely low compression rate when the constructed dataset contains only one data sample. Moreover, 3SFC's compressing phase utilizes a similarity-based objective function so that it can be optimized with just one step, thereby considerably improving its performance and robustness. In addition, to minimize the compressing error, error feedback (EF) is also incorporated into 3SFC. Experiments on multiple datasets and models suggest that 3SFC achieves significantly better convergence rates than competing methods at lower compression rates (as low as 0.02%). Furthermore, ablation studies and visualizations show that 3SFC can carry more information than competing methods in every communication round, further validating its effectiveness.
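The error feedback (EF) idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a plain top-k sparsifier as the compressor, which is a stand-in chosen for brevity and not the synthetic-dataset compressor of 3SFC; the function names `top_k_sparsify` and `ef_compress` are hypothetical. The point is only that whatever the compressor drops in one round is carried over and added to the next round's gradient.

```python
def top_k_sparsify(vec, k):
    """Keep the k largest-magnitude entries of vec and zero the rest."""
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


def ef_compress(gradient, residual, k):
    """One error-feedback step: compress (gradient + residual), carry the rest."""
    corrected = [g + r for g, r in zip(gradient, residual)]
    sent = top_k_sparsify(corrected, k)
    # Whatever was not transmitted becomes the residual for the next round.
    new_residual = [c - s for c, s in zip(corrected, sent)]
    return sent, new_residual


# Two communication rounds with a budget of one transmitted entry each.
residual = [0.0, 0.0, 0.0, 0.0]
grad_round_1 = [0.5, -2.0, 0.25, 1.0]
sent_1, residual = ef_compress(grad_round_1, residual, k=1)

grad_round_2 = [0.5, 0.0, 0.25, 1.0]
sent_2, residual_2 = ef_compress(grad_round_2, residual, k=1)
```

In this toy run the last coordinate is dropped in round one but accumulates in the residual, so it wins the top-1 slot in round two instead of being lost.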
Abstract:In unmanned aerial vehicle (UAV)-assisted millimeter wave (mmWave) systems, channel state information (CSI) feedback is critical for the selection of modulation schemes, resource management, beamforming, etc. However, traditional CSI feedback methods lead to significant feedback overhead and energy consumption at the UAV transmitter, thereby shortening the system operation time. To tackle these issues, inspired by superimposed feedback and integrated sensing and communications (ISAC), a line of sight (LoS) sensing-based superimposed CSI feedback scheme is proposed. Specifically, on the UAV transmitter side, the ground-to-UAV (G2U) CSI is superimposed on the UAV-to-ground (U2G) data and fed back to the ground base station (gBS). At the gBS, a dedicated LoS sensing network (LoS-SenNet) is designed to sense the U2G CSI in LoS and NLoS scenarios. With the sensed result of LoS-SenNet, the determined G2U CSI from the initial feature extraction serves as prior information to guide the subsequent operation. Specifically, for the G2U CSI in NLoS, a CSI recovery network (CSI-RecNet) and superimposed interference cancellation are developed to recover the G2U CSI and U2G data. For the LoS scenario, a dedicated LoS aid network (LoS-AidNet) is embedded before the CSI-RecNet and the superimposed interference cancellation block to highlight the features of the G2U CSI. Simulation results demonstrate that, compared with other superimposed CSI feedback methods, the proposed scheme effectively improves the recovery accuracy of the G2U CSI and U2G data. In addition, the proposed scheme is robust against parameter variations.
Abstract:Federated learning (FL) is a distributed machine learning technique that utilizes global servers and collaborative clients to achieve privacy-preserving global model training without direct data sharing. However, the heterogeneous data problem, one of FL's main problems, makes it difficult for the global model to perform effectively on each client's local data. Thus, personalized federated learning (PFL) aims to improve the performance of the model on local data as much as possible. Bayesian learning, where the parameters of the model are seen as random variables with a prior assumption, is a feasible solution to the heterogeneous data problem: the more local data the model uses, the more it focuses on the local data, and otherwise it focuses on the prior. When Bayesian learning is applied to PFL, the global model provides global knowledge as a prior to the local training process. In this paper, we employ Bayesian learning to model PFL by assuming a prior in the scaled exponential family, and therefore propose pFedBreD, a framework that solves the resulting problem using Bregman divergence regularization. Empirically, our experiments show that, under the prior assumption of the spherical Gaussian and the first-order strategy of mean selection, our proposal significantly outperforms other PFL algorithms on multiple public benchmarks.
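As a worked illustration of the Bregman divergence regularizer named above, the sketch below evaluates the standard definition D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y> for a user-supplied convex generator phi. With phi(v) = 0.5 * ||v||^2, the generator commonly associated with a spherical Gaussian, the divergence reduces to half the squared Euclidean distance. This is not the pFedBreD objective itself; the helper names and the toy vectors `local_params` and `prior_mean` are assumptions made for the example.

```python
def bregman_divergence(phi, grad_phi, x, y):
    """D_phi(x, y) = phi(x) - phi(y) - <grad_phi(y), x - y>."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner


def half_sq_norm(v):
    """Generator phi(v) = 0.5 * ||v||^2."""
    return 0.5 * sum(vi * vi for vi in v)


def half_sq_norm_grad(v):
    """Gradient of 0.5 * ||v||^2 is v itself."""
    return list(v)


# Hypothetical local parameters and prior mean, for illustration only.
local_params = [1.0, 2.0, 3.0]
prior_mean = [0.0, 2.0, 1.0]

divergence = bregman_divergence(
    half_sq_norm, half_sq_norm_grad, local_params, prior_mean
)
half_sq_distance = 0.5 * sum(
    (a - b) ** 2 for a, b in zip(local_params, prior_mean)
)
```

Here both `divergence` and `half_sq_distance` evaluate to 2.5, which shows why a Gaussian-type prior leads to a familiar squared-distance penalty pulling the local model toward the prior mean. Other generators give other penalties.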
Abstract:Federated learning (FL) is identified as a crucial enabler for large-scale distributed machine learning (ML) without the need for local raw dataset sharing, substantially reducing privacy concerns and alleviating the isolated data problem. In reality, the prosperity of FL is largely due to a centralized framework called FedAvg, in which workers are in charge of model training and servers are in control of model aggregation. However, FedAvg's centralized worker-server architecture has raised new concerns, including the low scalability of the cluster, the risk of data leakage, and the failure or even defection of the central server. To overcome these problems, we propose Decentralized Federated Trusted Averaging (DeFTA), a decentralized FL framework that serves as a plug-and-play replacement for FedAvg, instantly bringing better security, scalability, and fault-tolerance to the federated learning process after installation. In principle, it fundamentally resolves the above-mentioned issues from an architectural perspective without compromises or tradeoffs, primarily consisting of a new model aggregating formula with theoretical performance analysis, and a decentralized trust system (DTS) to greatly improve system robustness. Note that since DeFTA is an alternative to FedAvg at the framework level, prevalent algorithms published for FedAvg can also be utilized in DeFTA with ease. Extensive experiments on six datasets and six basic models suggest that DeFTA not only has comparable performance with FedAvg in a more realistic setting, but also achieves great resilience even when 66% of workers are malicious. Furthermore, we also present an asynchronous variant of DeFTA to make it more broadly usable.
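For context on the FedAvg baseline referenced above, the server-side aggregation step can be sketched as a sample-size-weighted average of client parameters. This shows only the conventional centralized rule, not DeFTA's new aggregating formula, which the abstract does not spell out; the function name `fedavg_aggregate` and the toy inputs are assumptions for illustration.

```python
def fedavg_aggregate(client_params, client_sizes):
    """Average client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(n * params[j] for params, n in zip(client_params, client_sizes)) / total
        for j in range(dim)
    ]


# Two hypothetical workers: the second holds three times as much data.
client_params = [[1.0, 0.0], [3.0, 4.0]]
client_sizes = [1, 3]
global_params = fedavg_aggregate(client_params, client_sizes)
```

With these inputs the aggregate is `[2.5, 3.0]`, closer to the second worker's parameters because it contributes more samples. In a centralized deployment this step runs on the single server, which is the component the abstract identifies as a scalability and failure risk.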
Abstract:In frequency-division duplexing (FDD) massive multiple-input multiple-output (MIMO) systems, 1-bit compressed sensing (CS)-based superimposed channel state information (CSI) feedback has shown many advantages, but still faces many challenges, such as low accuracy of the downlink CSI recovery and large processing delays. To overcome these drawbacks, this paper proposes a deep learning (DL) scheme to improve the 1-bit CS-based superimposed CSI feedback. On the user side, the downlink CSI is compressed with the 1-bit CS technique, superimposed on the uplink user data sequences (UL-US), and then sent back to the base station (BS). At the BS, based on the model-driven approach and assisted by the superimposition-interference cancellation technology, a multi-task detection network is first constructed for detecting both the UL-US and downlink CSI. In particular, this detection network is jointly trained to detect the UL-US and downlink CSI simultaneously, yielding globally optimized network parameters. Then, with the recovered bits for the downlink CSI, a lightweight reconstruction scheme, which consists of an initial feature extraction of the downlink CSI with the simplified traditional method and a single hidden layer network, is utilized to reconstruct the downlink CSI with low processing delay. Compared with the 1-bit CS-based superimposed CSI feedback scheme, the proposed scheme improves the recovery accuracy of the UL-US and downlink CSI with lower processing delay and is robust against parameter variations.
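The 1-bit CS compression step on the user side can be sketched as taking the sign of random linear measurements. The example below is a simplified sketch that assumes a real-valued CSI vector and a Gaussian sensing matrix (practical CSI is complex-valued, and the abstract does not specify the matrix); `one_bit_measure` and all sizes are hypothetical. It also shows that positive scaling of the input leaves the bits unchanged, so the amplitude is not carried by the feedback.

```python
import random


def one_bit_measure(signal, sensing_matrix):
    """Return sign(sensing_matrix @ signal) as a list of +1 / -1 bits."""
    bits = []
    for row in sensing_matrix:
        projection = sum(a * s for a, s in zip(row, signal))
        bits.append(1 if projection >= 0 else -1)
    return bits


rng = random.Random(0)  # fixed seed so the sketch is reproducible
num_measurements, dim = 16, 8

# Assumed Gaussian sensing matrix and a toy real-valued "CSI" vector.
sensing_matrix = [
    [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_measurements)
]
csi = [rng.gauss(0.0, 1.0) for _ in range(dim)]

bits = one_bit_measure(csi, sensing_matrix)
# Doubling the signal changes its amplitude but not the sign of any projection.
bits_scaled = one_bit_measure([2.0 * c for c in csi], sensing_matrix)
```

Because `bits` and `bits_scaled` are identical, a receiver that sees only the bits cannot distinguish the two inputs, which is why reconstruction from 1-bit measurements needs additional amplitude information.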
Abstract:Because it discards the downlink channel state information (CSI) amplitude and relies on iterative reconstruction algorithms, 1-bit compressed sensing (CS)-based superimposed CSI feedback suffers from low recovery accuracy and large processing delay. To overcome these drawbacks, this letter proposes a fusion learning scheme that exploits the bi-directional channel reciprocity. Specifically, a simplified version of the conventional downlink CSI reconstruction is utilized to extract the initial feature of downlink CSI, and a single hidden layer-based amplitude-learning network (AMPL-NET) is designed to learn the auxiliary feature of the downlink CSI amplitude. Then, based on the extracted and learned amplitude features, a simple but effective amplitude-fusion network (AMPF-NET) is developed to perform the amplitude fusion of downlink CSI, thereby improving the reconstruction accuracy for 1-bit CS-based superimposed CSI feedback while reducing the processing delay. Simulation results show the effectiveness of the proposed feedback scheme and its robustness against parameter variations.