Abstract:To effectively manage and utilize massive distributed data at the network edge, Federated Learning (FL) has emerged as a promising edge computing paradigm across data silos. However, FL still faces two challenges: system heterogeneity (i.e., the diversity of hardware resources across edge devices) and statistical heterogeneity (i.e., non-IID data). Although sparsification can extract diverse submodels for diverse clients, most sparse FL works either simply assign submodels with artificially-given rigid rules or prune partial parameters using heuristic strategies, resulting in inflexible sparsification and poor performance. In this work, we propose Learnable Personalized Sparsification for heterogeneous Federated learning (FedLPS), which achieves the learnable customization of heterogeneous sparse models with importance-associated patterns and adaptive ratios to simultaneously tackle system and statistical heterogeneity. Specifically, FedLPS learns the importance of model units on local data representation and further derives an importance-based sparse pattern with minimal heuristics to accurately extract personalized data features in non-IID settings. Furthermore, Prompt Upper Confidence Bound Variance (P-UCBV) is designed to adaptively determine sparse ratios by learning the superimposed effect of diverse device capabilities and non-IID data, aiming at resource self-adaptation with promising accuracy. Extensive experiments show that FedLPS outperforms status quo approaches in accuracy and training costs, which improves accuracy by 1.28%-59.34% while reducing running time by more than 68.80%.
Abstract:Multiple federated learning (FL) methods are proposed for traffic flow forecasting (TFF) to avoid heavy-transmission and privacy-leaking concerns resulting from the disclosure of raw data in centralized methods. However, these FL methods adopt offline learning which may yield subpar performance, when concept drift occurs, i.e., distributions of historical and future data vary. Online learning can detect concept drift during model training, thus more applicable to TFF. Nevertheless, the existing federated online learning method for TFF fails to efficiently solve the concept drift problem and causes tremendous computing and communication overhead. Therefore, we propose a novel method named Resource-Efficient Federated Online Learning (REFOL) for TFF, which guarantees prediction performance in a communication-lightweight and computation-efficient way. Specifically, we design a data-driven client participation mechanism to detect the occurrence of concept drift and determine clients' participation necessity. Subsequently, we propose an adaptive online optimization strategy, which guarantees prediction performance and meanwhile avoids meaningless model updates. Then, a graph convolution-based model aggregation mechanism is designed, aiming to assess participants' contribution based on spatial correlation without importing extra communication and computing consumption on clients. Finally, we conduct extensive experiments on real-world datasets to demonstrate the superiority of REFOL in terms of prediction improvement and resource economization.
Abstract:Recently, federated learning (FL) has achieved wide successes for diverse privacy-sensitive applications without sacrificing the sensitive private information of clients. However, the data quality of client datasets can not be guaranteed since corresponding annotations of different clients often contain complex label noise of varying degrees, which inevitably causes the performance degradation. Intuitively, the performance degradation is dominated by clients with higher noise rates since their trained models contain more misinformation from data, thus it is necessary to devise an effective optimization scheme to mitigate the negative impacts of these noisy clients. In this work, we propose a two-stage framework FedELC to tackle this complicated label noise issue. The first stage aims to guide the detection of noisy clients with higher label noise, while the second stage aims to correct the labels of noisy clients' data via an end-to-end label correction framework which is achieved by learning possible ground-truth labels of noisy clients' datasets via back propagation. We implement sixteen related methods and evaluate five datasets with three types of complicated label noise scenarios for a comprehensive comparison. Extensive experimental results demonstrate our proposed framework achieves superior performance than its counterparts for different scenarios. Additionally, we effectively improve the data quality of detected noisy clients' local datasets with our label correction framework. The code is available at https://github.com/Sprinter1999/FedELC.
Abstract:Extensive research has revealed that adversarial examples (AE) pose a significant threat to voice-controllable smart devices. Recent studies have proposed black-box adversarial attacks that require only the final transcription from an automatic speech recognition (ASR) system. However, these attacks typically involve many queries to the ASR, resulting in substantial costs. Moreover, AE-based adversarial audio samples are susceptible to ASR updates. In this paper, we identify the root cause of these limitations, namely the inability to construct AE attack samples directly around the decision boundary of deep learning (DL) models. Building on this observation, we propose ALIF, the first black-box adversarial linguistic feature-based attack pipeline. We leverage the reciprocal process of text-to-speech (TTS) and ASR models to generate perturbations in the linguistic embedding space where the decision boundary resides. Based on the ALIF pipeline, we present the ALIF-OTL and ALIF-OTA schemes for launching attacks in both the digital domain and the physical playback environment on four commercial ASRs and voice assistants. Extensive evaluations demonstrate that ALIF-OTL and -OTA significantly improve query efficiency by 97.7% and 73.3%, respectively, while achieving competitive performance compared to existing methods. Notably, ALIF-OTL can generate an attack sample with only one query. Furthermore, our test-of-time experiment validates the robustness of our approach against ASR updates.
Abstract:Federated Distillation (FD) offers an innovative approach to distributed machine learning, leveraging knowledge distillation for efficient and flexible cross-device knowledge transfer without necessitating the upload of extensive model parameters to a central server. While FD has gained popularity, its vulnerability to poisoning attacks remains underexplored. To address this gap, we previously introduced FDLA (Federated Distillation Logits Attack), a method that manipulates logits communication to mislead and degrade the performance of client models. However, the impact of FDLA on participants with different identities and the effects of malicious modifications at various stages of knowledge transfer remain unexplored. To this end, we present PCFDLA (Peak-Controlled Federated Distillation Logits Attack), an advanced and more stealthy logits poisoning attack method for FD. PCFDLA enhances the effectiveness of FDLA by carefully controlling the peak values of logits to create highly misleading yet inconspicuous modifications. Furthermore, we introduce a novel metric for better evaluating attack efficacy, demonstrating that PCFDLA maintains stealth while being significantly more disruptive to victim models compared to its predecessors. Experimental results across various datasets confirm the superior impact of PCFDLA on model accuracy, solidifying its potential threat in federated distillation systems.
Abstract:Federated Edge Learning (FEL) has emerged as a promising approach for enabling edge devices to collaboratively train machine learning models while preserving data privacy. Despite its advantages, practical FEL deployment faces significant challenges related to device constraints and device-server interactions, necessitating heterogeneous, user-adaptive model training with limited and uncertain communication. In this paper, we introduce FedCache 2.0, a novel personalized FEL architecture that simultaneously addresses these challenges. FedCache 2.0 incorporates the benefits of both dataset distillation and knowledge cache-driven federated learning by storing and organizing distilled data as knowledge in the server-side knowledge cache. Moreover, a device-centric cache sampling strategy is introduced to tailor transferred knowledge for individual devices within controlled communication bandwidth. Extensive experiments on five datasets covering image recognition, audio understanding, and mobile sensor data mining tasks demonstrate that (1) FedCache 2.0 significantly outperforms state-of-the-art methods regardless of model structures, data distributions, and modalities. (2) FedCache 2.0 can train splendid personalized on-device models with at least $\times$28.6 improvement in communication efficiency.
Abstract:On-device intelligence (ODI) enables artificial intelligence (AI) applications to run on end devices, providing real-time and customized AI inference without relying on remote servers. However, training models for on-device deployment face significant challenges due to the decentralized and privacy-sensitive nature of users' data, along with end-side constraints related to network connectivity, computation efficiency, etc. Existing training paradigms, such as cloud-based training, federated learning, and transfer learning, fail to sufficiently address these practical constraints that are prevalent for devices. To overcome these challenges, we propose Privacy-Enhanced Training-as-a-Service (PTaaS), a novel service computing paradigm that provides privacy-friendly, customized AI model training for end devices. PTaaS outsources the core training process to remote and powerful cloud or edge servers, efficiently developing customized on-device models based on uploaded anonymous queries, enhancing data privacy while reducing the computation load on individual devices. We explore the definition, goals, and design principles of PTaaS, alongside emerging technologies that support the PTaaS paradigm. An architectural scheme for PTaaS is also presented, followed by a series of open problems that set the stage for future research directions in the field of PTaaS.
Abstract:On-device intelligence (ODI) enables artificial intelligence (AI) applications to run on end devices, providing real-time and customized AI services without relying on remote servers. However, training models for on-device deployment face significant challenges due to the decentralized and privacy-sensitive nature of users' data, along with end-side constraints related to network connectivity, computation efficiency, etc. Existing training paradigms, such as cloud-based training, federated learning, and transfer learning, fail to sufficiently address these practical constraints that are prevalent for devices. To overcome these challenges, we propose Privacy-Preserving Training-as-a-Service (PTaaS), a novel service computing paradigm that provides privacy-friendly, customized AI model training for end devices. PTaaS outsources the core training process to remote and powerful cloud or edge servers, efficiently developing customized on-device models based on uploaded anonymous queries, ensuring data privacy while reducing the computation load on individual devices. We explore the definition, goals, and design principles of PTaaS, alongside emerging technologies that support the PTaaS paradigm. An architectural scheme for PTaaS is also presented, followed by a series of open problems that set the stage for future research directions in the field of PTaaS.
Abstract:Object detection in remote sensing images (RSIs) often suffers from several increasing challenges, including the large variation in object scales and the diverse-ranging context. Prior methods tried to address these challenges by expanding the spatial receptive field of the backbone, either through large-kernel convolution or dilated convolution. However, the former typically introduces considerable background noise, while the latter risks generating overly sparse feature representations. In this paper, we introduce the Poly Kernel Inception Network (PKINet) to handle the above challenges. PKINet employs multi-scale convolution kernels without dilation to extract object features of varying scales and capture local context. In addition, a Context Anchor Attention (CAA) module is introduced in parallel to capture long-range contextual information. These two components work jointly to advance the performance of PKINet on four challenging remote sensing detection benchmarks, namely DOTA-v1.0, DOTA-v1.5, HRSC2016, and DIOR-R.
Abstract:Fine-tuning techniques based on Large Pretrained Language Models (LPLMs) have been proven to significantly enhance model performance on a variety of downstream tasks and effectively control the output behaviors of LPLMs. Recent studies have proposed numerous methods for fine-tuning a small number of parameters based on open-source LPLMs, reducing the demand for computational and storage resources. Among these, reparameterization fine-tuning methods represented by LoRA (Low-Rank Adaptation) have gained popularity. We find that although these methods perform well in many aspects, there is still considerable room for improvement in terms of complex task adaptability, performance, stability, and algorithm complexity. In response to this, inspired by the idea that the functions of the brain are shaped by its geometric structure, this paper integrates this idea into LoRA technology and proposes a new matrix transformation-based reparameterization method for efficient fine-tuning, named Matrix-Transformation based Low-Rank Adaptation (MTLoRA). MTLoRA aims to dynamically alter its spatial geometric structure by applying a transformation-matrix T to perform linear transformations, such as rotation, scaling, and translation, on the task-specific parameter matrix, generating new matrix feature patterns (eigenvectors) to mimic the fundamental influence of complex geometric structure feature patterns in the brain on functions, thereby enhancing the model's performance in downstream tasks. In Natural Language Understanding (NLU) tasks, it is evaluated using the GLUE benchmark test, and the results reveal that MTLoRA achieves an overall performance increase of about 1.0% across eight tasks; in Natural Language Generation (NLG) tasks, MTLoRA improves performance by an average of 0.95% and 0.56% in the DART and WebNLG tasks, respectively.