Edith C. -H. Ngai

PEFT-as-an-Attack! Jailbreaking Language Models during Federated Parameter-Efficient Fine-Tuning

Nov 28, 2024

Shenghui Li, Edith C. -H. Ngai, Fanghua Ye, Thiemo Voigt

Abstract:Federated Parameter-Efficient Fine-Tuning (FedPEFT) has emerged as a promising paradigm for privacy-preserving and efficient adaptation of Pre-trained Language Models (PLMs) in Federated Learning (FL) settings. It preserves data privacy by keeping the data decentralized and training the model on local devices, ensuring that raw data never leaves the user's device. Moreover, the integration of PEFT methods such as LoRA significantly reduces the number of trainable parameters compared to fine-tuning the entire model, thereby minimizing communication costs and computational overhead. Despite its potential, the security implications of FedPEFT remain underexplored. This paper introduces a novel security threat to FedPEFT, termed PEFT-as-an-Attack (PaaA), which exposes how PEFT can be exploited as an attack vector to circumvent PLMs' safety alignment and generate harmful content in response to malicious prompts. Our evaluation of PaaA reveals that with less than 1% of the model's parameters set as trainable, and a small subset of clients acting maliciously, the attack achieves an approximate 80% attack success rate using representative PEFT methods such as LoRA. To mitigate this threat, we further investigate potential defense strategies, including Robust Aggregation Schemes (RASs) and Post-PEFT Safety Alignment (PPSA). However, our empirical analysis highlights the limitations of these defenses, i.e., even the most advanced RASs, such as DnC and ClippedClustering, struggle to defend against PaaA in scenarios with highly heterogeneous data distributions. Similarly, while PPSA can reduce attack success rates to below 10%, it severely degrades the model's accuracy on the target task. Our results underscore the urgent need for more effective defense mechanisms that simultaneously ensure security and maintain the performance of the FedPEFT paradigm.
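The "less than 1% of the model's parameters" figure can be made concrete with a little arithmetic. The sketch below is not taken from the paper: it assumes the standard LoRA formulation, in which a frozen weight matrix of shape d_out x d_in gets two trainable low-rank factors of shapes d_out x r and r x d_in. The dimensions (4096) and rank (8) are illustrative assumptions only.

```python
def lora_trainable_fraction(d_in: int, d_out: int, rank: int) -> float:
    """Fraction of trainable parameters when one weight matrix is adapted with LoRA.

    Assumes the standard LoRA setup: the original d_out x d_in matrix is frozen,
    and only the two low-rank factors (d_out x rank and rank x d_in) are trained.
    """
    full_params = d_in * d_out            # parameters in the frozen matrix
    lora_params = rank * (d_in + d_out)   # parameters in the two low-rank factors
    return lora_params / full_params


# Hypothetical layer size and rank, chosen only for illustration.
fraction = lora_trainable_fraction(d_in=4096, d_out=4096, rank=8)
print(f"{fraction:.4%}")  # prints 0.3906%
```

Under these assumed sizes the adapter holds well under 1% of the adapted matrix's parameters, and the share of the whole model is smaller still when only some matrices are adapted.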

AlignGroup: Learning and Aligning Group Consensus with Member Preferences for Group Recommendation

Sep 04, 2024

Jinfeng Xu, Zheyu Chen, Jinze Li, Shuo Yang, Hewei Wang, Edith C. -H. Ngai

Abstract:Group activities are important behaviors in human society, and providing personalized recommendations for groups is referred to as the group recommendation task. Existing methods can usually be categorized into two strategies to infer group preferences: 1) determining group preferences by aggregating members' personalized preferences, and 2) inferring group consensus by capturing group members' coherent decisions after common compromises. However, the former suffers from a lack of group-level considerations, and the latter overlooks the fine-grained preferences of individual users. To this end, we propose a novel group recommendation method, AlignGroup, which focuses on both group consensus and individual preferences of group members to infer group decision-making. Specifically, AlignGroup explores group consensus through a well-designed hypergraph neural network that efficiently learns intra- and inter-group relationships. Moreover, AlignGroup innovatively utilizes a self-supervised alignment task to capture fine-grained group decision-making by aligning the group consensus with members' common preferences. Extensive experiments on two real-world datasets validate that AlignGroup outperforms the state-of-the-art on both the group recommendation task and the user recommendation task, and is more efficient than most baselines.

* 10 pages, accepted by CIKM 2024

Synergizing Foundation Models and Federated Learning: A Survey

Jun 18, 2024

Shenghui Li, Fanghua Ye, Meng Fang, Jiaxu Zhao, Yun-Hin Chan, Edith C. -H. Ngai, Thiemo Voigt

Abstract:The recent development of Foundation Models (FMs), represented by large language models, vision transformers, and multimodal models, has been making a significant impact on both academia and industry. Compared with small-scale models, FMs have a much stronger demand for high-volume data during the pre-training phase. Although general FMs can be pre-trained on data collected from open sources such as the Internet, domain-specific FMs need proprietary data, posing a practical challenge regarding the amount of data available due to privacy concerns. Federated Learning (FL) is a collaborative learning paradigm that breaks the barrier of data availability from different participants. Therefore, it provides a promising solution to customize and adapt FMs to a wide range of domain-specific tasks using distributed datasets whilst preserving privacy. This survey paper discusses the potentials and challenges of synergizing FL and FMs and summarizes core techniques, future directions, and applications. A periodically updated paper collection on FM-FL is available at https://github.com/lishenghui/awesome-fm-fl.

FourierKAN-GCF: Fourier Kolmogorov-Arnold Network -- An Effective and Efficient Feature Transformation for Graph Collaborative Filtering

Jun 04, 2024

Jinfeng Xu, Zheyu Chen, Jinze Li, Shuo Yang, Wei Wang, Xiping Hu, Edith C. -H. Ngai

Abstract:Graph Collaborative Filtering (GCF) has achieved state-of-the-art performance for recommendation tasks. However, most GCF structures simplify the feature transformation and nonlinear operation during message passing in the graph convolution network (GCN). We revisit these two components and discover that a part of feature transformation and nonlinear operation during message passing in GCN can improve the representation of GCF, but increase the difficulty of training. In this work, we propose a simple and effective graph-based recommendation model called FourierKAN-GCF. Specifically, it utilizes a novel Fourier Kolmogorov-Arnold Network (KAN) to replace the multilayer perceptron (MLP) as a part of the feature transformation during message passing in GCN, which improves the representation power of GCF and is easy to train. We further employ message dropout and node dropout strategies to improve the representation power and robustness of the model. Extensive experiments on two public datasets demonstrate the superiority of FourierKAN-GCF over most state-of-the-art methods. The implementation code is available at https://github.com/Jinfeng-Xu/FKAN-GCF.

MENTOR: Multi-level Self-supervised Learning for Multimodal Recommendation

Feb 29, 2024

Jinfeng Xu, Zheyu Chen, Shuo Yang, Jinze Li, Hewei Wang, Edith C. -H. Ngai

Abstract:With the increasing multimedia information, multimodal recommendation has received extensive attention. It utilizes multimodal information to alleviate the data sparsity problem in recommendation systems, thus improving recommendation accuracy. However, the reliance on labeled data severely limits the performance of multimodal recommendation models. Recently, self-supervised learning has been used in multimodal recommendations to mitigate the label sparsity problem. Nevertheless, the state-of-the-art methods cannot avoid the modality noise when aligning multimodal information due to the large differences in the distributions of different modalities. To this end, we propose a Multi-level sElf-supervised learNing for mulTimOdal Recommendation (MENTOR) method to address the label sparsity problem and the modality alignment problem. Specifically, MENTOR first enhances the specific features of each modality using the graph convolutional network (GCN) and fuses the visual and textual modalities. It then enhances the item representation via the item semantic graph for all modalities, including the fused modality. Then, it introduces two multilevel self-supervised tasks: the multilevel cross-modal alignment task and the general feature enhancement task. The multilevel cross-modal alignment task aligns each modality under the guidance of the ID embedding from multiple levels while maintaining the historical interaction information. The general feature enhancement task enhances the general feature from both the graph and feature perspectives to improve the robustness of our model. Extensive experiments on three publicly available datasets demonstrate the effectiveness of our method. Our code is publicly available at https://github.com/Jinfeng-Xu/MENTOR.

Internal Cross-layer Gradients for Extending Homogeneity to Heterogeneity in Federated Learning

Aug 22, 2023

Yun-Hin Chan, Rui Zhou, Running Zhao, Zhihan Jiang, Edith C. -H. Ngai

Abstract:Federated learning (FL) inevitably confronts the challenge of system heterogeneity in practical scenarios. To enhance the capabilities of most model-homogeneous FL methods in handling system heterogeneity, we propose a training scheme that can extend their capabilities to cope with this challenge. In this paper, we commence our study with a detailed exploration of homogeneous and heterogeneous FL settings and discover three key observations: (1) a positive correlation between client performance and layer similarities, (2) higher similarities in the shallow layers in contrast to the deep layers, and (3) smoother gradient distributions indicate higher layer similarities. Building upon these observations, we propose InCo Aggregation that leverages internal cross-layer gradients, a mixture of gradients from shallow and deep layers within a server model, to augment the similarity in the deep layers without requiring additional communication between clients. Furthermore, our method can be tailored to accommodate model-homogeneous FL methods such as FedAvg, FedProx, FedNova, Scaffold, and MOON, to expand their capabilities to handle system heterogeneity. Copious experimental results validate the effectiveness of InCo Aggregation, spotlighting internal cross-layer gradients as a promising avenue to enhance the performance in heterogeneous FL.

* Preprint. Under review

FedIN: Federated Intermediate Layers Learning for Model Heterogeneity

Apr 12, 2023

Yun-Hin Chan, Zhihan Jiang, Jing Deng, Edith C. -H. Ngai

Abstract:Federated learning (FL) facilitates edge devices to cooperatively train a global shared model while maintaining the training data locally and privately. However, a common but impractical assumption in FL is that the participating edge devices possess the same required resources and share identical global model architecture. In this study, we propose a novel FL method called Federated Intermediate Layers Learning (FedIN), supporting heterogeneous models without utilizing any public dataset. The training models in FedIN are divided into three parts, including an extractor, the intermediate layers, and a classifier. The model architectures of the extractor and classifier are the same in all devices to maintain the consistency of the intermediate layer features, while the architectures of the intermediate layers can vary for heterogeneous devices according to their resource capacities. To exploit the knowledge from features, we propose IN training, training the intermediate layers in line with the features from other clients. Additionally, we formulate and solve a convex optimization problem to mitigate the gradient divergence problem induced by the conflicts between the IN training and the local training. The experiment results show that FedIN achieves the best performance in the heterogeneous model environment compared with the state-of-the-art algorithms. Furthermore, our ablation study demonstrates the effectiveness of IN training and the solution to the convex optimization problem.

An Experimental Study of Byzantine-Robust Aggregation Schemes in Federated Learning

Feb 14, 2023

Shenghui Li, Edith C. -H. Ngai, Thiemo Voigt

Abstract:Byzantine-robust federated learning aims at mitigating Byzantine failures during the federated training process, where malicious participants may upload arbitrary local updates to the central server to degrade the performance of the global model. In recent years, several robust aggregation schemes have been proposed to defend against malicious updates from Byzantine clients and improve the robustness of federated learning. These solutions were claimed to be Byzantine-robust under certain assumptions. Meanwhile, new attack strategies are emerging, striving to circumvent the defense schemes. However, there is a lack of systematic comparison and empirical study thereof. In this paper, we conduct an experimental study of Byzantine-robust aggregation schemes under different attacks using two popular algorithms in federated learning, FedSGD and FedAvg. We first survey existing Byzantine attack strategies and Byzantine-robust aggregation schemes that aim to defend against Byzantine attacks. We also propose a new scheme, ClippedClustering, to enhance the robustness of a clustering-based scheme by automatically clipping the updates. Then we provide an experimental evaluation of eight aggregation schemes in the scenario of five different Byzantine attacks. Our results show that these aggregation schemes sustain relatively high accuracy in some cases but are ineffective in others. In particular, our proposed ClippedClustering successfully defends against most attacks under IID local datasets. However, when the local datasets are Non-IID, the performance of all the aggregation schemes significantly decreases. With Non-IID data, some of these aggregation schemes fail even in the complete absence of Byzantine clients. We conclude that the robustness of all the aggregation schemes is limited, highlighting the need for new defense strategies, in particular for Non-IID datasets.
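To illustrate what "clipping the updates" can mean in a robust aggregation setting, here is a minimal sketch. It is not the ClippedClustering algorithm from the paper: the clustering step is omitted entirely, and using the median of the current round's update norms as the clipping threshold is an assumption made for illustration. The function names are hypothetical.

```python
import math
from statistics import median


def l2_norm(vec):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def clip_to_median_norm(updates):
    """Rescale each client update so its L2 norm is at most the median norm.

    Assumption for illustration: the threshold is the median norm of the
    updates passed in. Updates below the threshold are left unchanged.
    """
    norms = [l2_norm(u) for u in updates]
    threshold = median(norms)
    clipped = []
    for update, norm in zip(updates, norms):
        scale = min(1.0, threshold / norm) if norm > 0 else 1.0
        clipped.append([x * scale for x in update])
    return clipped


def mean_update(updates):
    """Plain coordinate-wise average of equally weighted updates."""
    count = len(updates)
    return [sum(column) / count for column in zip(*updates)]


# Two benign-looking updates and one with an inflated norm (toy values).
client_updates = [[1.0, 0.0], [0.0, 1.0], [100.0, 0.0]]
aggregated = mean_update(clip_to_median_norm(client_updates))
print(aggregated)
```

In this toy example, clipping bounds how far the large-norm update can pull the average. It does not change the update's direction, which is one reason clipping alone is not a complete defense.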

* This paper has been accepted for publication in IEEE Transactions on Big Data

Exploiting Features and Logits in Heterogeneous Federated Learning

Oct 27, 2022

Yun-Hin Chan, Edith C. -H. Ngai

Abstract:Due to the rapid growth of IoT and artificial intelligence, deploying neural networks on IoT devices is becoming increasingly crucial for edge intelligence. Federated learning (FL) facilitates the management of edge devices to collaboratively train a shared model while keeping training data local and private. However, a general assumption in FL is that all edge devices train the same machine learning model, which may be impractical considering diverse device capabilities. For instance, less capable devices may slow down the updating process because they struggle to handle large models appropriate for ordinary devices. In this paper, we propose a novel data-free FL method that supports heterogeneous client models by managing features and logits, called Felo, and its extension with a conditional VAE deployed in the server, called Velo. Felo averages the mid-level features and logits from the clients at the server based on their class labels to provide the average features and logits, which are utilized for further training the client models. Unlike Felo, the server has a conditional VAE in Velo, which is used for training mid-level features and generating synthetic features according to the labels. The clients optimize their models based on the synthetic features and the average logits. We conduct experiments on two datasets and show satisfactory performance of our methods compared with the state-of-the-art methods.
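The class-wise averaging that Felo is described as performing at the server can be sketched as follows. This is a reading of the abstract only, not the authors' implementation: the function name, the flat-list data layout, and the equal weighting of all samples are assumptions made for illustration.

```python
from collections import defaultdict


def average_by_class(vectors, labels):
    """Average vectors (e.g. logits or mid-level features) per class label.

    `vectors` is a list of equal-length float lists collected from clients,
    and `labels` gives the class label of each vector. Returns a dict that
    maps each label to the element-wise mean of its vectors.
    """
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(vectors, labels):
        if label not in sums:
            sums[label] = list(vec)
        else:
            sums[label] = [s + x for s, x in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


# Toy logits from several clients: two samples of class 0, one of class 1.
logits = [[1.0, 3.0], [3.0, 5.0], [0.0, 2.0]]
labels = [0, 0, 1]
print(average_by_class(logits, labels))  # {0: [2.0, 4.0], 1: [0.0, 2.0]}
```

Each client would then use the per-class averages as additional training targets; how they enter the client loss is not specified in the abstract and is left out here.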
