Abstract: Model merging combines knowledge from task-specific models into a unified multi-task model to avoid joint training on all task data. However, current methods face challenges due to representation bias, which can interfere with task performance. As a remedy, we propose IntervMerge, a novel approach to multi-task model merging that effectively mitigates representation bias across the model using task-specific interventions. To further enhance its efficiency, we introduce mini-interventions, which modify only part of the representation, thereby reducing the additional parameters without compromising performance. Experimental results demonstrate that IntervMerge consistently outperforms state-of-the-art approaches while using fewer parameters.
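To illustrate the idea of a mini-intervention described above, the following is a minimal sketch (not the authors' implementation) of a task-specific module that edits only a low-dimensional slice of a hidden representation in a merged model; the class name, dimensions, and additive form are illustrative assumptions.

```python
import torch
import torch.nn as nn


class MiniIntervention(nn.Module):
    """Hypothetical sketch: a task-specific intervention that modifies only a
    slice of the hidden representation, reducing the parameters added per task."""

    def __init__(self, hidden_dim: int, sub_dim: int):
        super().__init__()
        assert sub_dim <= hidden_dim
        self.sub_dim = sub_dim                 # only the first sub_dim features are edited
        self.proj = nn.Linear(sub_dim, sub_dim)  # small task-specific correction

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, hidden_dim) intermediate representation of the merged model
        edited = h[..., : self.sub_dim] + self.proj(h[..., : self.sub_dim])
        return torch.cat([edited, h[..., self.sub_dim :]], dim=-1)


# Usage sketch: apply the intervention to a batch of hidden states.
h = torch.randn(8, 768)
interv = MiniIntervention(hidden_dim=768, sub_dim=64)
h_corrected = interv(h)
```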
Abstract: Collaborative self-supervised learning has recently become feasible in highly distributed environments by dividing the network layers between client devices and a central server. However, state-of-the-art methods, such as MocoSFL, are optimized for network division at the initial layers, which reduces the protection of client data and increases communication overhead. In this paper, we demonstrate that splitting depth is crucial for maintaining privacy and communication efficiency in distributed training. We also show that MocoSFL suffers from a catastrophic quality deterioration when the communication overhead is minimized. As a remedy, we introduce Momentum-Aligned contrastive Split Federated Learning (MonAcoSFL), which aligns the online and momentum client models during the training procedure. Consequently, we achieve state-of-the-art accuracy while significantly reducing the communication overhead, making MonAcoSFL more practical in real-world scenarios.
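The alignment of online and momentum client models mentioned above can be sketched as follows; this is an illustrative assumption of how such alignment might look (MoCo-style EMA update plus a re-alignment after client synchronization), not the paper's exact procedure, and the function names are hypothetical.

```python
import copy
import torch.nn as nn


def ema_update(momentum_net: nn.Module, online_net: nn.Module, m: float = 0.99) -> None:
    """Standard MoCo-style momentum (EMA) update of the momentum client model."""
    for p_m, p_o in zip(momentum_net.parameters(), online_net.parameters()):
        p_m.data.mul_(m).add_(p_o.data, alpha=1.0 - m)


def align_after_sync(momentum_net: nn.Module, online_net: nn.Module) -> None:
    """Hypothetical alignment step: after the online client weights are
    synchronized across clients (e.g., by federated averaging), copy them into
    the momentum client so the two models do not drift apart."""
    momentum_net.load_state_dict(copy.deepcopy(online_net.state_dict()))


# Usage sketch with a toy client-side network.
online_client = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU())
momentum_client = copy.deepcopy(online_client)

ema_update(momentum_client, online_client)       # per-step momentum update
align_after_sync(momentum_client, online_client)  # after weight synchronization
```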