Satoshi Hasegawa

FedDuA: Doubly Adaptive Federated Learning

May 16, 2025

Shokichi Takakura, Seng Pei Liew, Satoshi Hasegawa

Abstract: Federated learning is a distributed learning framework in which clients collaboratively train a global model without sharing their raw data. FedAvg is a popular algorithm for federated learning, but it often converges slowly because of heterogeneity among local datasets and anisotropy in the parameter space. In this work, we formalize the central server optimization procedure through the lens of mirror descent and propose a novel framework, called FedDuA, which adaptively selects the global learning rate based on both inter-client and coordinate-wise heterogeneity in the local updates. We prove that our proposed doubly adaptive step-size rule is minimax optimal and provide a convergence analysis for convex objectives. The proposed method requires no additional communication or computational cost on clients, and extensive numerical experiments show that it outperforms baselines in various settings and is robust to the choice of hyperparameters.

Accelerating Differentially Private Federated Learning via Adaptive Extrapolation

Apr 14, 2025

Shokichi Takakura, Seng Pei Liew, Satoshi Hasegawa

Abstract: The federated learning (FL) framework enables multiple clients to collaboratively train machine learning models without sharing their raw data, but it remains vulnerable to privacy attacks. One promising approach is to incorporate differential privacy (DP), a formal notion of privacy, into the FL framework. DP-FedAvg is one of the most popular algorithms for DP-FL, but it is known to converge slowly when clients' data are heterogeneous. Most existing methods for accelerating DP-FL require 1) additional hyperparameters or 2) additional computational cost for clients. Neither is desirable: 1) hyperparameter tuning is computationally expensive, and a data-dependent choice of hyperparameters raises the risk of privacy leakage, and 2) clients are often resource-constrained. To address this issue, we propose DP-FedEXP, which adaptively selects the global step size based on the diversity of the local updates without requiring any additional hyperparameters or client computational cost. We show that DP-FedEXP provably accelerates the convergence of DP-FedAvg and empirically outperforms existing methods tailored for DP-FL.
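
To illustrate what "selecting the global step size based on the diversity of the local updates" can look like, here is a minimal sketch of the extrapolation rule from the earlier non-private FedEXP method, on which the name DP-FedEXP appears to build. It is not the rule from this paper: clipping, noise addition, and any correction for noise are omitted, and the function names and the `eps` default are assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def mean_update(updates: Sequence[Sequence[float]]) -> List[float]:
    """Coordinate-wise average of the client updates."""
    m = len(updates)
    return [sum(col) / m for col in zip(*updates)]


def extrapolated_step_size(updates: Sequence[Sequence[float]],
                           eps: float = 1e-3) -> float:
    """FedEXP-style global step size (non-private sketch).

    eta = max(1, sum_i ||d_i||^2 / (2 * M * (||mean(d)||^2 + eps)))

    The ratio grows when the client updates disagree with each other,
    and the max with 1 falls back to plain averaging otherwise.
    """
    m = len(updates)
    avg = mean_update(updates)
    sum_sq = sum(x * x for u in updates for x in u)  # sum of squared norms
    avg_sq = sum(x * x for x in avg)                 # squared norm of the mean
    return max(1.0, sum_sq / (2 * m * (avg_sq + eps)))


def server_step(weights: Sequence[float],
                updates: Sequence[Sequence[float]],
                eps: float = 1e-3) -> Tuple[List[float], float]:
    """Apply the averaged update scaled by the adaptive step size.

    Each update is assumed to be (local model - global model).
    """
    eta = extrapolated_step_size(updates, eps)
    avg = mean_update(updates)
    return [w + eta * a for w, a in zip(weights, avg)], eta
```

When all clients send the same update the step size stays at 1, which is ordinary averaging; when the updates disagree, the server takes a larger step along their mean.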

Shuffled Check-in: Privacy Amplification towards Practical Distributed Learning

Jun 07, 2022

Seng Pei Liew, Satoshi Hasegawa, Tsubasa Takahashi

Abstract: Recent studies of distributed computation with formal privacy guarantees, such as differentially private (DP) federated learning, leverage random sampling of clients in each round (privacy amplification by subsampling) to achieve satisfactory levels of privacy. Achieving this, however, requires strong assumptions that may not hold in practice, including precise and uniform subsampling of clients and a highly trusted aggregator to process clients' data. In this paper, we explore a more practical protocol, shuffled check-in, to resolve these issues. The protocol relies on clients making independent and random decisions to participate in the computation, which removes the requirement of server-initiated subsampling and enables robust modelling of client dropouts. Moreover, a weaker trust model known as the shuffle model is employed instead of a trusted aggregator. To this end, we introduce new tools to characterize the Rényi differential privacy (RDP) of shuffled check-in. We show that our new techniques improve the privacy guarantee by at least a factor of three over those using approximate DP's strong composition in various parameter regimes. Furthermore, we provide a numerical approach to track the privacy of generic shuffled check-in mechanisms, including distributed stochastic gradient descent (SGD) with the Gaussian mechanism. To the best of our knowledge, this is also the first evaluation of the Gaussian mechanism within the local/shuffle model under the distributed setting in the literature, which may be of independent interest.
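
The data flow described in the abstract (independent client check-in, local randomization, then shuffling) can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's protocol or its privacy accounting: the function name, the scalar client values, and the parameters `gamma` (check-in probability), `sigma` (Gaussian noise scale), and `clip` are all hypothetical.

```python
import random
from typing import List, Optional, Sequence


def shuffled_check_in(client_values: Sequence[float],
                      gamma: float,
                      sigma: float,
                      clip: float = 1.0,
                      rng: Optional[random.Random] = None) -> List[float]:
    """One round of a toy shuffled check-in pipeline.

    1. Each client decides independently, with probability gamma,
       whether to participate (no server-initiated subsampling).
    2. A participating client clips its value and adds Gaussian noise
       locally, so the server never sees the raw value.
    3. A shuffler permutes the reports, removing the link between a
       report and the client that sent it.
    """
    if rng is None:
        rng = random.Random()
    reports = []
    for value in client_values:
        if rng.random() < gamma:                     # independent check-in
            clipped = max(-clip, min(clip, value))   # bound the sensitivity
            reports.append(clipped + rng.gauss(0.0, sigma))
    rng.shuffle(reports)                             # shuffle-model step
    return reports


# Usage: the server only sees an unordered batch of noisy reports.
batch = shuffled_check_in([0.2, -0.7, 1.5, 0.1], gamma=0.5, sigma=0.3,
                          rng=random.Random(0))
estimate = sum(batch) / len(batch) if batch else 0.0
```

The number of reports is itself random, because it depends on how many clients happened to check in; the privacy analysis in the abstract is what accounts for this, and it is not reproduced here.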

* 16 pages, 4 figures

MEGEX: Data-Free Model Extraction Attack against Gradient-Based Explainable AI

Jul 19, 2021

Takayuki Miura, Satoshi Hasegawa, Toshiki Shibahara

Abstract: Advances in explainable artificial intelligence, which provides reasons for its predictions, are expected to accelerate the use of deep neural networks in real-world settings such as Machine Learning as a Service (MLaaS), which returns predictions on queried data using a trained model. Deep neural networks deployed in MLaaS face the threat of model extraction attacks. A model extraction attack violates intellectual property and privacy: an adversary steals a trained model hosted in the cloud using only its predictions. In particular, a data-free model extraction attack has been proposed recently and poses a more serious threat. In this attack, an adversary uses a generative model instead of preparing input data. The feasibility of this attack, however, needs to be studied, since it requires more queries than attacks that use surrogate datasets. In this paper, we propose MEGEX, a data-free model extraction attack against gradient-based explainable AI. In this method, an adversary uses the explanations to train the generative model, which reduces the number of queries needed to steal the model. Our experiments show that our proposed method reconstructs high-accuracy models, reaching 0.97$\times$ and 0.98$\times$ the victim model's accuracy on the SVHN and CIFAR-10 datasets given 2M and 20M queries, respectively. This implies that there is a trade-off between the interpretability of models and the difficulty of stealing them.

* 10 pages, 5 figures
