Igor Colin

Asynchronous Gossip Algorithms for Rank-Based Statistical Methods

Sep 09, 2025

Anna Van Elst, Igor Colin, Stephan Clémençon

Abstract:As decentralized AI and edge intelligence become increasingly prevalent, ensuring robustness and trustworthiness in such distributed settings has become a critical issue, especially in the presence of corrupted or adversarial data. Traditional decentralized algorithms are vulnerable to data contamination as they typically rely on simple statistics (e.g., means or sums), motivating the need for more robust statistics. In line with recent work on decentralized estimation of trimmed means and ranks, we develop gossip algorithms for computing a broad class of rank-based statistics, including L-statistics and rank statistics, both known for their robustness to outliers. We apply our method to perform robust distributed two-sample hypothesis testing, introducing the first gossip algorithm for Wilcoxon rank-sum tests. We provide rigorous convergence guarantees, including the first convergence rate bound for asynchronous gossip-based rank estimation. We empirically validate our theoretical results through experiments on diverse network topologies.
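For readers unfamiliar with the statistic targeted in this abstract, the following is a minimal centralized sketch of the Wilcoxon rank-sum statistic. It is not the gossip algorithm from the paper: all observations are pooled in one place and ranked by sorting, and the function names `midranks` and `rank_sum_statistic` are hypothetical names chosen for illustration. Ties receive averaged ranks, which is one common convention and an assumption here.

```python
def midranks(values):
    """Return 1-based ranks of `values`, averaging ranks over tied entries."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of entries tied with order[i].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Average of the 1-based positions i+1 .. j+1.
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_statistic(sample_x, sample_y):
    """Sum of the pooled-sample ranks of the observations in `sample_x`."""
    pooled = list(sample_x) + list(sample_y)
    ranks = midranks(pooled)
    return sum(ranks[:len(sample_x)])


if __name__ == "__main__":
    clean = rank_sum_statistic([1, 2, 3], [4, 5, 6])
    contaminated = rank_sum_statistic([1, 2, 3], [4, 5, 1000])
    # The statistic depends only on ranks, so the outlier does not change it.
    print(clean, contaminated)
```

Because the statistic depends on the data only through ranks, replacing an observation by an arbitrarily large value leaves it unchanged as long as the ordering is preserved, which is the robustness property the abstract refers to.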

Robust Distributed Estimation: Extending Gossip Algorithms to Ranking and Trimmed Means

May 23, 2025

Anna Van Elst, Igor Colin, Stephan Clémençon

Abstract:This paper addresses the problem of robust estimation in gossip algorithms over arbitrary communication graphs. Gossip algorithms are fully decentralized, relying only on local neighbor-to-neighbor communication, making them well-suited for situations where communication is constrained. A fundamental challenge in existing mean-based gossip algorithms is their vulnerability to malicious or corrupted nodes. In this paper, we show that an outlier-robust mean can be computed by globally estimating a robust statistic. More specifically, we propose a novel gossip algorithm for rank estimation, referred to as \textsc{GoRank}, and leverage it to design a gossip procedure dedicated to trimmed mean estimation, coined \textsc{GoTrim}. In addition to a detailed description of the proposed methods, a key contribution of our work is a precise convergence analysis: we establish an $\mathcal{O}(1/t)$ rate for rank estimation and an $\mathcal{O}(\log(t)/t)$ rate for trimmed mean estimation, where $t$ denotes the number of iterations. Moreover, we provide a breakdown point analysis of \textsc{GoTrim}. We empirically validate our theoretical results through experiments on diverse network topologies, data distributions and contamination schemes.
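To make the link between ranks and trimmed means concrete, here is a minimal centralized sketch. It is not \textsc{GoRank} or \textsc{GoTrim}: the ranks are obtained by sorting all values in one place rather than estimated by gossip, and the function name `trimmed_mean_from_ranks`, the index-based tie-breaking, and the convention of trimming `floor(alpha * n)` values on each side are assumptions made for illustration.

```python
def trimmed_mean_from_ranks(values, alpha):
    """Average the values whose rank lies outside the lowest and highest
    floor(alpha * n) positions. Ranks are 1-based; ties are broken by index
    (an illustrative assumption)."""
    if not 0 <= alpha < 0.5:
        raise ValueError("alpha must satisfy 0 <= alpha < 0.5")
    n = len(values)
    if n == 0:
        raise ValueError("values must be non-empty")

    # Rank of each observation within the full sample.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for position, index in enumerate(order, start=1):
        ranks[index] = position

    # Each observation is kept or discarded using only its own rank.
    trimmed_per_side = int(alpha * n)
    kept = [v for v, r in zip(values, ranks)
            if trimmed_per_side < r <= n - trimmed_per_side]
    return sum(kept) / len(kept)


if __name__ == "__main__":
    data = [1, 2, 3, 4, 1000]  # one grossly corrupted observation
    print(sum(data) / len(data))               # plain mean, pulled by the outlier
    print(trimmed_mean_from_ranks(data, 0.2))  # trimmed mean ignores it
```

The point of the sketch is that the keep-or-discard decision for each value needs only that value's rank, which is why a decentralized rank estimate is enough to build a trimmed mean.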

Differentially Private Policy Gradient

Jan 31, 2025

Alexandre Rio, Merwan Barlier, Igor Colin

Abstract:Motivated by the increasing deployment of reinforcement learning in the real world, involving a large consumption of personal data, we introduce a differentially private (DP) policy gradient algorithm. We show that, in this setting, the introduction of Differential Privacy can be reduced to the computation of appropriate trust regions, thus avoiding the sacrifice of theoretical properties of the DP-less methods. Therefore, we show that it is possible to find the right trade-off between privacy noise and trust-region size to obtain a performant differentially private policy gradient algorithm. We then outline its performance empirically on various benchmarks. Our results and the complexity of the tasks addressed represent a significant improvement over existing DP algorithms in online RL.

Differentially Private Model-Based Offline Reinforcement Learning

Feb 08, 2024

Alexandre Rio, Merwan Barlier, Igor Colin, Albert Thomas

Abstract:We address offline reinforcement learning with privacy guarantees, where the goal is to train a policy that is differentially private with respect to individual trajectories in the dataset. To achieve this, we introduce DP-MORL, a model-based reinforcement learning (MBRL) algorithm with differential privacy guarantees. A private model of the environment is first learned from offline data using DP-FedAvg, a training method for neural networks that provides differential privacy guarantees at the trajectory level. Then, we use model-based policy optimization to derive a policy from the (penalized) private model, without any further interaction with the system or access to the input data. We empirically show that DP-MORL enables the training of private RL agents from offline data and we furthermore outline the price of privacy in this setting.

Price of Safety in Linear Best Arm Identification

Sep 15, 2023

Xuedong Shang, Igor Colin, Merwan Barlier, Hamza Cherkaoui

Abstract:We introduce the safe best-arm identification framework with linear feedback, where the agent is subject to some stage-wise safety constraint that linearly depends on an unknown parameter vector. The agent must take actions in a conservative way so as to ensure that the safety constraint is not violated with high probability at each round. Ways of leveraging the linear structure for ensuring safety have been studied for regret minimization, but not for best-arm identification to the best of our knowledge. We propose a gap-based algorithm that achieves meaningful sample complexity while ensuring the stage-wise safety. We show that we pay an extra term in the sample complexity due to the forced exploration phase incurred by the additional safety constraint. Experimental illustrations are provided to justify the design of our algorithm.

* 20 pages, 1 figure

Clustered Multi-Agent Linear Bandits

Sep 15, 2023

Hamza Cherkaoui, Merwan Barlier, Igor Colin

Abstract:We address in this paper a particular instance of the multi-agent linear stochastic bandit problem, called clustered multi-agent linear bandits. In this setting, we propose a novel algorithm leveraging an efficient collaboration between the agents in order to accelerate the overall optimization problem. In this contribution, a network controller is responsible for estimating the underlying cluster structure of the network and optimizing the sharing of experience among agents within the same groups. We provide a theoretical analysis for both the regret minimization problem and the clustering quality. Through empirical evaluation against state-of-the-art algorithms on both synthetic and real data, we demonstrate the effectiveness of our approach: our algorithm significantly improves regret minimization while managing to recover the true underlying cluster partitioning.

* 18 pages, 8 figures

An $α$-No-Regret Algorithm For Graphical Bilinear Bandits

Jun 01, 2022

Geovani Rizk, Igor Colin, Albert Thomas, Rida Laraki, Yann Chevaleyre

Abstract:We propose the first regret-based approach to the Graphical Bilinear Bandits problem, where $n$ agents in a graph play a stochastic bilinear bandit game with each of their neighbors. This setting reveals a combinatorial NP-hard problem that prevents the use of any existing regret-based algorithm in the (bi-)linear bandit literature. In this paper, we fill this gap and present the first regret-based algorithm for graphical bilinear bandits using the principle of optimism in the face of uncertainty. Theoretical analysis of this new method yields an upper bound of $\tilde{O}(\sqrt{T})$ on the $\alpha$-regret and evidences the impact of the graph structure on the rate of convergence. Finally, we show through various experiments the validity of our approach.

Refined bounds for randomized experimental design

Dec 22, 2020

Geovani Rizk, Igor Colin, Albert Thomas, Moez Draief

Abstract:Experimental design is an approach for selecting samples among a given set so as to obtain the best estimator for a given criterion. In the context of linear regression, several optimal designs have been derived, each associated with a different criterion: mean square error, robustness, \emph{etc}. Computing such designs is generally an NP-hard problem and one can instead rely on a convex relaxation that considers probability distributions over the samples. Although greedy strategies and rounding procedures have received a lot of attention, straightforward sampling from the optimal distribution has hardly been investigated. In this paper, we propose theoretical guarantees for randomized strategies on E and G-optimal design. To this end, we develop a new concentration inequality for the eigenvalues of random matrices using a refined version of the intrinsic dimension that enables us to quantify the performance of such randomized strategies. Finally, we evidence the validity of our analysis through experiments, with particular attention on the G-optimal design applied to the best arm identification problem for linear bandits.

Best Arm Identification in Graphical Bilinear Bandits

Dec 14, 2020

Geovani Rizk, Albert Thomas, Igor Colin, Rida Laraki, Yann Chevaleyre

Abstract:We introduce a new graphical bilinear bandit problem where a learner (or a \emph{central entity}) allocates arms to the nodes of a graph and observes for each edge a noisy bilinear reward representing the interaction between the two end nodes. We study the best arm identification problem in which the learner wants to find the graph allocation maximizing the sum of the bilinear rewards. By efficiently exploiting the geometry of this bandit problem, we propose a somewhat \emph{decentralized} allocation strategy based on random sampling with theoretical guarantees. In particular, we characterize the influence of the graph structure (e.g. star, complete or circle) on the convergence rate and propose empirical experiments that confirm this dependency.

Theoretical Limits of Pipeline Parallel Optimization and Application to Distributed Deep Learning

Oct 11, 2019

Igor Colin, Ludovic Dos Santos, Kevin Scaman

Abstract:We investigate the theoretical limits of pipeline parallel learning of deep learning architectures, a distributed setup in which the computation is distributed per layer instead of per example. For smooth convex and non-convex objective functions, we provide matching lower and upper complexity bounds and show that a naive pipeline parallelization of Nesterov's accelerated gradient descent is optimal. For non-smooth convex functions, we provide a novel algorithm coined Pipeline Parallel Random Smoothing (PPRS) that is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension. While the algorithm still obeys a slow $\varepsilon^{-2}$ convergence rate, the depth-dependent part is accelerated, resulting in a near-linear speed-up and convergence time that only slightly depends on the depth of the deep learning architecture. Finally, we perform an empirical analysis of the non-smooth non-convex case and show that, for difficult and highly non-smooth problems, PPRS outperforms more traditional optimization algorithms such as gradient descent and Nesterov's accelerated gradient descent for problems where the sample size is limited, such as few-shot or adversarial learning.
