Ashkan Norouzi-Fard

The Cost of Consistency: Submodular Maximization with Constant Recourse

Dec 03, 2024

Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Ola Svensson, Morteza Zadimoghaddam

Abstract: In this work, we study online submodular maximization, and how the requirement of maintaining a stable solution impacts the approximation. In particular, we seek bounds on the best-possible approximation ratio that is attainable when the algorithm is allowed to make at most a constant number of updates per step. We show a tight information-theoretic bound of $\tfrac{2}{3}$ for general monotone submodular functions, and an improved (also tight) bound of $\tfrac{3}{4}$ for coverage functions. Since both these bounds are attained by non poly-time algorithms, we also give a poly-time randomized algorithm that achieves a $0.51$-approximation. Combined with an information-theoretic hardness of $\tfrac{1}{2}$ for deterministic algorithms from prior work, our work thus shows a separation between deterministic and randomized algorithms, both information theoretically and for poly-time algorithms.
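
The abstract distinguishes general monotone submodular functions from coverage functions. As a minimal sketch of what those terms mean, the snippet below defines a coverage function on a hypothetical toy instance (the `COVERS` dictionary and all function names are illustrative assumptions, not taken from the paper) and checks monotonicity and diminishing returns by brute force.

```python
from itertools import combinations

# Hypothetical toy instance: each element covers a set of items.
COVERS = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6},
    "d": {1, 6},
}

def coverage(selected):
    """f(S) = number of items covered by the union of the chosen sets."""
    covered = set()
    for element in selected:
        covered |= COVERS[element]
    return len(covered)

def all_subsets(ground):
    """Yield every subset of the ground set as a frozenset."""
    ordered = sorted(ground)
    for size in range(len(ordered) + 1):
        for combo in combinations(ordered, size):
            yield frozenset(combo)

def is_monotone_submodular(f, ground):
    """Brute-force check that, for all A subset of B and e not in B,
    f(A + e) - f(A) >= f(B + e) - f(B) >= 0."""
    subsets = list(all_subsets(ground))
    for a in subsets:
        for b in subsets:
            if not a <= b:
                continue
            for e in ground - b:
                gain_a = f(a | {e}) - f(a)
                gain_b = f(b | {e}) - f(b)
                if gain_b < 0 or gain_a < gain_b:
                    return False
    return True
```

The check is exponential in the size of the ground set, so it is only meant for tiny instances like this one.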

Consistent Submodular Maximization

May 30, 2024

Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Morteza Zadimoghaddam

Abstract: Maximizing monotone submodular functions under cardinality constraints is a classic optimization task with several applications in data mining and machine learning. In this paper we study this problem in a dynamic environment with consistency constraints: elements arrive in a streaming fashion and the goal is maintaining a constant approximation to the optimal solution while having a stable solution (i.e., the number of changes between two consecutive solutions is bounded). We provide algorithms in this setting with different trade-offs between consistency and approximation quality. We also complement our theoretical results with an experimental analysis showing the effectiveness of our algorithms in real-world instances.
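
To make the consistency constraint concrete, here is a minimal sketch of a streaming heuristic that performs at most one insertion or one swap per arrival, so two consecutive solutions differ in at most two elements. This is not the algorithm from the paper and no approximation guarantee is claimed for it; the swap rule, the coverage objective, and all names are assumptions chosen for illustration.

```python
def coverage(selected, covers):
    """Number of items covered by the union of the chosen sets."""
    covered = set()
    for element in selected:
        covered |= covers[element]
    return len(covered)

def stream_with_bounded_changes(stream, covers, k):
    """Process arrivals one by one. After each arrival, apply at most one
    insertion (if fewer than k elements are held) or one improving swap."""
    solution = set()
    history = []
    for arrival in stream:
        if len(solution) < k:
            # Insert only if the new element strictly improves the value.
            if coverage(solution | {arrival}, covers) > coverage(solution, covers):
                solution = solution | {arrival}
        else:
            # Try swapping the arrival with each held element; keep the best.
            best, best_value = None, coverage(solution, covers)
            for out in sorted(solution):
                candidate = (solution - {out}) | {arrival}
                value = coverage(candidate, covers)
                if value > best_value:
                    best, best_value = candidate, value
            if best is not None:
                solution = best
        history.append(frozenset(solution))
    return history

def max_changes(history):
    """Largest symmetric difference between consecutive solutions."""
    previous, worst = frozenset(), 0
    for current in history:
        worst = max(worst, len(previous ^ current))
        previous = current
    return worst

# Hypothetical toy stream.
covers = {"a": {1, 2}, "b": {2, 3}, "c": {1, 2, 3, 4}, "d": {5}}
history = stream_with_bounded_changes(["a", "b", "c", "d"], covers, k=2)
```

Here `max_changes(history)` measures the consistency of the run: a swap counts as two changes (one removal and one insertion).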

* To appear at ICML 24

Fairness in Submodular Maximization over a Matroid Constraint

Dec 21, 2023

Marwa El Halabi, Jakub Tarnawski, Ashkan Norouzi-Fard, Thuy-Duong Vuong

Abstract: Submodular maximization over a matroid constraint is a fundamental problem with various applications in machine learning. Some of these applications involve decision-making over datapoints with sensitive attributes such as gender or race. In such settings, it is crucial to guarantee that the selected solution is fairly distributed with respect to this attribute. Recently, fairness has been investigated in submodular maximization under a cardinality constraint in both the streaming and offline settings; however, the more general problem with a matroid constraint has only been considered in the streaming setting and only for monotone objectives. This work fills this gap. We propose various algorithms and impossibility results offering different trade-offs between quality, fairness, and generality.

Fully Dynamic Submodular Maximization over Matroids

May 31, 2023

Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Morteza Zadimoghaddam

Abstract: Maximizing monotone submodular functions under a matroid constraint is a classic algorithmic problem with multiple applications in data mining and machine learning. We study this classic problem in the fully dynamic setting, where elements can be both inserted and deleted in real-time. Our main result is a randomized algorithm that maintains an efficient data structure with an $\tilde{O}(k^2)$ amortized update time (in the number of additions and deletions) and yields a $4$-approximate solution, where $k$ is the rank of the matroid.
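
For orientation, the static version of this problem can be sketched with the classic offline greedy algorithm, which is known to return a solution worth at least half the optimum for monotone submodular objectives under a matroid constraint (a 2-approximation in the convention used above). The sketch below is not the dynamic data structure from the paper; the partition matroid, the coverage objective, and all names are illustrative assumptions.

```python
def coverage(selected, covers):
    """Number of items covered by the union of the chosen sets."""
    covered = set()
    for element in selected:
        covered |= covers[element]
    return len(covered)

def partition_independent(selected, group_of, capacity):
    """Independence oracle for a partition matroid: at most capacity[g]
    elements may be chosen from each group g."""
    counts = {}
    for element in selected:
        group = group_of[element]
        counts[group] = counts.get(group, 0) + 1
        if counts[group] > capacity[group]:
            return False
    return True

def greedy_matroid(ground, f, independent):
    """Repeatedly add the feasible element with the largest positive
    marginal gain until no feasible element improves the value."""
    solution = set()
    remaining = set(ground)
    while True:
        base = f(solution)
        best, best_gain = None, 0
        for element in sorted(remaining):
            if not independent(solution | {element}):
                continue
            gain = f(solution | {element}) - base
            if gain > best_gain:
                best, best_gain = element, gain
        if best is None:
            return solution
        solution.add(best)
        remaining.discard(best)

# Hypothetical toy instance: two groups, one element allowed from each.
covers = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}, "d": {1, 6}}
group_of = {"a": "x", "b": "x", "c": "y", "d": "y"}
capacity = {"x": 1, "y": 1}

result = greedy_matroid(
    covers,
    lambda s: coverage(s, covers),
    lambda s: partition_independent(s, group_of, capacity),
)
```

Recomputing this greedy solution from scratch after every insertion or deletion is the naive baseline that a fully dynamic algorithm aims to beat in update time.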

* Accepted at ICML 2023

Fairness in Streaming Submodular Maximization over a Matroid Constraint

May 24, 2023

Marwa El Halabi, Federico Fusco, Ashkan Norouzi-Fard, Jakab Tardos, Jakub Tarnawski

Abstract: Streaming submodular maximization is a natural model for the task of selecting a representative subset from a large-scale dataset. If datapoints have sensitive attributes such as gender or race, it becomes important to enforce fairness to avoid bias and discrimination. This has spurred significant interest in developing fair machine learning algorithms. Recently, such algorithms have been developed for monotone submodular maximization under a cardinality constraint. In this paper, we study the natural generalization of this problem to a matroid constraint. We give streaming algorithms as well as impossibility results that provide trade-offs between efficiency, quality and fairness. We validate our findings empirically on a range of well-known real-world applications: exemplar-based clustering, movie recommendation, and maximum coverage in social networks.

* Accepted to ICML 23

Deletion Robust Non-Monotone Submodular Maximization over Matroids

Aug 16, 2022

Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Morteza Zadimoghaddam

Abstract: Maximizing a submodular function is a fundamental task in machine learning, and in this paper we study the deletion robust version of the problem under the classic matroid constraint. Here the goal is to extract a small size summary of the dataset that contains a high value independent set even after an adversary has deleted some elements. We present constant-factor approximation algorithms, whose space complexity depends on the rank $k$ of the matroid and the number $d$ of deleted elements. In the centralized setting we present a $(4.597+O(\varepsilon))$-approximation algorithm with summary size $O( \frac{k+d}{\varepsilon^2}\log \frac{k}{\varepsilon})$ that is improved to a $(3.582+O(\varepsilon))$-approximation with $O(k + \frac{d}{\varepsilon^2}\log \frac{k}{\varepsilon})$ summary size when the objective is monotone. In the streaming setting we provide a $(9.435 + O(\varepsilon))$-approximation algorithm with summary size and memory $O(k + \frac{d}{\varepsilon^2}\log \frac{k}{\varepsilon})$; the approximation factor is then improved to $(5.582+O(\varepsilon))$ in the monotone case.

* Preliminary versions of this work appeared as arXiv:2201.13128 and in ICML'22. The main difference with respect to these versions is the extension of our results to non-monotone submodular functions.

Near-Optimal Correlation Clustering with Privacy

Mar 02, 2022

Vincent Cohen-Addad, Chenglin Fan, Silvio Lattanzi, Slobodan Mitrović, Ashkan Norouzi-Fard, Nikos Parotsidis, Jakub Tarnawski

Abstract: Correlation clustering is a central problem in unsupervised learning, with applications spanning community detection, duplicate detection, automated labelling and many more. In the correlation clustering problem one receives as input a set of nodes and for each node a list of co-clustering preferences, and the goal is to output a clustering that minimizes the disagreement with the specified nodes' preferences. In this paper, we introduce a simple and computationally efficient algorithm for the correlation clustering problem with provable privacy guarantees. Our approximation guarantees are stronger than those shown in prior work and are optimal up to logarithmic factors.

Deletion Robust Submodular Maximization over Matroids

Jan 31, 2022

Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Morteza Zadimoghaddam

Abstract: Maximizing a monotone submodular function is a fundamental task in machine learning. In this paper, we study the deletion robust version of the problem under the classic matroid constraint. Here the goal is to extract a small size summary of the dataset that contains a high value independent set even after an adversary has deleted some elements. We present constant-factor approximation algorithms, whose space complexity depends on the rank $k$ of the matroid and the number $d$ of deleted elements. In the centralized setting we present a $(3.582+O(\varepsilon))$-approximation algorithm with summary size $O(k + \frac{d \log k}{\varepsilon^2})$. In the streaming setting we provide a $(5.582+O(\varepsilon))$-approximation algorithm with summary size and memory $O(k + \frac{d \log k}{\varepsilon^2})$. We complement our theoretical results with an in-depth experimental analysis showing the effectiveness of our algorithms on real-world datasets.

Correlation Clustering in Constant Many Parallel Rounds

Jun 15, 2021

Vincent Cohen-Addad, Silvio Lattanzi, Slobodan Mitrović, Ashkan Norouzi-Fard, Nikos Parotsidis, Jakub Tarnawski

Abstract: Correlation clustering is a central topic in unsupervised learning, with many applications in ML and data mining. In correlation clustering, one receives as input a signed graph and the goal is to partition it to minimize the number of disagreements. In this work we propose a massively parallel computation (MPC) algorithm for this problem that is considerably faster than prior work. In particular, our algorithm uses machines with memory sublinear in the number of nodes in the graph and returns a constant approximation while running only for a constant number of rounds. To the best of our knowledge, our algorithm is the first that can provably approximate a clustering problem on graphs using only a constant number of MPC rounds in the sublinear memory regime. We complement our analysis with an experimental analysis of our techniques.
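
As a reference point for the objective described above, the sketch below implements the classic sequential Pivot algorithm, which is known to give a 3-approximation in expectation on complete signed graphs, together with a function that counts disagreements. It is not the MPC algorithm from the paper; the graph representation (a list of positive edges, with every other pair treated as negative) and all names are assumptions made for illustration.

```python
import random

def pivot_clustering(nodes, positive_edges, seed=0):
    """Classic Pivot: visit nodes in random order; each unclustered node
    becomes a pivot and absorbs its unclustered positive neighbors."""
    rng = random.Random(seed)
    neighbors = {v: set() for v in nodes}
    for u, v in positive_edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    order = list(nodes)
    rng.shuffle(order)
    clustered = set()
    clusters = []
    for pivot in order:
        if pivot in clustered:
            continue
        cluster = {pivot} | (neighbors[pivot] - clustered)
        clustered |= cluster
        clusters.append(cluster)
    return clusters

def disagreements(nodes, positive_edges, clusters):
    """Count positive pairs split across clusters plus negative pairs
    placed in the same cluster."""
    label = {v: i for i, cluster in enumerate(clusters) for v in cluster}
    positive = {frozenset(edge) for edge in positive_edges}
    ordered = list(nodes)
    count = 0
    for i, u in enumerate(ordered):
        for v in ordered[i + 1:]:
            same_cluster = label[u] == label[v]
            is_positive = frozenset((u, v)) in positive
            if same_cluster != is_positive:
                count += 1
    return count

# Hypothetical toy graph: two disjoint positive triangles.
nodes = [0, 1, 2, 3, 4, 5]
positive_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]
clusters = pivot_clustering(nodes, positive_edges)
```

On this toy graph every pivot recovers its own triangle, so the clustering has no disagreements regardless of the random order.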

* ICML 2021 (long talk)

Streaming Belief Propagation for Community Detection

Jun 10, 2021

Yuchen Wu, MohammadHossein Bateni, Andre Linhares, Filipe Miguel Goncalves de Almeida, Andrea Montanari, Ashkan Norouzi-Fard, Jakab Tardos

Abstract: The community detection problem requires clustering the nodes of a network into a small number of well-connected "communities". There has been substantial recent progress in characterizing the fundamental statistical limits of community detection under simple stochastic block models. However, in real-world applications, the network structure is typically dynamic, with nodes that join over time. In this setting, we would like a detection algorithm to perform only a limited number of updates at each node arrival. While standard voting approaches satisfy this constraint, it is unclear whether they exploit the network information optimally. We introduce a simple model for networks growing over time which we refer to as streaming stochastic block model (StSBM). Within this model, we prove that voting algorithms have fundamental limitations. We also develop a streaming belief-propagation (StreamBP) approach, for which we prove optimality in certain regimes. We validate our theoretical findings on synthetic and real data.

* 36 pages, 13 figures
