Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jan Schuchardt

Sampling-Free Privacy Accounting for Matrix Mechanisms under Random Allocation

Jan 29, 2026

Jan Schuchardt, Nikita Kalinin

Abstract:We study privacy amplification for differentially private model training with matrix factorization under random allocation (also known as the balls-in-bins model). Recent work by Choquette-Choo et al. (2025) proposes a sampling-based Monte Carlo approach to compute amplification parameters in this setting. However, their guarantees either only hold with some high probability or require random abstention by the mechanism. Furthermore, the required number of samples for ensuring $(ε,δ)$-DP is inversely proportional to $δ$. In contrast, we develop sampling-free bounds based on Rényi divergence and conditional composition. The former is facilitated by a dynamic programming formulation to efficiently compute the bounds. The latter complements it by offering stronger privacy guarantees for small $ε$, where Rényi divergence bounds inherently lead to an over-approximation. Our framework applies to arbitrary banded and non-banded matrices. Through numerical comparisons, we demonstrate the efficacy of our approach across a broad range of matrix mechanisms used in research and practice.

Via

Access Paper or Ask Questions

Privacy Amplification by Structured Subsampling for Deep Differentially Private Time Series Forecasting

Feb 04, 2025

Jan Schuchardt, Mina Dalirrooyfard, Jed Guzelkabaagac, Anderson Schneider, Yuriy Nevmyvaka, Stephan Günnemann

Abstract:Many forms of sensitive data, such as web traffic, mobility data, or hospital occupancy, are inherently sequential. The standard method for training machine learning models while ensuring privacy for units of sensitive information, such as individual hospital visits, is differentially private stochastic gradient descent (DP-SGD). However, we observe in this work that the formal guarantees of DP-SGD are incompatible with timeseries-specific tasks like forecasting, since they rely on the privacy amplification attained by training on small, unstructured batches sampled from an unstructured dataset. In contrast, batches for forecasting are generated by (1) sampling sequentially structured time series from a dataset, (2) sampling contiguous subsequences from these series, and (3) partitioning them into context and ground-truth forecast windows. We theoretically analyze the privacy amplification attained by this structured subsampling to enable the training of forecasting models with sound and tight event- and user-level privacy guarantees. Towards more private models, we additionally prove how data augmentation amplifies privacy in self-supervised training of sequence models. Our empirical evaluation demonstrates that amplification by structured subsampling enables the training of forecasting models with strong formal privacy guarantees.

Via

Access Paper or Ask Questions

Group Privacy Amplification and Unified Amplification by Subsampling for Rényi Differential Privacy

Mar 07, 2024

Jan Schuchardt, Mihail Stoian, Arthur Kosmala, Stephan Günnemann

Figure 1 for Group Privacy Amplification and Unified Amplification by Subsampling for Rényi Differential Privacy

Figure 2 for Group Privacy Amplification and Unified Amplification by Subsampling for Rényi Differential Privacy

Figure 3 for Group Privacy Amplification and Unified Amplification by Subsampling for Rényi Differential Privacy

Figure 4 for Group Privacy Amplification and Unified Amplification by Subsampling for Rényi Differential Privacy

Abstract:Differential privacy (DP) has various desirable properties, such as robustness to post-processing, group privacy, and amplification by subsampling, which can be derived independently of each other. Our goal is to determine whether stronger privacy guarantees can be obtained by considering multiple of these properties jointly. To this end, we focus on the combination of group privacy and amplification by subsampling. To provide guarantees that are amenable to machine learning algorithms, we conduct our analysis in the framework of R\'enyi-DP, which has more favorable composition properties than $(\epsilon,\delta)$-DP. As part of this analysis, we develop a unified framework for deriving amplification by subsampling guarantees for R\'enyi-DP, which represents the first such framework for a privacy accounting method and is of independent interest. We find that it not only lets us improve upon and generalize existing amplification results for R\'enyi-DP, but also derive provably tight group privacy amplification guarantees stronger than existing principles. These results establish the joint study of different DP properties as a promising research direction.

Via

Access Paper or Ask Questions

(Provable) Adversarial Robustness for Group Equivariant Tasks: Graphs, Point Clouds, Molecules, and More

Dec 05, 2023

Jan Schuchardt, Yan Scholten, Stephan Günnemann

Figure 1 for (Provable) Adversarial Robustness for Group Equivariant Tasks: Graphs, Point Clouds, Molecules, and More

Figure 2 for (Provable) Adversarial Robustness for Group Equivariant Tasks: Graphs, Point Clouds, Molecules, and More

Figure 3 for (Provable) Adversarial Robustness for Group Equivariant Tasks: Graphs, Point Clouds, Molecules, and More

Figure 4 for (Provable) Adversarial Robustness for Group Equivariant Tasks: Graphs, Point Clouds, Molecules, and More

Abstract:A machine learning model is traditionally considered robust if its prediction remains (almost) constant under input perturbations with small norm. However, real-world tasks like molecular property prediction or point cloud segmentation have inherent equivariances, such as rotation or permutation equivariance. In such tasks, even perturbations with large norm do not necessarily change an input's semantic content. Furthermore, there are perturbations for which a model's prediction explicitly needs to change. For the first time, we propose a sound notion of adversarial robustness that accounts for task equivariance. We then demonstrate that provable robustness can be achieved by (1) choosing a model that matches the task's equivariances (2) certifying traditional adversarial robustness. Certification methods are, however, unavailable for many models, such as those with continuous equivariances. We close this gap by developing the framework of equivariance-preserving randomized smoothing, which enables architecture-agnostic certification. We additionally derive the first architecture-specific graph edit distance certificates, i.e. sound robustness guarantees for isomorphism equivariant tasks like node classification. Overall, a sound notion of robustness is an important prerequisite for future work at the intersection of robust and geometric machine learning.

* Accepted at NeurIPS 2023

Via

Access Paper or Ask Questions

Hierarchical Randomized Smoothing

Oct 24, 2023

Yan Scholten, Jan Schuchardt, Aleksandar Bojchevski, Stephan Günnemann

Figure 1 for Hierarchical Randomized Smoothing

Figure 2 for Hierarchical Randomized Smoothing

Figure 3 for Hierarchical Randomized Smoothing

Figure 4 for Hierarchical Randomized Smoothing

Abstract:Real-world data is complex and often consists of objects that can be decomposed into multiple entities (e.g. images into pixels, graphs into interconnected nodes). Randomized smoothing is a powerful framework for making models provably robust against small changes to their inputs - by guaranteeing robustness of the majority vote when randomly adding noise before classification. Yet, certifying robustness on such complex data via randomized smoothing is challenging when adversaries do not arbitrarily perturb entire objects (e.g. images) but only a subset of their entities (e.g. pixels). As a solution, we introduce hierarchical randomized smoothing: We partially smooth objects by adding random noise only on a randomly selected subset of their entities. By adding noise in a more targeted manner than existing methods we obtain stronger robustness guarantees while maintaining high accuracy. We initialize hierarchical smoothing using different noising distributions, yielding novel robustness certificates for discrete and continuous domains. We experimentally demonstrate the importance of hierarchical smoothing in image and node classification, where it yields superior robustness-accuracy trade-offs. Overall, hierarchical smoothing is an important contribution towards models that are both - certifiably robust to perturbations and accurate.

Via

Access Paper or Ask Questions

Collective Robustness Certificates: Exploiting Interdependence in Graph Neural Networks

Feb 06, 2023

Jan Schuchardt, Aleksandar Bojchevski, Johannes Gasteiger, Stephan Günnemann

Figure 1 for Collective Robustness Certificates: Exploiting Interdependence in Graph Neural Networks

Figure 2 for Collective Robustness Certificates: Exploiting Interdependence in Graph Neural Networks

Figure 3 for Collective Robustness Certificates: Exploiting Interdependence in Graph Neural Networks

Figure 4 for Collective Robustness Certificates: Exploiting Interdependence in Graph Neural Networks

Abstract:In tasks like node classification, image segmentation, and named-entity recognition we have a classifier that simultaneously outputs multiple predictions (a vector of labels) based on a single input, i.e. a single graph, image, or document respectively. Existing adversarial robustness certificates consider each prediction independently and are thus overly pessimistic for such tasks. They implicitly assume that an adversary can use different perturbed inputs to attack different predictions, ignoring the fact that we have a single shared input. We propose the first collective robustness certificate which computes the number of predictions that are simultaneously guaranteed to remain stable under perturbation, i.e. cannot be attacked. We focus on Graph Neural Networks and leverage their locality property - perturbations only affect the predictions in a close neighborhood - to fuse multiple single-node certificates into a drastically stronger collective certificate. For example, on the Citeseer dataset our collective certificate for node classification increases the average number of certifiable feature perturbations from $7$ to $351$.

* Accepted at ICLR 2021 (https://openreview.net/forum?id=ULQdiUTHe3y). Uploaded to arxiv to fix Google Scholar indexing

Via

Access Paper or Ask Questions

Randomized Message-Interception Smoothing: Gray-box Certificates for Graph Neural Networks

Jan 05, 2023

Yan Scholten, Jan Schuchardt, Simon Geisler, Aleksandar Bojchevski, Stephan Günnemann

Figure 1 for Randomized Message-Interception Smoothing: Gray-box Certificates for Graph Neural Networks

Figure 2 for Randomized Message-Interception Smoothing: Gray-box Certificates for Graph Neural Networks

Figure 3 for Randomized Message-Interception Smoothing: Gray-box Certificates for Graph Neural Networks

Figure 4 for Randomized Message-Interception Smoothing: Gray-box Certificates for Graph Neural Networks

Abstract:Randomized smoothing is one of the most promising frameworks for certifying the adversarial robustness of machine learning models, including Graph Neural Networks (GNNs). Yet, existing randomized smoothing certificates for GNNs are overly pessimistic since they treat the model as a black box, ignoring the underlying architecture. To remedy this, we propose novel gray-box certificates that exploit the message-passing principle of GNNs: We randomly intercept messages and carefully analyze the probability that messages from adversarially controlled nodes reach their target nodes. Compared to existing certificates, we certify robustness to much stronger adversaries that control entire nodes in the graph and can arbitrarily manipulate node features. Our certificates provide stronger guarantees for attacks at larger distances, as messages from farther-away nodes are more likely to get intercepted. We demonstrate the effectiveness of our method on various models and datasets. Since our gray-box certificates consider the underlying graph structure, we can significantly improve certifiable robustness by applying graph sparsification.

Via

Access Paper or Ask Questions

Training Differentially Private Graph Neural Networks with Random Walk Sampling

Jan 02, 2023

Morgane Ayle, Jan Schuchardt, Lukas Gosch, Daniel Zügner, Stephan Günnemann

Abstract:Deep learning models are known to put the privacy of their training data at risk, which poses challenges for their safe and ethical release to the public. Differentially private stochastic gradient descent is the de facto standard for training neural networks without leaking sensitive information about the training data. However, applying it to models for graph-structured data poses a novel challenge: unlike with i.i.d. data, sensitive information about a node in a graph cannot only leak through its gradients, but also through the gradients of all nodes within a larger neighborhood. In practice, this limits privacy-preserving deep learning on graphs to very shallow graph neural networks. We propose to solve this issue by training graph neural networks on disjoint subgraphs of a given training graph. We develop three random-walk-based methods for generating such disjoint subgraphs and perform a careful analysis of the data-generating distributions to provide strong privacy guarantees. Through extensive experiments, we show that our method greatly outperforms the state-of-the-art baseline on three large graphs, and matches or outperforms it on four smaller ones.

* Accepted at the Trustworthy and Socially Responsible Machine Learning Workshop of NeurIPS 2022

Via

Access Paper or Ask Questions

Invariance-Aware Randomized Smoothing Certificates

Nov 25, 2022

Jan Schuchardt, Stephan Günnemann

Figure 1 for Invariance-Aware Randomized Smoothing Certificates

Figure 2 for Invariance-Aware Randomized Smoothing Certificates

Figure 3 for Invariance-Aware Randomized Smoothing Certificates

Figure 4 for Invariance-Aware Randomized Smoothing Certificates

Abstract:Building models that comply with the invariances inherent to different domains, such as invariance under translation or rotation, is a key aspect of applying machine learning to real world problems like molecular property prediction, medical imaging, protein folding or LiDAR classification. For the first time, we study how the invariances of a model can be leveraged to provably guarantee the robustness of its predictions. We propose a gray-box approach, enhancing the powerful black-box randomized smoothing technique with white-box knowledge about invariances. First, we develop gray-box certificates based on group orbits, which can be applied to arbitrary models with invariance under permutation and Euclidean isometries. Then, we derive provably tight gray-box certificates. We experimentally demonstrate that the provably tight certificates can offer much stronger guarantees, but that in practical scenarios the orbit-based method is a good approximation.

* Accepted at NeurIPS 2022

Via

Access Paper or Ask Questions

Localized Randomized Smoothing for Collective Robustness Certification

Oct 28, 2022

Jan Schuchardt, Tom Wollschläger, Aleksandar Bojchevski, Stephan Günnemann

Abstract:Models for image segmentation, node classification and many other tasks map a single input to multiple labels. By perturbing this single shared input (e.g. the image) an adversary can manipulate several predictions (e.g. misclassify several pixels). Collective robustness certification is the task of provably bounding the number of robust predictions under this threat model. The only dedicated method that goes beyond certifying each output independently is limited to strictly local models, where each prediction is associated with a small receptive field. We propose a more general collective robustness certificate for all types of models and further show that this approach is beneficial for the larger class of softly local models, where each output is dependent on the entire input but assigns different levels of importance to different input regions (e.g. based on their proximity in the image). The certificate is based on our novel localized randomized smoothing approach, where the random perturbation strength for different input regions is proportional to their importance for the outputs. Localized smoothing Pareto-dominates existing certificates on both image segmentation and node classification tasks, simultaneously offering higher accuracy and stronger guarantees.

* 9 pages

Via

Access Paper or Ask Questions