Pascal Germain

ULaval

Sample Compression for Continual Learning

Mar 13, 2025

Jacob Comeau, Mathieu Bazinet, Pascal Germain, Cem Subakan

Abstract: Continual learning algorithms aim to learn from a sequence of tasks, making the training distribution non-stationary. The majority of existing continual learning approaches in the literature rely on heuristics and do not provide learning guarantees for the continual learning setup. In this paper, we present a new method called 'Continual Pick-to-Learn' (CoP2L), which is able to retain the most representative samples for each task in an efficient way. The algorithm is adapted from the Pick-to-Learn algorithm, rooted in the sample compression theory. This allows us to provide high-confidence upper bounds on the generalization loss of the learned predictors, numerically computable after every update of the learned model. We also empirically show on several standard continual learning benchmarks that our algorithm is able to outperform standard experience replay, significantly mitigating catastrophic forgetting.

Sample Compression Hypernetworks: From Generalization Bounds to Meta-Learning

Oct 17, 2024

Benjamin Leblanc, Mathieu Bazinet, Nathaniel D'Amours, Alexandre Drouin, Pascal Germain

Abstract: Reconstruction functions are pivotal in sample compression theory, a framework for deriving tight generalization bounds. From a small sample of the training set (the compression set) and an optional stream of information (the message), they recover a predictor previously learned from the whole training set. While usually fixed, we propose to learn reconstruction functions. To facilitate the optimization and increase the expressiveness of the message, we derive a new sample compression generalization bound for real-valued messages. From this theoretical analysis, we then present a new hypernetwork architecture that outputs predictors with tight generalization guarantees when trained using an original meta-learning framework. The results of promising preliminary experiments are then reported.

* Accepted at the NeurIPS 2024 workshop on Compression in Machine Learning

Sample compression unleashed: New generalization bounds for real valued losses

Sep 26, 2024

Mathieu Bazinet, Valentina Zantedeschi, Pascal Germain

Abstract: The sample compression theory provides generalization guarantees for predictors that can be fully defined using a subset of the training dataset and a (short) message string, generally defined as a binary sequence. Previous works provided generalization bounds for the zero-one loss, which is restrictive, notably when applied to deep learning approaches. In this paper, we present a general framework for deriving new sample compression bounds that hold for real-valued losses. We empirically demonstrate the tightness of the bounds and their versatility by evaluating them on different types of models, e.g., neural networks and decision forests, trained with the Pick-To-Learn (P2L) meta-algorithm, which transforms the training method of any machine-learning predictor to yield sample-compressed predictors. In contrast to existing P2L bounds, ours are valid in the non-consistent case.
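
To make the idea of a sample-compressed predictor concrete, here is a toy sketch in the spirit of a pick-to-learn loop. It is not the authors' implementation: the 1-nearest-neighbour learner, the zero-one loss, the stopping rule and the names `nn_predict` and `pick_to_learn_toy` are all assumptions chosen for illustration. The loop grows a compression set one example at a time until the predictor built from that set alone is correct on every remaining example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Example = Tuple[Point, int]


def nn_predict(compression_set: Sequence[Example], x: Point, default: int = 0) -> int:
    """1-nearest-neighbour prediction that looks only at the compression set."""
    if not compression_set:
        return default  # hypothetical fallback label before any example is picked
    nearest = min(
        compression_set,
        key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], x)),
    )
    return nearest[1]


def pick_to_learn_toy(data: Sequence[Example]) -> List[int]:
    """Return indices of a compression set such that every example
    outside the set is classified correctly by the toy learner."""
    chosen: List[int] = []
    while True:
        subset = [data[i] for i in chosen]
        # Examples outside the compression set that the current predictor gets wrong.
        wrong = [
            i for i, (x, y) in enumerate(data)
            if i not in chosen and nn_predict(subset, x) != y
        ]
        if not wrong:
            return chosen
        # With the zero-one loss every mistake is equally bad, so take the first one.
        chosen.append(wrong[0])
```

On five one-dimensional points forming two well-separated clusters, this loop typically keeps one example per cluster; the smaller the retained set, the tighter a sample compression bound can be.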

Phoneme Discretized Saliency Maps for Explainable Detection of AI-Generated Voice

Jun 14, 2024

Shubham Gupta, Mirco Ravanelli, Pascal Germain, Cem Subakan

Abstract: In this paper, we propose Phoneme Discretized Saliency Maps (PDSM), a discretization algorithm for saliency maps that takes advantage of phoneme boundaries for explainable detection of AI-generated voice. We experimentally show with two different Text-to-Speech systems (i.e., Tacotron2 and FastSpeech2) that the proposed algorithm produces saliency maps that result in more faithful explanations compared to standard post-hoc explanation methods. Moreover, by associating the saliency maps to the phoneme representations, this methodology generates explanations that tend to be more understandable than standard saliency maps on magnitude spectrograms.

* Accepted to Interspeech 2024

Interpretability in Machine Learning: on the Interplay with Explainability, Predictive Performances and Models

Nov 20, 2023

Benjamin Leblanc, Pascal Germain

Abstract: Interpretability has recently gained attention in the field of machine learning, for it is crucial when it comes to high-stakes decisions or troubleshooting. This abstract concept is hard to grasp and has been associated, over time, with many labels and preconceived ideas. In this position paper, in order to clarify some misunderstandings regarding interpretability, we discuss its relationship with significant concepts in machine learning: explainability, predictive performances, and machine learning models. For instance, we challenge the idea that interpretability and explainability are substitutes for one another, or that a fixed degree of interpretability can be associated with a given machine learning model.

Statistical Guarantees for Variational Autoencoders using PAC-Bayesian Theory

Oct 07, 2023

Sokhna Diarra Mbacke, Florence Clerc, Pascal Germain

Abstract: Since their inception, Variational Autoencoders (VAEs) have become central in machine learning. Despite their widespread use, numerous questions regarding their theoretical properties remain open. Using PAC-Bayesian theory, this work develops statistical guarantees for VAEs. First, we derive the first PAC-Bayesian bound for posterior distributions conditioned on individual samples from the data-generating distribution. Then, we utilize this result to develop generalization guarantees for the VAE's reconstruction loss, as well as upper bounds on the distance between the input and the regenerated distributions. More importantly, we provide upper bounds on the Wasserstein distance between the input distribution and the distribution defined by the VAE's generative model.

* 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

Invariant Causal Set Covering Machines

Jun 07, 2023

Thibaud Godon, Baptiste Bauvin, Pascal Germain, Jacques Corbeil, Alexandre Drouin

Abstract: Rule-based models, such as decision trees, appeal to practitioners due to their interpretable nature. However, the learning algorithms that produce such models are often vulnerable to spurious associations and thus, they are not guaranteed to extract causally-relevant insights. In this work, we build on ideas from the invariant causal prediction literature to propose Invariant Causal Set Covering Machines, an extension of the classical Set Covering Machine algorithm for conjunctions/disjunctions of binary-valued rules that provably avoids spurious associations. We demonstrate both theoretically and empirically that our method can identify the causal parents of a variable of interest in polynomial time.
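
As background for the classical algorithm being extended, the following is a minimal sketch of a greedy set-cover loop that builds a conjunction of binary-valued rules. It illustrates only the classical idea, not the invariant causal variant proposed in the paper. The utility (negatives rejected minus a penalty times positives lost), the `penalty` parameter and the function names are assumptions made for this example.

```python
from typing import Callable, List, Sequence, Tuple

Rule = Callable[[Sequence[int]], bool]
Example = Tuple[Sequence[int], bool]


def greedy_conjunction(
    rules: Sequence[Rule],
    examples: Sequence[Example],
    penalty: float = 1.0,
) -> List[int]:
    """Greedily pick rule indices whose conjunction rejects the negative examples."""
    negatives = [x for x, y in examples if not y]
    positives = [x for x, y in examples if y]
    selected: List[int] = []

    def utility(i: int) -> float:
        # Negatives the rule rejects ("covers") minus penalised positives it rejects.
        covered = sum(1 for x in negatives if not rules[i](x))
        lost = sum(1 for x in positives if not rules[i](x))
        return covered - penalty * lost

    while negatives:
        candidates = [i for i in range(len(rules)) if i not in selected]
        if not candidates:
            break
        best = max(candidates, key=utility)
        if utility(best) <= 0:
            break  # no remaining rule is worth adding
        selected.append(best)
        # Keep only the examples that still pass the conjunction so far.
        negatives = [x for x in negatives if rules[best](x)]
        positives = [x for x in positives if rules[best](x)]
    return selected


def predict(rules: Sequence[Rule], selected: Sequence[int], x: Sequence[int]) -> bool:
    """The conjunction outputs True only if every selected rule accepts x."""
    return all(rules[i](x) for i in selected)
```

Because each iteration adds one rule and scans the data once per candidate, the loop runs in time polynomial in the number of rules and examples.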

PAC-Bayesian Generalization Bounds for Adversarial Generative Models

Feb 17, 2023

Sokhna Diarra Mbacke, Florence Clerc, Pascal Germain

Abstract: We extend PAC-Bayesian theory to generative models and develop generalization bounds for models based on the Wasserstein distance and the total variation distance. Our first result on the Wasserstein distance assumes the instance space is bounded, while our second result takes advantage of dimensionality reduction. Our results naturally apply to Wasserstein GANs and Energy-Based GANs, and our bounds provide new training objectives for these two. Although our work is mainly theoretical, we perform numerical experiments showing non-vacuous generalization bounds for Wasserstein GANs on synthetic datasets.
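
For readers unfamiliar with the metric these bounds are stated in, the sketch below computes the 1-Wasserstein distance in the simplest possible setting: two one-dimensional empirical samples of equal size, where the optimal transport plan matches sorted values. It illustrates the distance only, not the paper's bounds; the function name `wasserstein1_1d` is hypothetical.

```python
from typing import Sequence


def wasserstein1_1d(xs: Sequence[float], ys: Sequence[float]) -> float:
    """1-Wasserstein distance between two equal-size 1-D empirical samples.

    In one dimension the optimal coupling pairs the i-th smallest value of
    one sample with the i-th smallest value of the other.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)
```

For instance, shifting every point of a sample by one unit yields a distance of exactly one, regardless of the order in which the points are listed.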

A Greedy Algorithm for Building Compact Binary Activated Neural Networks

Sep 07, 2022

Benjamin Leblanc, Pascal Germain

Abstract: We study binary activated neural networks in the context of regression tasks, provide guarantees on the expressiveness of these particular networks and propose a greedy algorithm for building such networks. Aiming for predictors having small resource needs, the greedy approach does not need to fix in advance an architecture for the network: the network is built one layer at a time, one neuron at a time, leading to predictors that aren't needlessly wide and deep for a given task. Similarly to boosting algorithms, our approach guarantees a training loss reduction every time a neuron is added to a layer. This greatly differs from most binary activated neural networks training schemes that rely on stochastic gradient descent (circumventing the 0-almost-everywhere derivative problem of the binary activation function by surrogates such as the straight through estimator or continuous binarization). We show that our method provides compact and sparse predictors while obtaining similar performances to state-of-the-art methods for training binary activated networks.

Learning Aggregations of Binary Activated Neural Networks with Probabilities over Representations

Oct 29, 2021

Louis Fortier-Dubois, Gaël Letarte, Benjamin Leblanc, François Laviolette, Pascal Germain

Abstract: Considering a probability distribution over parameters is known as an efficient strategy to learn a neural network with non-differentiable activation functions. We study the expectation of a probabilistic neural network as a predictor by itself, focusing on the aggregation of binary activated neural networks with normal distributions over real-valued weights. Our work leverages a recent analysis derived from the PAC-Bayesian framework that derives tight generalization bounds and learning procedures for the expected output value of such an aggregation, which is given by an analytical expression. While the combinatorial nature of the latter has been circumvented by approximations in previous works, we show that the exact computation remains tractable for deep but narrow neural networks, thanks to a dynamic programming approach. This leads us to a peculiar bound minimization learning algorithm for binary activated neural networks, where the forward pass propagates probabilities over representations instead of activation values. A stochastic counterpart of this new neural network training scheme that scales to wider architectures is proposed.
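
To give a feel for why such an expectation can have a closed form, the sketch below treats the simplest case only: a single neuron with a sign activation in {-1, +1} whose weights follow a normal distribution with mean `w` and identity covariance. Under these assumptions (which are ours, not taken from the abstract), the pre-activation is Gaussian and the expected output is an error function. The code compares that closed form with a Monte Carlo estimate; the function names are hypothetical and the multi-layer dynamic programming approach of the paper is not reproduced.

```python
import math
import random
from typing import Sequence


def expected_sign(w: Sequence[float], x: Sequence[float]) -> float:
    """Closed form of E[sgn(v . x)] for v ~ N(w, I): erf(w.x / (sqrt(2) * ||x||))."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(xi * xi for xi in x))
    return math.erf(dot / (math.sqrt(2.0) * norm))


def monte_carlo_sign(
    w: Sequence[float], x: Sequence[float], n_samples: int = 20000, seed: int = 0
) -> float:
    """Estimate the same expectation by sampling weight vectors."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_samples):
        # Draw v ~ N(w, I) coordinate by coordinate and apply the sign activation.
        v_dot_x = sum((wi + rng.gauss(0.0, 1.0)) * xi for wi, xi in zip(w, x))
        total += 1 if v_dot_x > 0 else -1
    return total / n_samples
```

The two values should agree up to sampling noise, which shrinks as `n_samples` grows; the closed form is what makes the expected output differentiable in `w` even though the sign activation is not.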
