Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Driss Matrouf

LIA

The distribution of calibrated likelihood functions on the probability-likelihood Aitchison simplex

Sep 03, 2025

Paul-Gauthier Noé, Andreas Nautsch, Driss Matrouf, Pierre-Michel Bousquet, Jean-François Bonastre

Abstract:While calibration of probabilistic predictions has been widely studied, this paper rather addresses calibration of likelihood functions. This has been discussed, especially in biometrics, in cases with only two exhaustive and mutually exclusive hypotheses (classes) where likelihood functions can be written as log-likelihood-ratios (LLRs). After defining calibration for LLRs and its connection with the concept of weight-of-evidence, we present the idempotence property and its associated constraint on the distribution of the LLRs. Although these results have been known for decades, they have been limited to the binary case. Here, we extend them to cases with more than two hypotheses by using the Aitchison geometry of the simplex, which allows us to recover, in a vector form, the additive form of the Bayes' rule; extending therefore the LLR and the weight-of-evidence to any number of hypotheses. Especially, we extend the definition of calibration, the idempotence, and the constraint on the distribution of likelihood functions to this multiple hypotheses and multiclass counterpart of the LLR: the isometric-log-ratio transformed likelihood function. This work is mainly conceptual, but we still provide one application to machine learning by presenting a non-linear discriminant analysis where the discriminant components form a calibrated likelihood function over the classes, improving therefore the interpretability and the reliability of the method.

* Preprint. Under review

Via

Access Paper or Ask Questions

Hiding speaker's sex in speech using zero-evidence speaker representation in an analysis/synthesis pipeline

Nov 29, 2022

Paul-Gauthier Noé, Xiaoxiao Miao, Xin Wang, Junichi Yamagishi, Jean-François Bonastre, Driss Matrouf

Abstract:The use of modern vocoders in an analysis/synthesis pipeline allows us to investigate high-quality voice conversion that can be used for privacy purposes. Here, we propose to transform the speaker embedding and the pitch in order to hide the sex of the speaker. ECAPA-TDNN-based speaker representation fed into a HiFiGAN vocoder is protected using a neural-discriminant analysis approach, which is consistent with the zero-evidence concept of privacy. This approach significantly reduces the information in speech related to the speaker's sex while preserving speech content and some consistency in the resulting protected voices.

* Submitted to ICASSP 2023, not peer-reviewed yet

Via

Access Paper or Ask Questions

How to Leverage DNN-based speech enhancement for multi-channel speaker verification?

Oct 17, 2022

Sandipana Dowerah, Romain Serizel, Denis Jouvet, Mohammad Mohammadamini, Driss Matrouf

Figure 1 for How to Leverage DNN-based speech enhancement for multi-channel speaker verification?

Abstract:Speaker verification (SV) suffers from unsatisfactory performance in far-field scenarios due to environmental noise andthe adverse impact of room reverberation. This work presents a benchmark of multichannel speech enhancement for far-fieldspeaker verification. One approach is a deep neural network-based, and the other is a combination of deep neural network andsignal processing. We integrated a DNN architecture with signal processing techniques to carry out various experiments. Ourapproach is compared to the existing state-of-the-art approaches. We examine the importance of enrollment in pre-processing,which has been largely overlooked in previous studies. Experimental evaluation shows that pre-processing can improve the SVperformance as long as the enrollment files are processed similarly to the test data and that test and enrollment occur within similarSNR ranges. Considerable improvement is obtained on the generated and all the noise conditions of the VOiCES dataset.

* 4th International Conference on Advances in Signal Processing and Artificial Intelligence (ASPAI' 2022), Oct 2022, Corfu, Greece

Via

Access Paper or Ask Questions

A bridge between features and evidence for binary attribute-driven perfect privacy

Oct 12, 2021

Paul-Gauthier Noé, Andreas Nautsch, Driss Matrouf, Pierre-Michel Bousquet, Jean-François Bonastre

Figure 1 for A bridge between features and evidence for binary attribute-driven perfect privacy

Figure 2 for A bridge between features and evidence for binary attribute-driven perfect privacy

Figure 3 for A bridge between features and evidence for binary attribute-driven perfect privacy

Figure 4 for A bridge between features and evidence for binary attribute-driven perfect privacy

Abstract:Attribute-driven privacy aims to conceal a single user's attribute, contrary to anonymisation that tries to hide the full identity of the user in some data. When the attribute to protect from malicious inferences is binary, perfect privacy requires the log-likelihood-ratio to be zero resulting in no strength-of-evidence. This work presents an approach based on normalizing flow that maps a feature vector into a latent space where the strength-of-evidence, related to the binary attribute, and an independent residual are disentangled. It can be seen as a non-linear discriminant analysis where the mapping is invertible allowing generation by mapping the latent variable back to the original space. This framework allows to manipulate the log-likelihood-ratio of the data and thus to set it to zero for privacy. We show the applicability of the approach on an attribute-driven privacy task where the sex information is removed from speaker embeddings. Results on VoxCeleb2 dataset show the efficiency of the method that outperforms in terms of privacy and utility our previous experiments based on adversarial disentanglement.

* Submitted to ICASSP 2022

Via

Access Paper or Ask Questions

Adversarial Disentanglement of Speaker Representation for Attribute-Driven Privacy Preservation

Dec 08, 2020

Paul-Gauthier Noé, Mohammad Mohammadamini, Driss Matrouf, Titouan Parcollet, Jean-François Bonastre

Figure 1 for Adversarial Disentanglement of Speaker Representation for Attribute-Driven Privacy Preservation

Figure 2 for Adversarial Disentanglement of Speaker Representation for Attribute-Driven Privacy Preservation

Figure 3 for Adversarial Disentanglement of Speaker Representation for Attribute-Driven Privacy Preservation

Figure 4 for Adversarial Disentanglement of Speaker Representation for Attribute-Driven Privacy Preservation

Abstract:With the increasing interest over speech technologies, numerous Automatic Speaker Verification (ASV) systems are employed to perform person identification. In the latter context, the systems rely on neural embeddings as a speaker representation. Nonetheless, such representations may contain privacy sensitive information about the speakers (e.g. age, sex, ethnicity, ...). In this paper, we introduce the concept of attribute-driven privacy preservation that enables a person to hide one or a few personal aspects to the authentication component. As a first solution we define an adversarial autoencoding method that disentangles a given speaker attribute from its neural representation. The proposed approach is assessed with a focus on the sex attribute. Experiments carried out using the VoxCeleb data sets have shown that the defined model enables the manipulation (i.e. variation or hiding) of this attribute while preserving good ASV performance.

Via

Access Paper or Ask Questions

Data augmentation versus noise compensation for x- vector speaker recognition systems in noisy environments

Jun 29, 2020

Mohammad Mohammadamini, Driss Matrouf

Figure 1 for Data augmentation versus noise compensation for x- vector speaker recognition systems in noisy environments

Figure 2 for Data augmentation versus noise compensation for x- vector speaker recognition systems in noisy environments

Figure 3 for Data augmentation versus noise compensation for x- vector speaker recognition systems in noisy environments

Figure 4 for Data augmentation versus noise compensation for x- vector speaker recognition systems in noisy environments

Abstract:The explosion of available speech data and new speaker modeling methods based on deep neural networks (DNN) have given the ability to develop more robust speaker recognition systems. Among DNN speaker modelling techniques, x-vector system has shown a degree of robustness in noisy environments. Previous studies suggest that by increasing the number of speakers in the training data and using data augmentation more robust speaker recognition systems are achievable in noisy environments. In this work, we want to know if explicit noise compensation techniques continue to be effective despite the general noise robustness of these systems. For this study, we will use two different x-vector networks: the first one is trained on Voxceleb1 (Protocol1), and the second one is trained on Voxceleb1+Voxveleb2 (Protocol2). We propose to add a denoising x-vector subsystem before scoring. Experimental results show that, the x-vector system used in Protocol2 is more robust than the other one used Protocol1. Despite this observation we will show that explicit noise compensation gives almost the same EER relative gain in both protocols. For example, in the Protocol2 we have 21% to 66% improvement of EER with denoising techniques.

Via

Access Paper or Ask Questions