Firas Laakom

Fairness Overfitting in Machine Learning: An Information-Theoretic Perspective

Jun 09, 2025

Firas Laakom, Haobo Chen, Jürgen Schmidhuber, Yuheng Bu

Abstract:Despite substantial progress in promoting fairness in high-stakes applications using machine learning models, existing methods often modify the training process, such as through regularizers or other interventions, but lack formal guarantees that fairness achieved during training will generalize to unseen data. Although overfitting with respect to prediction performance has been extensively studied, overfitting in terms of fairness loss has received far less attention. This paper proposes a theoretical framework for analyzing fairness generalization error through an information-theoretic lens. Our novel bounding technique is based on the Efron-Stein inequality, which allows us to derive tight information-theoretic fairness generalization bounds with both Mutual Information (MI) and Conditional Mutual Information (CMI). Our empirical results validate the tightness and practical relevance of these bounds across diverse fairness-aware learning algorithms. Our framework offers valuable insights to guide the design of algorithms improving fairness generalization.

* 38 pages

FACTS: A Factored State-Space Framework For World Modelling

Oct 28, 2024

Li Nanbo, Firas Laakom, Yucheng Xu, Wenyi Wang, Jürgen Schmidhuber

Abstract:World modelling is essential for understanding and predicting the dynamics of complex systems by learning both spatial and temporal dependencies. However, current frameworks, such as Transformers and selective state-space models like Mambas, exhibit limitations in efficiently encoding spatial and temporal structures, particularly in scenarios requiring long-term high-dimensional sequence modelling. To address these issues, we propose a novel recurrent framework, the FACTored State-space (FACTS) model, for spatial-temporal world modelling. The FACTS framework constructs a graph-structured memory with a routing mechanism that learns permutable memory representations, ensuring invariance to input permutations while adapting through selective state-space propagation. Furthermore, FACTS supports parallel computation of high-dimensional sequences. We empirically evaluate FACTS across diverse tasks, including multivariate time series forecasting and object-centric world modelling, demonstrating that it consistently outperforms or matches specialised state-of-the-art models, despite its general-purpose world modelling design.

* Code released in https://github.com/NanboLi/FACTS

Pixel-Wise Color Constancy via Smoothness Techniques in Multi-Illuminant Scenes

Feb 05, 2024

Umut Cem Entok, Firas Laakom, Farhad Pakdaman, Moncef Gabbouj

Abstract:Most scenes are illuminated by several light sources, where the traditional assumption of uniform illumination is invalid. This issue is ignored in most color constancy methods, primarily due to the complex spatial impact of multiple light sources on the image. Moreover, most existing multi-illuminant methods fail to preserve the smooth change of illumination, which stems from spatial dependencies in natural images. Motivated by this, we propose a novel multi-illuminant color constancy method that learns pixel-wise illumination maps caused by multiple light sources. The proposed method enforces smoothness among neighboring pixels by regularizing the training with the total variation loss. Moreover, a bilateral filter is applied to further enhance the natural appearance of the estimated images while preserving the edges. Additionally, we propose a label-smoothing technique that enables the model to generalize well despite the uncertainties in ground truth. Quantitative and qualitative experiments demonstrate that the proposed method outperforms the state-of-the-art.
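
To illustrate the kind of smoothness penalty the abstract refers to, here is a minimal sketch of an anisotropic total variation term on a single-channel 2D map. It is not the authors' implementation: the function name `total_variation` and the toy maps are assumptions made for illustration, and a real illumination map would have three color channels and be computed inside a training loop.

```python
def total_variation(grid):
    """Anisotropic total variation of a 2D grid (list of equal-length rows).

    Sums the absolute differences between each entry and its right and
    bottom neighbors. Smooth maps give small values, abrupt maps large ones.
    """
    rows, cols = len(grid), len(grid[0])
    tv = 0.0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # horizontal neighbor
                tv += abs(grid[r][c + 1] - grid[r][c])
            if r + 1 < rows:  # vertical neighbor
                tv += abs(grid[r + 1][c] - grid[r][c])
    return tv


# Hypothetical toy maps: one varies gradually, the other jumps between pixels.
smooth_map = [[0.0, 0.1, 0.2], [0.1, 0.2, 0.3]]
noisy_map = [[0.0, 0.9, 0.1], [0.8, 0.0, 0.7]]

print(total_variation(smooth_map) < total_variation(noisy_map))
```

Adding such a term to a training loss, scaled by a weight, penalizes maps whose neighboring pixels disagree, which is the general idea behind total variation regularization.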

Class-wise Generalization Error: an Information-Theoretic Analysis

Jan 05, 2024

Firas Laakom, Yuheng Bu, Moncef Gabbouj

Abstract:Existing generalization theories of supervised learning typically take a holistic approach and provide bounds for the expected generalization over the whole data distribution, which implicitly assumes that the model generalizes similarly for all the classes. In practice, however, there are significant variations in generalization performance among different classes, which cannot be captured by the existing generalization bounds. In this work, we tackle this problem by theoretically studying the class-generalization error, which quantifies the generalization performance of each individual class. We derive a novel information-theoretic bound for class-generalization error using the KL divergence, and we further obtain several tighter bounds using the conditional mutual information (CMI), which are significantly easier to estimate in practice. We empirically validate our proposed bounds in different neural networks and show that they accurately capture the complex class-generalization error behavior. Moreover, we show that the theoretical tools developed in this paper can be applied in several applications beyond this context.

* 26 pages

Newton Method-based Subspace Support Vector Data Description

Sep 25, 2023

Fahad Sohrab, Firas Laakom, Moncef Gabbouj

Abstract:In this paper, we present an adaptation of Newton's method for the optimization of Subspace Support Vector Data Description (S-SVDD). The objective of S-SVDD is to map the original data to a subspace optimized for one-class classification, and the iterative optimization process of data mapping and description in S-SVDD relies on gradient descent. However, gradient descent only utilizes first-order information, which may lead to suboptimal results. To address this limitation, we leverage Newton's method to enhance data mapping and data description for an improved optimization of subspace learning-based one-class classification. By incorporating this auxiliary information, Newton's method offers a more efficient strategy for subspace learning in one-class classification as compared to gradient-based optimization. The paper discusses the limitations of gradient descent and the advantages of using Newton's method in subspace learning for one-class classification tasks. We provide both linear and nonlinear formulations of Newton's method-based optimization for S-SVDD. In our experiments, we explored both the minimization and maximization strategies of the objective. The results demonstrate that the proposed optimization strategy outperforms the gradient-based S-SVDD in most cases.

* 8 pages, 2 figures, 2 tables, 1 Algorithm. Accepted at IEEE Symposium Series on Computational Intelligence 2023
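
The contrast between first-order and second-order updates mentioned in the abstract can be sketched on a toy problem. The example below is not S-SVDD: it minimizes the hypothetical one-dimensional objective f(x) = exp(x) - 2x, whose minimizer is x = ln 2, and the step size, tolerance, and function names are all assumptions chosen for illustration.

```python
import math


def grad(x):
    # First derivative of the toy objective f(x) = exp(x) - 2x.
    return math.exp(x) - 2.0


def hess(x):
    # Second derivative of the toy objective.
    return math.exp(x)


def gradient_descent(x, lr=0.1, tol=1e-8, max_iter=10000):
    """Plain gradient descent; returns the final point and the step count."""
    steps = 0
    while abs(grad(x)) > tol and steps < max_iter:
        x -= lr * grad(x)
        steps += 1
    return x, steps


def newton(x, tol=1e-8, max_iter=10000):
    """Newton's method: the gradient step is rescaled by the curvature."""
    steps = 0
    while abs(grad(x)) > tol and steps < max_iter:
        x -= grad(x) / hess(x)
        steps += 1
    return x, steps


x_gd, steps_gd = gradient_descent(0.0)
x_nt, steps_nt = newton(0.0)
print(steps_gd, steps_nt)
```

On this toy objective both methods reach the same minimizer, but Newton's method needs fewer iterations because it uses curvature information. Whether that advantage carries over to a given problem depends on the cost and conditioning of the Hessian.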

Convolutional autoencoder-based multimodal one-class classification

Sep 25, 2023

Firas Laakom, Fahad Sohrab, Jenni Raitoharju, Alexandros Iosifidis, Moncef Gabbouj

Abstract:One-class classification refers to approaches of learning using data from a single class only. In this paper, we propose a deep learning one-class classification method suitable for multimodal data, which relies on two convolutional autoencoders jointly trained to reconstruct the positive input data while keeping the data representations in the latent space as compact as possible. During inference, the distance of the latent representation of an input to the origin can be used as an anomaly score. Experimental results using a multimodal macroinvertebrate image classification dataset show that the proposed multimodal method yields better results as compared to the unimodal approach. Furthermore, we study the effect of different input image sizes, and we investigate how recently proposed feature diversity regularizers affect the performance of our approach. We show that such regularizers improve performance.

* 5 pages, 1 figure, 4 tables
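
The scoring rule described in the abstract, the distance of a latent representation to the origin, can be sketched as follows. This is a minimal illustration rather than the authors' code: the latent vectors are hard-coded stand-ins for encoder outputs, and the threshold value and function names are hypothetical.

```python
import math


def anomaly_score(latent):
    """Euclidean distance of a latent vector to the origin."""
    return math.sqrt(sum(v * v for v in latent))


def is_anomalous(latent, threshold):
    """Flag an input whose latent code lies farther from the origin
    than the chosen threshold (hypothetical decision rule)."""
    return anomaly_score(latent) > threshold


# Stand-ins for encoder outputs: a compact code and a far-away code.
compact_code = [0.1, -0.2, 0.05]
distant_code = [3.0, 4.0, 0.0]

THRESHOLD = 1.0  # assumed value; in practice tuned on validation data

print(is_anomalous(compact_code, THRESHOLD), is_anomalous(distant_code, THRESHOLD))
```

If training pulls the representations of positive samples toward the origin, a large norm at inference time suggests the input does not resemble the training class.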

On Feature Diversity in Energy-based Models

Jun 02, 2023

Firas Laakom, Jenni Raitoharju, Alexandros Iosifidis, Moncef Gabbouj

Abstract:Energy-based learning is a powerful learning paradigm that encapsulates various discriminative and generative approaches. An energy-based model (EBM) is typically formed of inner-model(s) that learn a combination of the different features to generate an energy mapping for each input configuration. In this paper, we focus on the diversity of the produced feature set. We extend the probably approximately correct (PAC) theory of EBMs and analyze the effect of redundancy reduction on the performance of EBMs. We derive generalization bounds for various learning contexts, i.e., regression, classification, and implicit regression, with different energy functions, and we show that reducing the redundancy of the feature set can consistently decrease the gap between the true and empirical expectation of the energy and boost the performance of the model.

* 18 pages, 3 figures

WLD-Reg: A Data-dependent Within-layer Diversity Regularizer

Jan 03, 2023

Firas Laakom, Jenni Raitoharju, Alexandros Iosifidis, Moncef Gabbouj

Abstract:Neural networks are composed of multiple layers arranged in a hierarchical structure jointly trained with a gradient-based optimization, where the errors are back-propagated from the last layer back to the first one. At each optimization step, neurons at a given layer receive feedback from neurons belonging to higher layers of the hierarchy. In this paper, we propose to complement this traditional 'between-layer' feedback with additional 'within-layer' feedback to encourage the diversity of the activations within the same layer. To this end, we measure the pairwise similarity between the outputs of the neurons and use it to model the layer's overall diversity. We present an extensive empirical study confirming that the proposed approach enhances the performance of several state-of-the-art neural network models in multiple tasks. The code is publicly available at https://github.com/firasl/AAAI-23-WLD-Reg.

* accepted at AAAI 2023. arXiv admin note: substantial text overlap with arXiv:2106.06012

Efficient CNN with uncorrelated Bag of Features pooling

Sep 22, 2022

Firas Laakom, Jenni Raitoharju, Alexandros Iosifidis, Moncef Gabbouj

Abstract:Despite the superior performance of CNNs, deploying them on devices with low computational power is still limited, as they are typically computationally expensive. One key cause of the high complexity is the connection between the convolution layers and the fully connected layers, which typically requires a high number of parameters. To alleviate this issue, Bag of Features (BoF) pooling has been recently proposed. BoF learns a dictionary that is used to compile a histogram representation of the input. In this paper, we propose an approach that builds on top of BoF pooling to boost its efficiency by ensuring that the items of the learned dictionary are non-redundant. We propose an additional loss term, based on the pair-wise correlation of the items of the dictionary, which complements the standard loss to explicitly regularize the model to learn a more diverse and rich dictionary. The proposed strategy yields an efficient variant of BoF and further boosts its performance, without any additional parameters.

* 6 pages, 2 Figures
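
A loss term based on pairwise correlation between dictionary items can be sketched as below. This is an assumption-laden illustration, not the paper's exact formulation: it uses the mean squared Pearson correlation over all pairs of items, and the names `pearson` and `correlation_penalty` as well as the toy dictionaries are hypothetical.

```python
import math
from itertools import combinations


def pearson(u, v):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return cov / (norm_u * norm_v)


def correlation_penalty(dictionary):
    """Mean squared correlation over all pairs of dictionary items.

    Requires at least two items. The value is near 0 when items are
    uncorrelated and near 1 when they are redundant.
    """
    pairs = list(combinations(dictionary, 2))
    return sum(pearson(u, v) ** 2 for u, v in pairs) / len(pairs)


# Toy dictionaries: the second item is either a rescaled copy or uncorrelated.
redundant = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
diverse = [[1.0, 2.0, 3.0], [1.0, -2.0, 1.0]]

print(correlation_penalty(redundant), correlation_penalty(diverse))
```

Added to the task loss with a weighting factor, a penalty of this kind pushes the learned items apart without introducing extra parameters, since it is computed only from the dictionary itself.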

Non-Linear Spectral Dimensionality Reduction Under Uncertainty

Feb 09, 2022

Firas Laakom, Jenni Raitoharju, Nikolaos Passalis, Alexandros Iosifidis, Moncef Gabbouj

Abstract:In this paper, we consider the problem of non-linear dimensionality reduction under uncertainty, from both theoretical and algorithmic perspectives. Since real-world data usually contain measurements with uncertainties and artifacts, the input space in the proposed framework consists of probability distributions to model the uncertainties associated with each sample. We propose a new dimensionality reduction framework, called NGEU, which leverages uncertainty information and directly extends several traditional approaches, e.g., KPCA, MDA/KMFA, to receive as inputs the probability distributions instead of the original data. We show that the proposed NGEU formulation exhibits a global closed-form solution, and we analyze, based on the Rademacher complexity, how the underlying uncertainties theoretically affect the generalization ability of the framework. Empirical results on different datasets show the effectiveness of the proposed framework.

* 10 pages, 3 figures
