Abstract:Algorithmic fairness has become a central concern in computational decision-making systems, where ensuring equitable outcomes is essential for both ethical and legal reasons. Two dominant notions of fairness have emerged in the literature: Group Fairness (GF), which focuses on mitigating disparities across demographic subpopulations, and Individual Fairness (IF), which emphasizes consistent treatment of similar individuals. These notions have traditionally been studied in isolation. In contrast, this survey examines methods that jointly address GF and IF, integrating both perspectives within unified frameworks and explicitly characterizing the trade-offs between them. We provide a systematic and critical review of hybrid fairness approaches, organizing existing methods according to the fairness mechanisms they employ and the algorithmic and mathematical strategies used to reconcile multiple fairness criteria. For each class of methods, we examine their theoretical foundations, optimization mechanisms, and empirical evaluation practices, and discuss their limitations. Additionally, we discuss the challenges and identify open research directions for developing principled, context-aware hybrid fairness methods. By synthesizing insights across the literature, this survey aims to serve as a comprehensive resource for researchers and practitioners seeking to design hybrid algorithms that provide reliable fairness guarantees at both the individual and group levels.
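As a rough illustration of the two notions named in the abstract above, the sketch below computes one common group-level measure (the gap in positive-prediction rates between groups) and one common individual-level measure (the share of pairs that break a Lipschitz-style "similar individuals, similar scores" condition). The function names, the Euclidean distance, and the constant `L` are assumptions made for this example; the survey does not prescribe these particular metrics.

```python
from itertools import combinations


def demographic_parity_gap(preds, groups):
    """Largest difference in positive-prediction rate between any two groups.

    preds: 0/1 predictions; groups: group label for each prediction.
    """
    rates = {}
    for g in set(groups):
        members = [p for p, gg in zip(preds, groups) if gg == g]
        rates[g] = sum(members) / len(members)
    return max(rates.values()) - min(rates.values())


def lipschitz_violation_rate(scores, features, L=1.0):
    """Fraction of pairs whose score gap exceeds L times their feature distance.

    Euclidean distance is an assumption; any task-specific similarity
    metric could be substituted.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    pairs = list(combinations(range(len(scores)), 2))
    bad = sum(
        1
        for i, j in pairs
        if abs(scores[i] - scores[j]) > L * dist(features[i], features[j]) + 1e-12
    )
    return bad / len(pairs)


if __name__ == "__main__":
    print(demographic_parity_gap([1, 1, 0, 0], ["a", "a", "b", "b"]))
    print(lipschitz_violation_rate([0.0, 1.0], [(0.0,), (0.5,)]))
```

A hybrid method in the sense of the survey would try to keep both quantities small at once; the two functions here only measure them and do not enforce anything.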
Abstract:Machine Learning algorithms are ubiquitous in key decision-making contexts such as justice, healthcare and finance, which has spawned a great demand for fairness in these procedures. However, the theoretical properties of such models in relation to fairness are still poorly understood, and the intuition behind the relationship between group and individual fairness is still lacking. In this paper, we provide a theoretical framework based on Sheaf Diffusion to leverage tools based on dynamical systems and homology to model fairness. Concretely, the proposed method projects input data into a bias-free space that encodes fairness constraints, resulting in fair solutions. Furthermore, we present a collection of network topologies handling different fairness metrics, leading to a unified method capable of dealing with both individual and group bias. The resulting models have a layer of interpretability in the form of closed-form expressions for their SHAP values, consolidating their place in the responsible Artificial Intelligence landscape. Finally, these intuitions are tested in a simulation study and on standard fairness benchmarks, where the proposed methods achieve satisfactory results. More concretely, the paper showcases the performance of the proposed models in terms of accuracy and fairness, studying available trade-offs on the Pareto frontier, checking the effects of changing the different hyper-parameters, and delving into the interpretation of their outputs.
Abstract:Machine Learning algorithms are ubiquitous in key decision-making contexts such as organizational justice or healthcare, which has spawned a great demand for fairness in these procedures. In this paper we focus on the application of fair ML in finance, more concretely on the use of fairness techniques in credit scoring. This paper makes two contributions. On the one hand, it addresses the existing gap concerning the application of established methods in the literature to the case of multiple sensitive variables through the use of a new technique called logical processors (LP). On the other hand, it also explores the novel method of multistage processors (MP) to investigate whether the combination of fairness methods can work synergistically to produce solutions with improved fairness or accuracy. Furthermore, we examine the intersection of these two lines of research by exploring the integration of fairness methods in the multivariate case. The results are very promising and suggest that logical processors are an appropriate way of handling multiple sensitive variables. Furthermore, multistage processors are capable of improving the performance of existing methods.
Abstract:Feature Selection is a crucial procedure in Data Science tasks such as Classification, since it identifies the relevant variables, thus making the classification procedures more interpretable, cheaper in terms of measurement and more effective by reducing noise and overfitting. The relevance of features in a classification procedure is linked to the fact that misclassification costs are frequently asymmetric, since false positive and false negative cases may have very different consequences. However, off-the-shelf Feature Selection procedures seldom take into account such cost-sensitivity of errors. In this paper we propose a mathematical-optimization-based Feature Selection procedure embedded in one of the most popular classification procedures, namely, Support Vector Machines, accommodating asymmetric misclassification costs. The key idea is to replace the traditional margin maximization by minimizing the number of features selected, while imposing upper bounds on the false positive and false negative rates. The problem is written as an integer linear problem plus a quadratic convex problem for Support Vector Machines with both linear and radial kernels. The reported numerical experience demonstrates the usefulness of the proposed Feature Selection procedure. Indeed, our results on benchmark data sets show that a substantial decrease in the number of features is obtained, whilst the desired trade-off between false positive and false negative rates is achieved.
Abstract:Support Vector Machine (SVM) is a powerful tool in binary classification, known to attain excellent misclassification rates. On the other hand, many real-world classification problems, such as those found in medical diagnosis, churn or fraud prediction, involve misclassification costs which may differ between classes. However, it may be hard for the user to provide precise values for such misclassification costs, whereas it may be much easier to identify acceptable misclassification rate values. In this paper we propose a novel SVM model in which misclassification costs are considered by incorporating performance constraints in the problem formulation. Specifically, our aim is to seek the hyperplane with maximal margin yielding misclassification rates below given threshold values. Such a maximal-margin hyperplane is obtained by solving a quadratic convex problem with linear constraints and integer variables. The reported numerical experience shows that our model gives the user control over the misclassification rates in one class (possibly at the expense of an increase in misclassification rates for the other class) and is feasible in terms of running times.
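The trade-off described in the abstract above (controlling the error rate in one class, possibly at the expense of the other) can be illustrated with a much simpler stand-in. The sketch below does not solve the constrained quadratic problem with integer variables; it only shifts the cut-off applied to already computed classifier scores until a required true positive rate is met, and then reports the resulting true negative rate. The function names, the labels `+1`/`-1`, and the toy scores are assumptions made for this example.

```python
def tpr_tnr(y_true, y_pred):
    """True positive rate and true negative rate for labels in {+1, -1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return tp / pos, tn / neg


def cutoff_for_min_tpr(scores, y_true, min_tpr):
    """Highest cut-off c such that predicting +1 when score >= c gives TPR >= min_tpr.

    This is a post-hoc threshold shift, not the constrained SVM of the abstract.
    """
    for c in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= c else -1 for s in scores]
        tpr, tnr = tpr_tnr(y_true, y_pred)
        if tpr >= min_tpr:
            return c, tpr, tnr
    raise ValueError("min_tpr must be at most 1")


if __name__ == "__main__":
    # Hypothetical decision scores and labels.
    scores = [2.0, 0.5, -0.2, -1.0, 0.1, -2.0]
    labels = [1, 1, 1, -1, -1, -1]
    print(cutoff_for_min_tpr(scores, labels, min_tpr=0.6))
    print(cutoff_for_min_tpr(scores, labels, min_tpr=1.0))
```

Raising the required true positive rate from 0.6 to 1.0 on this toy data lowers the cut-off and reduces the true negative rate, which is the kind of cost the abstract refers to.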
Abstract:Support vector machines (SVMs) are widely used and are among the most thoroughly studied machine learning models for two-class classification. Classification in SVM is based on a score procedure, yielding a deterministic classification rule, which can be transformed into a probabilistic rule (as implemented in off-the-shelf SVM libraries), but is not probabilistic in nature. On the other hand, the tuning of the regularization parameters in SVM is known to imply a high computational effort and generates pieces of information that are not fully exploited, not being used to build a probabilistic classification rule. In this paper we propose a novel approach to generate probabilistic outputs for the SVM. The new method has the following three properties. First, it is designed to be cost-sensitive, and thus the different importance of sensitivity (or true positive rate, TPR) and specificity (true negative rate, TNR) is readily accommodated in the model. As a result, the model can deal with imbalanced datasets, which are common in operational business problems such as churn prediction or credit scoring. Second, the SVM is embedded in an ensemble method to improve its performance, making use of the valuable information generated in the parameter-tuning process. Finally, the probability estimation is done via bootstrap estimates, avoiding the parametric models used by competing approaches. Numerical tests on a wide range of datasets show the advantages of our approach over benchmark procedures.
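The idea of obtaining probabilities from bootstrap estimates rather than from a parametric model can be sketched as follows. This is a minimal illustration under strong assumptions, not the method of the paper: a nearest-centroid rule stands in for the SVM so that the example needs no external libraries, there is no cost-sensitivity or parameter tuning, and all names (`fit_centroids`, `bootstrap_probability`, `n_boot`) are hypothetical. The estimated probability of the positive class is simply the fraction of bootstrap-trained classifiers that predict `+1` for the new point.

```python
import random


def fit_centroids(X, y):
    """Per-class mean vectors; a stand-in for training an SVM (assumption)."""
    cents = {}
    for label in (-1, 1):
        pts = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(c) / len(pts) for c in zip(*pts)]
    return cents


def predict(cents, x):
    """Deterministic rule: assign the class of the closest centroid."""
    d = {lab: sum((a - b) ** 2 for a, b in zip(x, c)) for lab, c in cents.items()}
    return 1 if d[1] < d[-1] else -1


def bootstrap_probability(X, y, x_new, n_boot=200, seed=0):
    """Fraction of bootstrap-trained classifiers that predict +1 for x_new."""
    rng = random.Random(seed)
    n = len(X)
    votes = 0
    done = 0
    while done < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # resample lacks one class; draw again
        cents = fit_centroids([X[i] for i in idx], yb)
        votes += predict(cents, x_new) == 1
        done += 1
    return votes / n_boot


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated clusters.
    X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    y = [-1, -1, -1, 1, 1, 1]
    print(bootstrap_probability(X, y, (5.5, 5.5)))
    print(bootstrap_probability(X, y, (3.0, 3.0)))
```

Points deep inside one cluster receive estimates at the extremes, while points near the boundary receive intermediate values because the bootstrap classifiers disagree on them.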