Hyungjun Joo

Constructing Fair Latent Space for Intersection of Fairness and Explainability

Dec 23, 2024

Hyungjun Joo, Hyeonggeun Han, Sehwan Kim, Sangwoo Hong, Jungwoo Lee

Abstract: As the use of machine learning models has increased, numerous studies have aimed to enhance fairness. However, research on the intersection of fairness and explainability remains insufficient, which can make it harder to gain the trust of actual users. Here, we propose a novel module that constructs a fair latent space, enabling faithful explanation while ensuring fairness. The fair latent space is constructed by disentangling and redistributing labels and sensitive attributes, allowing the generation of counterfactual explanations for each type of information. Our module is attached to a pretrained generative model, transforming its biased latent space into a fair latent space. Because only the module needs to be trained, and not the entire generative model, the approach saves training time and cost. We validate the fair latent space with various fairness metrics and demonstrate that our approach can effectively provide explanations for biased decisions and assurances of fairness.

* 14 pages, 5 figures, accepted at AAAI 2025

Mitigating Spurious Correlations via Disagreement Probability

Nov 04, 2024

Hyeonggeun Han, Sehwan Kim, Hyungjun Joo, Sangwoo Hong, Jungwoo Lee

Abstract: Models trained with empirical risk minimization (ERM) are prone to bias towards spurious correlations between target labels and bias attributes, which leads to poor performance on data groups lacking spurious correlations. It is particularly challenging to address this problem when access to bias labels is not permitted. To mitigate the effect of spurious correlations without bias labels, we first introduce a novel training objective designed to robustly enhance model performance across all data samples, irrespective of the presence of spurious correlations. From this objective, we then derive a debiasing method, Disagreement Probability based Resampling for debiasing (DPR), which does not require bias labels. DPR leverages the disagreement between the target label and the prediction of a biased model to identify bias-conflicting samples (those without spurious correlations) and upsamples them according to the disagreement probability. Empirical evaluations on multiple benchmarks demonstrate that DPR achieves state-of-the-art performance over existing baselines that do not use bias labels. Furthermore, we provide a theoretical analysis that details how DPR reduces dependency on spurious correlations.
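
The resampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the disagreement probability of a sample is one minus the probability that a biased model assigns to the target label, and that sampling weights are proportional to it. The function names, the `eps` smoothing term, and the toy inputs are all hypothetical.

```python
import random


def disagreement_probabilities(biased_probs, labels):
    """Assumed definition: 1 - p_biased(target label | x) for each sample."""
    return [1.0 - probs[y] for probs, y in zip(biased_probs, labels)]


def resampling_weights(biased_probs, labels, eps=1e-8):
    """Normalize disagreement probabilities into sampling weights.

    eps (a hypothetical smoothing term) keeps every sample reachable
    and avoids dividing by zero when all disagreements are zero.
    """
    d = [p + eps for p in disagreement_probabilities(biased_probs, labels)]
    total = sum(d)
    return [x / total for x in d]


def resample_indices(biased_probs, labels, n_samples, seed=0):
    """Draw sample indices with replacement, favoring samples the biased
    model disagrees with (likely bias-conflicting samples)."""
    rng = random.Random(seed)
    weights = resampling_weights(biased_probs, labels)
    return rng.choices(range(len(labels)), weights=weights, k=n_samples)


# Toy example: class probabilities from a biased model, and target labels.
biased_probs = [[0.9, 0.1], [0.2, 0.8], [0.95, 0.05]]
labels = [0, 0, 1]
weights = resampling_weights(biased_probs, labels)
indices = resample_indices(biased_probs, labels, n_samples=10)
```

In this toy example the first sample agrees with the biased model and receives the smallest weight, while the other two, where the biased model favors the wrong class, are drawn more often. A debiased model would then be trained on the resampled indices.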
