Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Insung Kong

Fair Representation Learning for Continuous Sensitive Attributes using Expectation of Integral Probability Metrics

May 09, 2025

Insung Kong, Kunwoong Kim, Yongdai Kim

Abstract:AI fairness, also known as algorithmic fairness, aims to ensure that algorithms operate without bias or discrimination towards any individual or group. Among various AI algorithms, the Fair Representation Learning (FRL) approach has gained significant interest in recent years. However, existing FRL algorithms have a limitation: they are primarily designed for categorical sensitive attributes and thus cannot be applied to continuous sensitive attributes, such as age or income. In this paper, we propose an FRL algorithm for continuous sensitive attributes. First, we introduce a measure called the Expectation of Integral Probability Metrics (EIPM) to assess the fairness level of representation space for continuous sensitive attributes. We demonstrate that if the distribution of the representation has a low EIPM value, then any prediction head constructed on the top of the representation become fair, regardless of the selection of the prediction head. Furthermore, EIPM possesses a distinguished advantage in that it can be accurately estimated using our proposed estimator with finite samples. Based on these properties, we propose a new FRL algorithm called Fair Representation using EIPM with MMD (FREM). Experimental evidences show that FREM outperforms other baseline methods.

* 42 pages, 30 figures. IEEE Transactions on Pattern Analysis and Machine Intelligence (2025)

Via

Access Paper or Ask Questions

On the expressivity of deep Heaviside networks

Apr 30, 2025

Insung Kong, Juntong Chen, Sophie Langer, Johannes Schmidt-Hieber

Abstract:We show that deep Heaviside networks (DHNs) have limited expressiveness but that this can be overcome by including either skip connections or neurons with linear activation. We provide lower and upper bounds for the Vapnik-Chervonenkis (VC) dimensions and approximation rates of these network classes. As an application, we derive statistical convergence rates for DHN fits in the nonparametric regression model.

* 61 pages, 16 figures

Via

Access Paper or Ask Questions

ReLU integral probability metric and its applications

Apr 26, 2025

Yuha Park, Kunwoong Kim, Insung Kong, Yongdai Kim

Abstract:We propose a parametric integral probability metric (IPM) to measure the discrepancy between two probability measures. The proposed IPM leverages a specific parametric family of discriminators, such as single-node neural networks with ReLU activation, to effectively distinguish between distributions, making it applicable in high-dimensional settings. By optimizing over the parameters of the chosen discriminator class, the proposed IPM demonstrates that its estimators have good convergence rates and can serve as a surrogate for other IPMs that use smooth nonparametric discriminator classes. We present an efficient algorithm for practical computation, offering a simple implementation and requiring fewer hyperparameters. Furthermore, we explore its applications in various tasks, such as covariate balancing for causal inference and fair representation learning. Across such diverse applications, we demonstrate that the proposed IPM provides strong theoretical guarantees, and empirical experiments show that it achieves comparable or even superior performance to other methods.

* 49 pages, 9 figures

Via

Access Paper or Ask Questions

Tensor Product Neural Networks for Functional ANOVA Model

Feb 24, 2025

Seokhun Park, Insung Kong, Yongchan Choi, Chanmoo Park, Yongdai Kim

Abstract:Interpretability for machine learning models is becoming more and more important as machine learning models become more complex. The functional ANOVA model, which decomposes a high-dimensional function into a sum of lower dimensional functions so called components, is one of the most popular tools for interpretable AI, and recently, various neural network models have been developed for estimating each component in the functional ANOVA model. However, such neural networks are highly unstable when estimating components since the components themselves are not uniquely defined. That is, there are multiple functional ANOVA decompositions for a given function. In this paper, we propose a novel interpretable model which guarantees a unique functional ANOVA decomposition and thus is able to estimate each component stably. We call our proposed model ANOVA-NODE since it is a modification of Neural Oblivious Decision Ensembles (NODE) for the functional ANOVA model. Theoretically, we prove that ANOVA-NODE can approximate a smooth function well. Additionally, we experimentally show that ANOVA-NODE provides much more stable estimation of each component and thus much more stable interpretation when training data and initial values of the model parameters vary than existing neural network models do.

* 45 pages

Via

Access Paper or Ask Questions

Fairness Through Matching

Jan 06, 2025

Kunwoong Kim, Insung Kong, Jongjin Lee, Minwoo Chae, Sangchul Park, Yongdai Kim

Abstract:Group fairness requires that different protected groups, characterized by a given sensitive attribute, receive equal outcomes overall. Typically, the level of group fairness is measured by the statistical gap between predictions from different protected groups. In this study, we reveal an implicit property of existing group fairness measures, which provides an insight into how the group-fair models behave. Then, we develop a new group-fair constraint based on this implicit property to learn group-fair models. To do so, we first introduce a notable theoretical observation: every group-fair model has an implicitly corresponding transport map between the input spaces of each protected group. Based on this observation, we introduce a new group fairness measure termed Matched Demographic Parity (MDP), which quantifies the averaged gap between predictions of two individuals (from different protected groups) matched by a given transport map. Then, we prove that any transport map can be used in MDP to learn group-fair models, and develop a novel algorithm called Fairness Through Matching (FTM), which learns a group-fair model using MDP constraint with an user-specified transport map. We specifically propose two favorable types of transport maps for MDP, based on the optimal transport theory, and discuss their advantages. Experiments reveal that FTM successfully trains group-fair models with certain desirable properties by choosing the transport map accordingly.

* Published in TMLR

Via

Access Paper or Ask Questions

Posterior concentrations of fully-connected Bayesian neural networks with general priors on the weights

Mar 21, 2024

Insung Kong, Yongdai Kim

Abstract:Bayesian approaches for training deep neural networks (BNNs) have received significant interest and have been effectively utilized in a wide range of applications. There have been several studies on the properties of posterior concentrations of BNNs. However, most of these studies only demonstrate results in BNN models with sparse or heavy-tailed priors. Surprisingly, no theoretical results currently exist for BNNs using Gaussian priors, which are the most commonly used one. The lack of theory arises from the absence of approximation results of Deep Neural Networks (DNNs) that are non-sparse and have bounded parameters. In this paper, we present a new approximation theory for non-sparse DNNs with bounded parameters. Additionally, based on the approximation theory, we show that BNNs with non-sparse general priors can achieve near-minimax optimal posterior concentration rates to the true model.

Via

Access Paper or Ask Questions

Enhancing Adversarial Robustness in Low-Label Regime via Adaptively Weighted Regularization and Knowledge Distillation

Aug 08, 2023

Dongyoon Yang, Insung Kong, Yongdai Kim

Abstract:Adversarial robustness is a research area that has recently received a lot of attention in the quest for trustworthy artificial intelligence. However, recent works on adversarial robustness have focused on supervised learning where it is assumed that labeled data is plentiful. In this paper, we investigate semi-supervised adversarial training where labeled data is scarce. We derive two upper bounds for the robust risk and propose a regularization term for unlabeled data motivated by these two upper bounds. Then, we develop a semi-supervised adversarial training algorithm that combines the proposed regularization term with knowledge distillation using a semi-supervised teacher (i.e., a teacher model trained using a semi-supervised learning algorithm). Our experiments show that our proposed algorithm achieves state-of-the-art performance with significant margins compared to existing algorithms. In particular, compared to supervised learning algorithms, performance of our proposed algorithm is not much worse even when the amount of labeled data is very small. For example, our algorithm with only 8\% labeled data is comparable to supervised adversarial training algorithms that use all labeled data, both in terms of standard and robust accuracies on CIFAR-10.

* 9 pages - Manuscript, 6 pages - Appendix, Accepted in ICCV 2023

Via

Access Paper or Ask Questions

Masked Bayesian Neural Networks : Theoretical Guarantee and its Posterior Inference

May 24, 2023

Insung Kong, Dongyoon Yang, Jongjin Lee, Ilsang Ohn, Gyuseung Baek, Yongdai Kim

Abstract:Bayesian approaches for learning deep neural networks (BNN) have been received much attention and successfully applied to various applications. Particularly, BNNs have the merit of having better generalization ability as well as better uncertainty quantification. For the success of BNN, search an appropriate architecture of the neural networks is an important task, and various algorithms to find good sparse neural networks have been proposed. In this paper, we propose a new node-sparse BNN model which has good theoretical properties and is computationally feasible. We prove that the posterior concentration rate to the true model is near minimax optimal and adaptive to the smoothness of the true model. In particular the adaptiveness is the first of its kind for node-sparse BNNs. In addition, we develop a novel MCMC algorithm which makes the Bayesian inference of the node-sparse BNN model feasible in practice.

* 30 pages, ICML 2023 proceedings. arXiv admin note: substantial text overlap with arXiv:2206.00853

Via

Access Paper or Ask Questions

Covariate balancing using the integral probability metric for causal inference

May 23, 2023

Insung Kong, Yuha Park, Joonhyuk Jung, Kwonsang Lee, Yongdai Kim

Abstract:Weighting methods in causal inference have been widely used to achieve a desirable level of covariate balancing. However, the existing weighting methods have desirable theoretical properties only when a certain model, either the propensity score or outcome regression model, is correctly specified. In addition, the corresponding estimators do not behave well for finite samples due to large variance even when the model is correctly specified. In this paper, we consider to use the integral probability metric (IPM), which is a metric between two probability measures, for covariate balancing. Optimal weights are determined so that weighted empirical distributions for the treated and control groups have the smallest IPM value for a given set of discriminators. We prove that the corresponding estimator can be consistent without correctly specifying any model (neither the propensity score nor the outcome regression model). In addition, we empirically show that our proposed method outperforms existing weighting methods with large margins for finite samples.

* 32 pages, ICML 2023 proceedings

Via

Access Paper or Ask Questions

Adaptive Regularization for Adversarial Training

Jun 07, 2022

Dongyoon Yang, Insung Kong, Yongdai Kim

Figure 1 for Adaptive Regularization for Adversarial Training

Figure 2 for Adaptive Regularization for Adversarial Training

Figure 3 for Adaptive Regularization for Adversarial Training

Figure 4 for Adaptive Regularization for Adversarial Training

Abstract:Adversarial training, which is to enhance robustness against adversarial attacks, has received much attention because it is easy to generate human-imperceptible perturbations of data to deceive a given deep neural network. In this paper, we propose a new adversarial training algorithm that is theoretically well motivated and empirically superior to other existing algorithms. A novel feature of the proposed algorithm is to use a data-adaptive regularization for robustifying a prediction model. We apply more regularization to data which are more vulnerable to adversarial attacks and vice versa. Even though the idea of data-adaptive regularization is not new, our data-adaptive regularization has a firm theoretical base of reducing an upper bound of the robust risk. Numerical experiments illustrate that our proposed algorithm improves the generalization (accuracy on clean samples) and robustness (accuracy on adversarial attacks) simultaneously to achieve the state-of-the-art performance.

Via

Access Paper or Ask Questions