Rafael Bassi Stern

Conditional independence testing: a predictive perspective

Jul 31, 2019

Marco Henrique de Almeida Inácio, Rafael Izbicki, Rafael Bassi Stern

Abstract: Conditional independence testing is a key problem required by many machine learning and statistics tools. In particular, it is one way of evaluating the usefulness of some features in a supervised prediction problem. We propose a novel conditional independence test in a predictive setting, and show that it achieves better power than competing approaches in several settings. Our approach consists of deriving a p-value using a permutation test in which the predictive power obtained with the unpermuted dataset is compared with the predictive power obtained with a dataset where the feature(s) of interest are permuted. We conclude that the method achieves sensible results on simulated and real datasets.
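
The permutation idea in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the predictor (ordinary least squares), the loss (held-out mean squared error), the 50/50 split, the refit on every permuted copy, and all function names (`fit_ols`, `holdout_loss`, `permutation_p_value`) are assumptions made for illustration. The toy data also draw the feature of interest `x` independently of the other feature `z`, which keeps a plain shuffle of `x` reasonable here.

```python
import random

def fit_ols(rows, ys):
    """Least squares with intercept, solved by Gauss-Jordan elimination."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * y for x, y in zip(X, ys))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def holdout_loss(x, z, y, n_train):
    """Fit on the first n_train points, return MSE on the remaining ones."""
    rows = list(zip(x, z))
    beta = fit_ols(rows[:n_train], y[:n_train])
    errors = [(beta[0] + beta[1] * a + beta[2] * b - t) ** 2
              for (a, b), t in zip(rows[n_train:], y[n_train:])]
    return sum(errors) / len(errors)

def permutation_p_value(x, z, y, n_perm=200, seed=0):
    """Compare the loss with the real x against losses with shuffled x."""
    rng = random.Random(seed)
    n_train = len(y) // 2  # assumed 50/50 train/test split
    observed = holdout_loss(x, z, y, n_train)
    count = 0
    for _ in range(n_perm):
        x_perm = x[:]
        rng.shuffle(x_perm)  # breaks any link between x and y
        if holdout_loss(x_perm, z, y, n_train) <= observed:
            count += 1
    # The +1 terms count the unpermuted dataset among the permutations.
    return (1 + count) / (1 + n_perm)

# Toy data (hypothetical): y_dep depends on x given z, y_ind does not.
data_rng = random.Random(1)
n = 400
z = [data_rng.gauss(0, 1) for _ in range(n)]
x = [data_rng.gauss(0, 1) for _ in range(n)]
y_dep = [2.0 * a + b + data_rng.gauss(0, 1) for a, b in zip(x, z)]
y_ind = [b + data_rng.gauss(0, 1) for b in z]

p_dep = permutation_p_value(x, z, y_dep)
p_ind = permutation_p_value(x, z, y_ind)
print(p_dep, p_ind)
```

When `y` depends strongly on `x`, shuffling `x` should worsen the held-out loss in essentially every permutation, so `p_dep` comes out small. Under independence the observed loss is just one draw among the permuted ones, so `p_ind` carries no such guarantee.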

Quantification under prior probability shift: the ratio estimator and its extensions

Jul 11, 2018

Afonso Fernandes Vaz, Rafael Izbicki, Rafael Bassi Stern

Abstract: The quantification problem consists of determining the prevalence of a given label in a target population. However, one often has access to the labels in a sample from the training population but not in the target population. A common assumption in this situation is that of prior probability shift, that is, once the labels are known, the distribution of the features is the same in the training and target populations. In this paper, we derive a new lower bound for the risk of the quantification problem under the prior shift assumption. Complementing this lower bound, we present a new approximately minimax class of estimators, ratio estimators, which generalize several previous proposals in the literature. Using a weaker version of the prior shift assumption, which can be tested, we show that ratio estimators can be used to build confidence intervals for the quantification problem. We also extend the ratio estimator so that it can: (i) incorporate labels from the target population, when they are available, and (ii) estimate how the prevalence of positive labels varies according to a function of certain covariates.

* 23 pages, 6 figures
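
The abstract above does not state the estimator's formula, so the following is only a sketch of one estimator in this spirit. It relies on the fact that, under prior probability shift, any function g of the features satisfies E_target[g(X)] = θ·E[g(X) | Y=1] + (1−θ)·E[g(X) | Y=0], where θ is the target prevalence, so θ can be solved for as a ratio of differences of means. The function names, the choice of g, and the simulated Gaussian data are all assumptions made for illustration.

```python
import random

def ratio_estimator(g, train_x, train_y, target_x):
    """Plug-in estimate of the target prevalence of label 1.

    Solves mean_target(g) = theta * mu1 + (1 - theta) * mu0 for theta,
    with mu0 and mu1 estimated from the labeled training sample.
    """
    g0 = [g(x) for x, y in zip(train_x, train_y) if y == 0]
    g1 = [g(x) for x, y in zip(train_x, train_y) if y == 1]
    mu0 = sum(g0) / len(g0)
    mu1 = sum(g1) / len(g1)
    if mu1 == mu0:
        raise ValueError("g does not separate the two labels")
    mu_target = sum(g(x) for x in target_x) / len(target_x)
    theta = (mu_target - mu0) / (mu1 - mu0)
    return min(1.0, max(0.0, theta))  # clip to a valid proportion

def sample(rng, n, prevalence):
    """Hypothetical data: x | y ~ Normal(2y, 1) in both populations."""
    ys = [1 if rng.random() < prevalence else 0 for _ in range(n)]
    xs = [rng.gauss(2.0 * y, 1.0) for y in ys]
    return xs, ys

rng = random.Random(0)
train_x, train_y = sample(rng, 5000, 0.5)    # labeled training sample
target_x, _unused = sample(rng, 5000, 0.2)   # target labels are not used

def g(x):
    """Assumed choice of g: a fixed threshold classifier."""
    return 1.0 if x > 1.0 else 0.0

theta_hat = ratio_estimator(g, train_x, train_y, target_x)
print(theta_hat)
```

With a threshold classifier as g, this reduces to correcting the raw fraction of predicted positives in the target sample by the classifier's true and false positive rates measured on the training sample. In this simulation the true target prevalence is 0.2, so `theta_hat` should land near that value.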

Learning with many experts: model selection and sparsity

May 13, 2014

Rafael Izbicki, Rafael Bassi Stern

Abstract: Experts classifying data are often imprecise. Recently, several models have been proposed to train classifiers using the noisy labels generated by these experts. How should one choose between these models? In such situations, the true labels are unavailable. Thus, one cannot perform model selection using the standard versions of methods such as empirical risk minimization and cross validation. In order to allow model selection, we present a surrogate loss and provide theoretical guarantees that assure its consistency. Next, we discuss how this loss can be used to tune a penalization which introduces sparsity in the parameters of a traditional class of models. Sparsity provides more parsimonious models and can avoid overfitting. Nevertheless, it has seldom been discussed in the context of noisy labels due to the difficulty in model selection and, therefore, in choosing tuning parameters. We apply these techniques to several sets of simulated and real data.

* Izbicki, R., Stern, R. B. "Learning with many experts: Model selection and sparsity." Statistical Analysis and Data Mining 6.6 (2013): 565-577
* This is the pre-peer reviewed version
