Title: Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives

Jan 17, 2020

Antoine Dedieu, Hussein Hazimeh, Rahul Mazumder



Abstract: We consider a discrete optimization based approach for learning sparse classifiers, where the outcome depends upon a linear combination of a small subset of features. Recent work has shown that mixed integer programming (MIP) can be used to solve (to optimality) $\ell_0$-regularized problems at scales much larger than what was conventionally considered possible in the statistics and machine learning communities. Despite their usefulness, MIP-based approaches are significantly slower than relatively mature algorithms based on $\ell_1$-regularization and its relatives. We aim to bridge this computational gap by developing new MIP-based algorithms for $\ell_0$-regularized classification. We propose two classes of scalable algorithms: an exact algorithm that can handle $p \approx 50{,}000$ features in a few minutes, and approximate algorithms that can address instances with $p \approx 10^6$ in times comparable to fast $\ell_1$-based algorithms. Our exact algorithm is based on the novel idea of "integrality generation", which solves the original problem (with $p$ binary variables) via a sequence of mixed integer programs that involve a small number of binary variables. Our approximate algorithms are based on coordinate descent and local combinatorial search. In addition, we present new estimation error bounds for a class of $\ell_0$-regularized estimators. Experiments on real and synthetic data demonstrate that our approach leads to models with considerably improved statistical performance (especially variable selection) when compared to competing toolkits.
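To make the coordinate descent idea mentioned in the abstract concrete, here is a minimal sketch for logistic loss with an $\ell_0$ penalty plus an optional ridge term. It is not the authors' implementation: the objective form, the function name `l0_logistic_cd`, the parameters `lam0` and `lam2`, and the use of a quadratic upper bound on the loss per coordinate are all assumptions made for illustration, and the local combinatorial search step is omitted.

```python
import numpy as np

def l0_logistic_cd(X, y, lam0, lam2=0.0, n_sweeps=100, tol=1e-8):
    """Cyclic coordinate descent sketch (hypothetical helper) for
    sum_i log(1 + exp(-y_i x_i^T b)) + lam0 * ||b||_0 + lam2 * ||b||_2^2,
    with labels y in {-1, +1}."""
    n, p = X.shape
    beta = np.zeros(p)
    margins = np.zeros(n)              # y_i * x_i^T beta, kept up to date
    L = (X ** 2).sum(axis=0) / 4.0     # per-coordinate curvature bounds
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if L[j] == 0.0:
                continue               # all-zero column: nothing to fit
            # sigmoid(-margin), computed stably via logaddexp
            s = np.exp(-np.logaddexp(0.0, margins))
            g = -np.dot(y * X[:, j], s)            # partial derivative of the loss
            denom = L[j] + 2.0 * lam2
            t = (L[j] * beta[j] - g) / denom       # minimizer of the quadratic bound
            # Keep the coordinate only if it pays for the lam0 penalty.
            if abs(t) <= np.sqrt(2.0 * lam0 / denom):
                t = 0.0
            delta = t - beta[j]
            if delta != 0.0:
                margins += delta * y * X[:, j]
                beta[j] = t
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta
```

Each update minimizes an upper bound of the objective in one coordinate, so the objective never increases; the hard threshold is what produces exact zeros, in contrast to the soft threshold of $\ell_1$-based methods.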
