Title: Orthant Based Proximal Stochastic Gradient Method for $\ell_1$-Regularized Optimization

Apr 07, 2020

Tianyi Chen, Tianyu Ding, Bo Ji, Guanyi Wang, Yixin Shi, Sheng Yi, Xiao Tu, Zhihui Zhu

Abstract: Sparsity-inducing regularization problems are ubiquitous in machine learning applications, ranging from feature selection to model compression. In this paper, we present a novel stochastic method, the Orthant Based Proximal Stochastic Gradient Method (OBProx-SG), to solve perhaps the most popular instance, the $\ell_1$-regularized problem. The OBProx-SG method contains two steps: (i) a proximal stochastic gradient step to predict a support cover of the solution; and (ii) an orthant step to aggressively enhance the sparsity level via orthant face projection. Compared to state-of-the-art methods such as Prox-SG, RDA, and Prox-SVRG, OBProx-SG not only converges to globally optimal solutions (in the convex setting) or stationary points (in the non-convex setting), but also substantially promotes the sparsity of the solutions. In particular, on a large number of convex problems, OBProx-SG outperforms the existing methods in both sparsity exploration and objective values. Moreover, experiments on non-convex deep neural networks, e.g., MobileNetV1 and ResNet18, further demonstrate its superiority by achieving solutions of much higher sparsity without sacrificing generalization accuracy.
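The two steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the fixed schedule of `n_prox` proximal steps followed by `n_orthant` orthant steps, the step size, and the toy objective $f(x) = \tfrac{1}{2}\|x - b\|^2$ are all assumptions made for illustration, and the demo uses an exact gradient in place of a stochastic one so that the result is reproducible. The proximal step is the standard soft-thresholding update for $\lambda\|x\|_1$; the orthant step is sketched as a gradient step on the currently nonzero coordinates, followed by zeroing any coordinate whose sign flipped.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.| for a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def prox_sg_step(x, grad, lr, lam):
    """Proximal (stochastic) gradient step for f(x) + lam * ||x||_1."""
    return [soft_threshold(xi - lr * gi, lr * lam) for xi, gi in zip(x, grad)]


def orthant_step(x, grad, lr, lam):
    """Assumed form of the orthant step: move only the nonzero coordinates,
    then project onto the orthant face fixed by the signs of x."""
    out = []
    for xi, gi in zip(x, grad):
        if xi == 0.0:
            out.append(0.0)  # coordinates already at zero stay at zero
            continue
        sign = 1.0 if xi > 0 else -1.0
        # On this orthant, lam * |x_i| is smooth with derivative lam * sign.
        trial = xi - lr * (gi + lam * sign)
        # Projection: a coordinate that crossed zero is set to zero.
        out.append(trial if trial * sign > 0 else 0.0)
    return out


def run_demo(n_prox=20, n_orthant=20, lr=0.5, lam=0.1):
    """Toy problem (hypothetical): f(x) = 0.5 * ||x - b||^2, exact gradient."""
    b = [3.0, 0.05, -2.0, -0.02]
    x = [1.0, 1.0, 1.0, 1.0]
    grad_f = lambda z: [zi - bi for zi, bi in zip(z, b)]
    for _ in range(n_prox):
        x = prox_sg_step(x, grad_f(x), lr, lam)
    for _ in range(n_orthant):
        x = orthant_step(x, grad_f(x), lr, lam)
    return x


print(run_demo())
```

In this toy setting the coordinates of `b` that are smaller in magnitude than `lam` are driven exactly to zero, and the orthant step then never reintroduces them, which is the sparsity-preserving behaviour the abstract attributes to the orthant face projection.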
