Abstract:Multi-task learning (MTL) aims to improve estimation and prediction performance by sharing common information among related tasks. One natural assumption in MTL is that tasks are classified into clusters based on their characteristics. However, existing MTL methods based on this assumption often ignore outlier tasks that have large task-specific components or no relation to other tasks. To address this issue, we propose a novel MTL method called Multi-Task Learning via Robust Regularized Clustering (MTLRRC). MTLRRC incorporates robust regularization terms inspired by robust convex clustering, which is further extended to handle non-convex and group-sparse penalties. This extension allows MTLRRC to perform robust task clustering and outlier task detection simultaneously. We also establish a connection between the extended robust clustering and the multivariate M-estimator, which provides an interpretation of the robustness of MTLRRC against outlier tasks. An efficient algorithm based on a modified alternating direction method of multipliers is developed for parameter estimation. The effectiveness of MTLRRC is demonstrated through simulation studies and applications to real data.
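The robust-clustering idea above can be sketched numerically. The snippet below is a minimal, illustrative objective for robust convex clustering with explicit outlier components: each observation gets a centroid and an outlier vector, centroids are fused toward one another, and a group-sparse (row-wise L2) penalty flags outliers. The function name, the unweighted fusion term, and the exact penalty form are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

def robust_convex_clustering_objective(U, O, X, lam, gamma):
    """Illustrative robust convex clustering objective.
    U: (n, p) centroids, O: (n, p) outlier components, X: (n, p) data.
    Observations whose centroids coincide form a cluster; nonzero rows
    of O are flagged as outliers by the group-sparse penalty."""
    n = X.shape[0]
    fit = 0.5 * np.sum((X - U - O) ** 2)            # squared-error fit
    fusion = sum(np.linalg.norm(U[i] - U[j])        # pulls centroids together
                 for i in range(n) for j in range(i + 1, n))
    outlier = sum(np.linalg.norm(o) for o in O)     # group-sparse outlier term
    return fit + lam * fusion + gamma * outlier
```

Setting the centroids equal to the data isolates the fusion term, which makes the objective easy to check by hand.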
Abstract:Multi-task learning (MTL) is a methodology that aims to improve the overall performance of estimation and prediction by sharing common information among related tasks. In MTL, various assumptions are made about the relationships among tasks, along with methods to incorporate them. One natural assumption in practical situations is that tasks are classified into clusters according to their characteristics. Under this assumption, the group fused regularization approach clusters tasks by shrinking the differences among task parameters, which enables the transfer of common information within the same cluster. However, this approach also transfers information between different clusters, which worsens estimation and prediction. To overcome this problem, we propose an MTL method with a centroid parameter representing the cluster center of each task. Because this model separates the parameters for regression from the parameters for clustering, we can improve the estimation and prediction accuracy of the regression coefficient vectors. We show the effectiveness of the proposed method through Monte Carlo simulations and applications to real data.
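The group fused regularization mechanism described above can be written down in a few lines. The sketch below shows a least-squares MTL objective with a pairwise fusion penalty that shrinks differences between task coefficient vectors; the function name, the uniform (unweighted) pairwise penalty, and the squared-error loss are illustrative assumptions rather than the paper's exact specification.

```python
import numpy as np

def mtl_fused_objective(W, X_list, y_list, lam):
    """Least-squares loss per task plus a group fused penalty.
    W: (T, p) array whose rows are the task-wise coefficient vectors;
    X_list, y_list: per-task design matrices and responses."""
    loss = sum(np.sum((y - X @ w) ** 2)
               for w, X, y in zip(W, X_list, y_list))
    # Group fused regularization: L2 norms of pairwise row differences.
    # Driving a difference to zero merges two tasks into one cluster.
    T = W.shape[0]
    penalty = sum(np.linalg.norm(W[j] - W[k])
                  for j in range(T) for k in range(j + 1, T))
    return 0.5 * loss + lam * penalty
```

When all task vectors coincide the penalty vanishes, which is exactly the "one cluster" limit the abstract refers to.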
Abstract:The multivariate regression model is basically designed for the analysis of a single dataset with multiple responses. However, such a single-dataset analysis often leads to unsatisfactory results. Integrative analysis is an effective way to pool useful information from multiple independent datasets and provides better performance than single-dataset analysis. In this study, we propose a multivariate regression modeling procedure for integrative analysis. The integration is achieved by sparse estimation that performs variable and group selection. Based on the alternating direction method of multipliers, we develop a computational algorithm that enjoys a convergence property. The performance of the proposed method is demonstrated through Monte Carlo simulations and an analysis of wastewater treatment data with microbe measurements.
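The "variable and group selection" achieved by the sparse estimation above can be illustrated with a sparse-group-style penalty: an L2 term on each variable's row of coefficients (across the stacked datasets) can remove a variable everywhere at once, while an L1 term zeroes individual effects. The function name and this particular penalty form are assumptions for illustration, not the paper's exact penalty.

```python
import numpy as np

def sparse_group_penalty(B, lam1, lam2):
    """Illustrative penalty combining group and individual sparsity.
    B: (p, m) coefficient matrix; row i collects variable i's effects
    across the m stacked responses/datasets."""
    group = sum(np.linalg.norm(row) for row in B)  # row-wise L2: group selection
    individual = np.abs(B).sum()                   # entry-wise L1: variable selection
    return lam1 * group + lam2 * individual
```

A row of exact zeros contributes nothing to either term, which is what makes whole-variable removal possible.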
Abstract:Network lasso is a method for solving multi-task learning problems through the regularized maximum likelihood method. A characteristic of network lasso is that it sets a different model for each sample, with the relationships among the models represented by relational coefficients. A crucial issue in network lasso is how to provide appropriate values for these relational coefficients. In this paper, we propose a Bayesian approach to solving multi-task learning problems via network lasso. This approach allows us to determine the relational coefficients objectively by Bayesian estimation. The effectiveness of the proposed method is shown in a simulation study and a real data analysis.
Abstract:We consider the problem of extracting a common structure from multiple tensor datasets. To obtain the common structure, we propose multilinear common component analysis (MCCA) based on Kronecker products of mode-wise covariance matrices. MCCA constructs a common basis represented by linear combinations of the original variables while retaining as much information from the multiple tensor datasets as possible. We also develop an estimation algorithm for MCCA that guarantees mode-wise global convergence. Numerical studies are conducted to show the effectiveness of MCCA.
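The "Kronecker products of mode-wise covariance matrices" ingredient can be sketched for the simplest case of order-2 tensors (matrices): estimate one covariance per mode and combine them with a Kronecker product into a separable joint covariance. The function names, the unnormalized averaging, and the restriction to two modes are illustrative assumptions, not MCCA itself.

```python
import numpy as np

def modewise_covariances(tensors):
    """Mode-wise covariance estimates for a list of (r, c) matrices:
    mode-1 covariance averages X X^T, mode-2 covariance averages X^T X."""
    S1 = sum(X @ X.T for X in tensors) / len(tensors)
    S2 = sum(X.T @ X for X in tensors) / len(tensors)
    return S1, S2

def kronecker_covariance(tensors):
    """Separable (Kronecker-structured) covariance built from the
    mode-wise estimates; shape is (r*c, r*c)."""
    S1, S2 = modewise_covariances(tensors)
    return np.kron(S2, S1)
```

The Kronecker structure keeps the parameter count at r^2 + c^2 instead of (rc)^2, which is the usual motivation for mode-wise modeling.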
Abstract:In linear regression models, a fusion of the coefficients is used to identify predictors having similar relationships with the response. This is called variable fusion. This paper presents a novel variable fusion method based on Bayesian linear regression models. We focus on hierarchical Bayesian models built on a spike-and-slab prior approach, where the spike-and-slab prior is designed to perform variable fusion. To obtain estimates of the parameters, we develop a Gibbs sampler. Simulation studies and a real data analysis show that our proposed method performs better than previous methods.
Abstract:Principal component regression (PCR) is a two-stage procedure: the first stage performs principal component analysis (PCA), and the second stage constructs a regression model whose explanatory variables are the principal components obtained in the first stage. Since PCA uses only the explanatory variables, the principal components carry no information about the response variable. To address this problem, we propose a one-stage procedure for PCR based on a singular value decomposition approach. Our approach combines two loss functions, a regression loss and a PCA loss, with sparse regularization. The proposed method enables us to obtain principal component loadings that possess information about both the explanatory variables and the response variable. An estimation algorithm is developed using the alternating direction method of multipliers. We conduct numerical studies to show the effectiveness of the proposed method.
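The one-stage construction above, a regression loss plus a PCA loss with sparse regularization, can be written as a single objective. The sketch below uses a convex combination of the two losses and an L1 penalty on the loadings; the weighting scheme, the function name, and the exact penalty are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def spcr_objective(beta, V, X, y, w, lam):
    """Illustrative one-stage PCR objective.
    V: (p, k) loading matrix, beta: (k,) regression coefficients on the
    PC scores X V; w in (0, 1) balances regression loss vs PCA loss."""
    scores = X @ V
    reg_loss = np.sum((y - scores @ beta) ** 2)      # response fit
    pca_loss = np.sum((X - scores @ V.T) ** 2)       # reconstruction of X
    return w * reg_loss + (1 - w) * pca_loss + lam * np.abs(V).sum()
```

Because both losses share V, minimizing the combined objective steers the loadings toward directions that explain X and predict y at the same time.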
Abstract:Sparse convex clustering clusters observations and conducts variable selection simultaneously within the framework of convex clustering. The weighted $L_1$ norm is usually employed as the regularization term in sparse convex clustering, but this increases the dependence on the data and reduces the estimation accuracy when the sample size is insufficient. To tackle these problems, this paper proposes a Bayesian sparse convex clustering method based on the ideas of the Bayesian lasso and global-local shrinkage priors. We introduce Gibbs sampling algorithms for our method using scale mixtures of normal distributions. The effectiveness of the proposed methods is shown in simulation studies and a real data analysis.
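For reference, the underlying convex clustering objective (before the sparse or Bayesian extensions above) is easy to state: each observation gets its own centroid, and a weighted fusion penalty pulls centroids together so that coinciding centroids define clusters. The function name and argument layout below are illustrative assumptions.

```python
import numpy as np

def convex_clustering_objective(U, X, lam, weights):
    """Convex clustering objective: squared-error fit of centroids U
    to observations X plus a weighted fusion penalty on centroid pairs.
    weights: (n, n) symmetric nonnegative pairwise weights."""
    n = X.shape[0]
    fit = 0.5 * np.sum((X - U) ** 2)
    fusion = sum(weights[i, j] * np.linalg.norm(U[i] - U[j])
                 for i in range(n) for j in range(i + 1, n))
    return fit + lam * fusion
```

As lam grows, more centroid pairs fuse, tracing out a full clustering path from n singleton clusters down to one.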
Abstract:We consider the problem of constructing a reduced-rank regression model whose coefficient parameter is represented as a singular value decomposition with sparse singular vectors. The traditional estimation procedure for the coefficient parameter often fails when the true rank of the parameter is high. To overcome this issue, we develop an estimation algorithm with rank and variable selection via sparse regularization and manifold optimization, which enables us to obtain an accurate estimate of the coefficient parameter even when its true rank is high. Using sparse regularization, we can also select an optimal value of the rank. We conduct Monte Carlo experiments and a real data analysis to illustrate the effectiveness of our proposed method.
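For context, the "traditional estimation procedure" can be sketched as classical reduced-rank regression: take the least-squares coefficient matrix and project the fitted values onto their top singular directions. This is the textbook baseline, not the paper's sparse manifold-optimization algorithm, and the function name is an assumption.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Classical reduced-rank regression sketch.
    X: (n, p) design, Y: (n, m) responses; returns a (p, m) coefficient
    matrix of rank at most `rank`."""
    B_ols = np.linalg.pinv(X) @ Y                     # unrestricted LS fit
    U, s, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    # Project fitted values onto their top-`rank` right singular directions.
    P = Vt[:rank].T @ Vt[:rank]
    return B_ols @ P
```

When the true coefficient matrix already has the chosen rank and there is no noise, the projection leaves the least-squares estimate unchanged.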
Abstract:Principal component regression (PCR) is a widely used two-stage procedure: principal component analysis (PCA), followed by regression in which the selected principal components serve as new explanatory variables. Note that PCA is based only on the explanatory variables, so the principal components are not selected using information on the response variable. In this paper, we propose a one-stage procedure for PCR in the framework of generalized linear models. The basic loss function combines the regression loss and the PCA loss. An estimate of the regression parameter is obtained as the minimizer of the basic loss function with a sparse penalty. We call the proposed method sparse principal component regression for generalized linear models (SPCR-glm). By taking the two loss functions into consideration simultaneously, SPCR-glm obtains sparse principal component loadings that are related to the response variable. A combination of loss functions may cause a parameter identification problem, but this potential problem is avoided by virtue of the sparse penalty, which thus plays two roles in this method. The parameter estimation procedure uses various update algorithms together with the coordinate descent algorithm. We apply SPCR-glm to two real datasets, doctor visits data and mouse consomic strain data. SPCR-glm provides more easily interpretable principal component (PC) scores and clearer classification on PC plots than ordinary PCA.