Abstract:This paper investigates a new online learning problem with doubly-streaming data, where the data streams are described by feature spaces that constantly evolve, with new features emerging and old features fading away. The challenges of this problem are twofold: 1) Data samples ceaselessly flowing in may carry shifted patterns over time, requiring learners to update and adapt on the fly. 2) Newly emerging features are described by very few samples, resulting in weak learners that tend to make erroneous predictions. A plausible idea to overcome these challenges is to establish a relationship between the pre- and post-evolving feature spaces, so that an online learner can leverage the knowledge learned from the old features to improve its performance on the new features. Unfortunately, this idea does not scale up to high-dimensional media streams with complex feature interplay, which suffer from a tradeoff between onlineness (favoring shallow learners) and expressiveness (requiring deep learners). Motivated by this, we propose a novel OLD^3S paradigm, where a shared latent subspace is discovered to summarize information from the old and new feature spaces, building an intermediate feature mapping relationship. A key trait of OLD^3S is to treat the model capacity as learnable semantics, yielding the optimal model depth and parameters jointly, in accordance with the complexity and non-linearity of the input data streams in an online fashion. Both theoretical analyses and empirical studies substantiate the viability and effectiveness of our proposal.
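To make the intermediate-mapping idea concrete, here is a minimal sketch (not the authors' OLD^3S implementation) of relating an old and a new feature view through a shared latent subspace learned during an overlapping period; the linear, SVD-based construction and all variable names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X_old = rng.normal(size=(200, 50))                 # old feature view (overlap period)
X_new = X_old @ rng.normal(size=(50, 80))          # new feature view, synthetically linked

k = 10                                             # latent dimensionality
# Shared latent subspace from the concatenated views (leading left singular vectors).
Z = np.linalg.svd(np.hstack([X_old, X_new]), full_matrices=False)[0][:, :k]

# Least-squares mappings from each view into the shared latent subspace.
W_old = np.linalg.lstsq(X_old, Z, rcond=None)[0]
W_new = np.linalg.lstsq(X_new, Z, rcond=None)[0]

# After the old features fade away, a sample observed only in the new view
# can still be projected into the latent space shared with the old view.
x_new_only = rng.normal(size=(1, 80))
z_hat = x_new_only @ W_new                         # latent summary usable by the old learner
```

The deep/shallow capacity trade-off and the online depth adaptation described in the abstract are not modeled in this sketch.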
Abstract:We consider distributed estimation of the double-penalized least squares approach for high-dimensional partial linear models, where the sample with a total of $N$ data points is randomly distributed among $m$ machines and the parameters of interest are calculated by merging the $m$ individual estimators. This paper primarily focuses on the high-dimensional linear components of partial linear models, which are often of greater interest. We propose a new debiased averaging estimator of the parametric coefficients built on each individual estimator, and establish new non-asymptotic oracle results in the high-dimensional and distributed setting, provided that $m\leq \sqrt{N/\log p}$ and other mild conditions are satisfied, where $p$ is the dimension of the linear coefficients. We also provide an experimental evaluation of the proposed method, indicating its numerical effectiveness on simulated data. Even under the classical non-distributed setting, we obtain optimal rates for the parametric estimator with a looser restriction on the tuning parameter, which is required for our error analysis.
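As a rough illustration of the debias-then-average idea, the following sketch runs a lasso on each machine's shard, applies a one-step bias correction, and averages the corrected coefficients. The ridge-regularized precision surrogate and the omission of the nonparametric component are simplifying assumptions, not the paper's estimator:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
p, m, n = 100, 5, 400                         # dimension, machines, samples per machine
beta = np.zeros(p)
beta[:5] = 1.0                                # sparse true linear coefficients

def debiased_lasso(X, y, lam=0.1, eps=0.05):
    n_loc = X.shape[0]
    b = Lasso(alpha=lam, fit_intercept=False).fit(X, y).coef_
    Sigma = X.T @ X / n_loc
    M = np.linalg.inv(Sigma + eps * np.eye(p))     # crude precision-matrix surrogate
    return b + M @ X.T @ (y - X @ b) / n_loc       # one-step bias correction

local = []
for _ in range(m):                            # each machine works only on its own shard
    X = rng.normal(size=(n, p))
    y = X @ beta + rng.normal(size=n)
    local.append(debiased_lasso(X, y))

beta_avg = np.mean(local, axis=0)             # merged (averaged) estimator
print(np.abs(beta_avg - beta).max())
```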
Abstract:The total variation-based image denoising model has been generalized and extended in numerous ways, improving its performance in different contexts. We propose a new penalty function motivated by recent progress in the statistical literature on high-dimensional variable selection. Using a particular instantiation of the majorization-minimization algorithm, the optimization problem can be solved efficiently, and the resulting computational procedure is similar to that of the spatially adaptive total variation model. Using a two-pixel image model, we show theoretically that the new penalty function resolves the bias problem inherent in the total variation model. The superior performance of the new penalty is demonstrated through several experiments. Our investigation is limited to "blocky" images, which have small total variation.
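A minimal one-dimensional sketch of the majorization-minimization strategy: each iteration majorizes a concave penalty on neighboring differences by a weighted quadratic, so the update reduces to a spatially weighted (adaptive-TV-like) linear solve. The specific penalty, weights, and parameters below are generic illustrative choices, not the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
x_true = np.repeat([0.0, 2.0, -1.0, 1.0], 50)      # piecewise-constant ("blocky") signal
y = x_true + 0.3 * rng.normal(size=x_true.size)    # noisy observation

n, lam, eps = y.size, 1.0, 1e-3
D = np.diff(np.eye(n), axis=0)                     # first-difference operator
x = y.copy()
for _ in range(50):
    w = 1.0 / (np.abs(D @ x) + eps)                # reweighting induced by the quadratic majorizer
    # Each MM step solves a spatially weighted quadratic problem (adaptive-TV-like).
    A = np.eye(n) + lam * D.T @ (w[:, None] * D)
    x = np.linalg.solve(A, y)
```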
Abstract:We propose a novel model for nonlinear dimension reduction motivated by the probabilistic formulation of principal component analysis. Nonlinearity is achieved by specifying different transformation matrices at different locations of the latent space and smoothing the transformations using a Markov random field type prior. The computation is made feasible by recent advances in sampling from von Mises-Fisher distributions.
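For the sampling step the abstract refers to, a minimal example of drawing unit vectors from a von Mises-Fisher distribution (using scipy.stats.vonmises_fisher, available in recent SciPy releases); the location-dependent transformation matrices and the Markov random field prior are not modeled here:

```python
import numpy as np
from scipy.stats import vonmises_fisher            # requires SciPy >= 1.11

mu = np.array([1.0, 0.0, 0.0])                     # mean direction on the unit sphere
kappa = 20.0                                       # concentration parameter
samples = vonmises_fisher(mu, kappa).rvs(100, random_state=0)
print(samples.mean(axis=0))                        # concentrates around mu for large kappa
```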
Abstract:Super-resolution is an important but difficult problem in image/video processing. If a video sequence or a training set other than the given low-resolution image is available, this extra information can greatly aid the reconstruction of the high-resolution image. The problem is substantially more difficult with only a single low-resolution image at hand. Image reconstruction methods designed primarily for denoising are insufficient for the super-resolution problem in the sense that they tend to oversmooth images with essentially no noise. We propose a new adaptive linear interpolation method based on a variational approach and inspired by locally linear embedding (LLE). Experimental results show that our method avoids the oversmoothing problem and preserves image structures well.
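The LLE-style building block such interpolation draws on can be sketched as follows: reconstruct a query point from its neighbors with weights constrained to sum to one, then reuse those weights for interpolation. The function below is an illustrative assumption, not the paper's variational scheme:

```python
import numpy as np

def lle_weights(x, neighbors, reg=1e-6):
    """Weights that reconstruct x from its neighbors, constrained to sum to one."""
    G = neighbors - x                              # shift neighbors to the query point
    C = G @ G.T                                    # local Gram matrix
    C += reg * np.trace(C) * np.eye(C.shape[0])    # regularize for numerical stability
    w = np.linalg.solve(C, np.ones(C.shape[0]))
    return w / w.sum()

x = np.array([0.4, 0.6])                           # query location (e.g., a missing pixel)
nbrs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
w = lle_weights(x, nbrs)
print(w, w @ nbrs)                                 # weighted reconstruction is close to x
```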