Abstract: This paper establishes a novel role for Gaussian-mixture models (GMMs) as functional approximators of Q-function losses in reinforcement learning (RL). Unlike the existing RL literature, where GMMs play their typical role as estimates of probability density functions, GMMs are used here to approximate Q-function losses. The new Q-function approximators, coined GMM-QFs, are incorporated into Bellman residuals to yield a Riemannian-optimization task that serves as a novel policy-evaluation step in standard policy-iteration schemes. The paper demonstrates how the hyperparameters (means and covariance matrices) of the Gaussian kernels are learned from the data, thus opening the door of RL to the powerful toolbox of Riemannian optimization. Numerical tests show that, with no use of experience data, the proposed design outperforms state-of-the-art methods on benchmark RL tasks, even deep Q-networks which do use experience data.
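The following minimal sketch illustrates the modeling idea only: a Q-value as a weighted sum of Gaussian kernels over a state-action vector, plugged into a one-step Bellman residual. The dimensions, the toy transition, and the discrete action grid are assumptions for illustration; the paper's actual learning of the means and (SPD) covariance matrices via Riemannian optimization is not reproduced here.

```python
import numpy as np

def gmm_q(x, weights, means, covs):
    """GMM-QF-style Q-value at a state-action vector x: a weighted sum of
    Gaussian kernels with learnable means and covariance matrices."""
    q = 0.0
    for w, mu, cov in zip(weights, means, covs):
        d = x - mu
        q += w * np.exp(-0.5 * d @ np.linalg.solve(cov, d))
    return q

# Toy Bellman residual for a single transition (s, a, r, s'),
# with a greedy max over a small discrete action grid.
rng = np.random.default_rng(0)
dim, K, gamma = 3, 4, 0.9            # state-action dim, #kernels, discount
weights = rng.normal(size=K)
means = rng.normal(size=(K, dim))
covs = np.stack([np.eye(dim)] * K)   # SPD matrices: the Riemannian variables

s, a, r, s_next = rng.normal(size=2), 0.5, 1.0, rng.normal(size=2)
actions = np.linspace(-1.0, 1.0, 5)
q_sa = gmm_q(np.append(s, a), weights, means, covs)
q_next = max(gmm_q(np.append(s_next, b), weights, means, covs) for b in actions)
print(f"Bellman residual: {q_sa - (r + gamma * q_next):.4f}")
```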
Abstract: This paper extends the recently developed framework of multilinear kernel regression and imputation via manifold learning (MultiL-KRIM) to impute time-varying edge flows in a graph. MultiL-KRIM uses simplicial-complex arguments and Hodge Laplacians to incorporate the graph topology, and exploits manifold-learning arguments to identify latent geometries within features, which are modeled as a point cloud around a smooth manifold embedded in a reproducing kernel Hilbert space (RKHS). Following the concept of tangent spaces to smooth manifolds, linear approximating patches are used to add a collaborative-filtering flavor to the point-cloud approximations. Together with matrix factorizations, MultiL-KRIM effects dimensionality reduction and enables efficient computations, without any training data or additional information. Numerical tests on real-network time-varying edge flows demonstrate noticeable improvements of MultiL-KRIM over several state-of-the-art schemes.
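As a hint of how the graph topology enters through Hodge Laplacians, the sketch below builds the Hodge 1-Laplacian of a tiny oriented graph from its incidence matrices and imputes one masked edge flow by Hodge smoothness alone. The toy graph, the single filled triangle, and the smoothness-only objective are assumptions for illustration; MultiL-KRIM's kernel and matrix-factorization machinery is omitted.

```python
import numpy as np

# Hodge 1-Laplacian L1 = B1.T @ B1 + B2 @ B2.T of a small oriented graph,
# then imputation of a masked edge flow by minimizing f.T @ L1 @ f
# (small divergence and curl) subject to the observed entries.
nodes = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]       # oriented edges
tris = [(0, 1, 2)]                             # filled triangle (2-simplex)

B1 = np.zeros((len(nodes), len(edges)))        # node-to-edge incidence
for j, (u, v) in enumerate(edges):
    B1[u, j], B1[v, j] = -1.0, 1.0

B2 = np.zeros((len(edges), len(tris)))         # edge-to-triangle incidence
for t, (a, b, c) in enumerate(tris):
    for j, e in enumerate(edges):
        if set(e) <= {a, b, c}:
            B2[j, t] = 1.0 if e in [(a, b), (b, c), (c, a)] else -1.0

L1 = B1.T @ B1 + B2 @ B2.T                     # Hodge 1-Laplacian

f_obs = np.array([1.0, 1.0, -1.0, np.nan])     # edge 3 is masked
obs = ~np.isnan(f_obs)
A = L1[np.ix_(~obs, ~obs)]                     # normal equations on the
b = -L1[np.ix_(~obs, obs)] @ f_obs[obs]        # masked block
f_hat = f_obs.copy()
f_hat[~obs] = np.linalg.solve(A, b)
print("imputed flow:", f_hat)
```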
Abstract: This paper designs novel nonparametric Bellman mappings in reproducing kernel Hilbert spaces (RKHSs) for reinforcement learning (RL). The proposed mappings benefit from the rich approximating properties of RKHSs, adopt no assumptions on the statistics of the data owing to their nonparametric nature, require no knowledge of the transition probabilities of Markov decision processes, and may operate without any training data. Moreover, they allow for sampling on the fly via the design of trajectory samples, re-use past test data via experience replay, effect dimensionality reduction by random Fourier features, and enable computationally lightweight operations that fit into efficient online or time-adaptive learning. The paper also offers a variational framework to design the free parameters of the proposed Bellman mappings, and shows that appropriate choices of those parameters yield several popular Bellman-mapping designs. As an application, the proposed mappings are employed to offer a novel solution to the problem of countering outliers in adaptive filtering. More specifically, with no prior information on the statistics of the outliers and no training data, a policy-iteration algorithm is introduced to select online, per time instance, the ``optimal'' coefficient p in the least-mean-p-power-error method. Numerical tests on synthetic data showcase, in most cases, the superior performance of the proposed solution over several RL and non-RL schemes.
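The dimensionality-reduction device mentioned above, random Fourier features (RFF), admits a compact sketch: an explicit feature map z such that z(x)ᵀz(y) approximates the Gaussian kernel exp(−‖x−y‖²/(2σ²)). The dimensions and bandwidth below are illustrative choices, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, D, sigma = 5, 2000, 1.0                   # input dim, #features, bandwidth
W = rng.normal(scale=1.0 / sigma, size=(D, dim))   # frequencies ~ N(0, I/sigma^2)
b = rng.uniform(0.0, 2.0 * np.pi, size=D)          # random phases

def rff(x):
    """Random Fourier feature map: rff(x) @ rff(y) ~ Gaussian kernel k(x, y)."""
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x, y = rng.normal(size=dim), rng.normal(size=dim)
approx = rff(x) @ rff(y)
exact = np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))
print(f"RFF approximation: {approx:.4f}  vs  exact kernel: {exact:.4f}")
```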
Abstract: This paper introduces a novel nonparametric framework for data imputation, coined multilinear kernel regression and imputation via the manifold assumption (MultiL-KRIM). Motivated by manifold learning, MultiL-KRIM models data features as a point cloud located in or close to a user-unknown smooth manifold embedded in a reproducing kernel Hilbert space. Unlike typical manifold-learning routes, which seek low-dimensional patterns via regularizers based on graph-Laplacian matrices, MultiL-KRIM builds instead on the intuitive concept of tangent spaces to manifolds and incorporates collaboration among point-cloud neighbors (regressors) directly into the data-modeling term of the loss function. Multiple kernel functions are allowed to offer robustness and rich approximation properties, while multiple matrix factors offer low-rank modeling, integrate dimensionality reduction, and streamline computations with no need for training data. Two important application domains showcase the functionality of MultiL-KRIM: time-varying-graph-signal (TVGS) recovery, and reconstruction of highly accelerated dynamic-magnetic-resonance-imaging (dMRI) data. Extensive numerical tests on real and synthetic data demonstrate MultiL-KRIM's remarkable speedups over its predecessors and its gains over prevalent "shallow" data-imputation techniques, with a more intuitive and explainable pipeline than deep-image-prior methods.
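A minimal sketch of the factorized-imputation idea follows: observed entries of a data matrix are fit by a product K @ B @ C, with K a fixed kernel matrix over toy point-cloud features and B, C small factors learned by masked gradient descent. The sizes, step size, and single Gaussian kernel are assumptions; the paper's multiple kernels and tangent-space collaboration are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, r = 20, 30, 3
X = rng.normal(size=(n, r)) @ rng.normal(size=(r, T))   # rank-r ground truth
mask = rng.random((n, T)) < 0.6                          # observed entries

pts = rng.normal(size=(n, 3))                            # toy point-cloud features
sq = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq)                                    # Gaussian kernel matrix

B = 0.1 * rng.normal(size=(n, r))                        # small multilinear factors
C = 0.1 * rng.normal(size=(r, T))
lr = 2e-3
for _ in range(5000):                                    # masked gradient descent
    R = mask * (K @ B @ C - X)                           # residual on observed entries
    B, C = B - lr * K.T @ R @ C.T, C - lr * B.T @ K.T @ R

miss = ~mask
err = np.linalg.norm(miss * (K @ B @ C - X)) / np.linalg.norm(miss * X)
print(f"relative error on missing entries: {err:.3f}")
```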
Abstract: This paper aims at the algorithmic/theoretical core of reinforcement learning (RL) by introducing the novel class of proximal Bellman mappings. These mappings are defined in reproducing kernel Hilbert spaces (RKHSs) to benefit from the rich approximation properties and the inner product of RKHSs; they are shown to belong to the powerful Hilbertian family of (firmly) nonexpansive mappings, regardless of the values of their discount factors; and they possess ample degrees of design freedom, even enough to reproduce attributes of the classical Bellman mappings and to pave the way for novel RL designs. An approximate policy-iteration scheme is built on the proposed class of mappings to solve the problem of selecting online, at every time instance, the "optimal" exponent $p$ in a $p$-norm loss to combat outliers in linear adaptive filtering, without training data and without any knowledge of the statistical properties of the outliers. Numerical tests on synthetic data showcase the superior performance of the proposed framework over several non-RL and kernel-based RL schemes.
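For reference, the standard Hilbert-space notions invoked above read as follows (textbook definitions, not the paper's specific construction): a mapping $T$ on a Hilbert space $\mathcal{H}$ is firmly nonexpansive when

```latex
\| Tx - Ty \|^2 + \| (\mathrm{Id} - T)x - (\mathrm{Id} - T)y \|^2
  \le \| x - y \|^2 , \qquad \forall x, y \in \mathcal{H},
```

and the canonical example is the proximal mapping of a convex function $g$,

```latex
\operatorname{prox}_{g}(x) := \arg\min_{u \in \mathcal{H}}
  \Big( g(u) + \tfrac{1}{2} \| u - x \|^{2} \Big).
```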
Abstract: This paper introduces an efficient multilinear nonparametric (kernel-based) approximation framework for data regression and imputation, and its application to dynamic magnetic-resonance imaging (dMRI). Data features are assumed to reside in or close to a smooth manifold embedded in a reproducing kernel Hilbert space. Landmark points are identified to concisely describe the point cloud of features via linear approximating patches which mimic the concept of tangent spaces to smooth manifolds. The multilinear model effects dimensionality reduction, enables efficient computations, and extracts data patterns and their geometry without any training data or additional information. Numerical tests on dMRI data under severe under-sampling demonstrate remarkable improvements in efficiency and accuracy of the proposed approach over its predecessors, popular data-modeling methods, as well as recent tensor-based and deep-image-prior schemes.
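The landmark-and-patch idea can be sketched compactly: pick landmarks by farthest-point sampling, then approximate each feature by an affine combination of its nearest landmarks, mimicking a tangent patch. The noisy-circle cloud, the sampling rule, and the patch size are illustrative assumptions, not the paper's pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 200)         # noisy circle as a toy manifold
cloud = np.c_[np.cos(theta), np.sin(theta)] + 0.02 * rng.normal(size=(200, 2))

def farthest_point_sampling(X, m):
    """Greedy landmark selection: each new landmark maximizes its distance
    to the landmarks chosen so far."""
    idx = [0]
    d = np.linalg.norm(X - X[0], axis=1)
    for _ in range(m - 1):
        idx.append(int(d.argmax()))
        d = np.minimum(d, np.linalg.norm(X - X[idx[-1]], axis=1))
    return X[idx]

L = farthest_point_sampling(cloud, 12)         # landmark points

def patch_approx(x, L, k=3):
    """Affine (tangent-patch-like) reconstruction of x from its k nearest
    landmarks: least squares with a sum-to-one row appended."""
    near = L[np.argsort(np.linalg.norm(L - x, axis=1))[:k]]
    A = np.vstack([near.T, np.ones(k)])
    w, *_ = np.linalg.lstsq(A, np.append(x, 1.0), rcond=None)
    return w @ near

errs = [np.linalg.norm(patch_approx(x, L) - x) for x in cloud]
print(f"mean patch-approximation error: {np.mean(errs):.4f}")
```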
Abstract: This study addresses the problem of dynamically selecting, at each time instance, the ``optimal'' p-norm to combat outliers in linear adaptive filtering, without any knowledge of the potentially time-varying probability distribution of the outliers. To this end, an online and data-driven framework is designed via kernel-based reinforcement learning (KBRL). Novel Bellman mappings on reproducing kernel Hilbert spaces (RKHSs) are introduced that require no knowledge of the transition probabilities of Markov decision processes and are nonexpansive with respect to the underlying Hilbertian norm. An approximate policy-iteration framework is finally offered via the introduction of a finite-dimensional affine superset of the fixed-point set of the proposed Bellman mappings. The well-known ``curse of dimensionality'' in RKHSs is addressed by building a basis of vectors via an approximate-linear-dependency criterion. Numerical tests on synthetic data demonstrate that the proposed framework always selects the ``optimal'' p-norm for the outlier scenario at hand, while outperforming several non-RL and KBRL schemes.
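The approximate-linear-dependency (ALD) test used to keep the RKHS basis small admits a short sketch: a new sample joins the dictionary only if its feature map cannot be approximated by the current dictionary elements within a tolerance nu. The Gaussian kernel, the data stream, and the threshold below are illustrative.

```python
import numpy as np

def gauss(X, Y, sigma=1.0):
    """Gaussian kernel matrix between rows of X and rows of Y."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

rng = np.random.default_rng(0)
stream = rng.normal(size=(200, 2))
nu = 0.1                                   # ALD threshold
D = stream[:1]                             # dictionary, seeded with first sample
for x in stream[1:]:
    Kdd = gauss(D, D)
    kdx = gauss(D, x[None, :])[:, 0]
    a = np.linalg.solve(Kdd + 1e-10 * np.eye(len(D)), kdx)
    # squared RKHS distance of phi(x) from span of current dictionary features
    delta = gauss(x[None, :], x[None, :])[0, 0] - kdx @ a
    if delta > nu:
        D = np.vstack([D, x])
print(f"kept {len(D)} of {len(stream)} samples in the dictionary")
```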
Abstract: This paper establishes a kernel-based framework for reconstructing data on manifolds, tailored to the dynamic-MRI (dMRI) data-recovery problem. The proposed methodology exploits simple tangent-space geometries of manifolds in reproducing kernel Hilbert spaces and follows classical kernel-approximation arguments to cast the data-recovery task as a bilinear inverse problem. Departing from mainstream approaches, the proposed methodology uses no training data, employs no graph-Laplacian matrix to penalize the optimization task, requires no costly (kernel) pre-imaging step to map feature points back to the input space, and utilizes complex-valued kernel functions to account for k-space data. The framework is validated on synthetically generated dMRI data, where comparisons against state-of-the-art schemes highlight the rich potential of the proposed approach in data-recovery problems.
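To make the complex-valued-kernel point concrete, the sketch below uses the standard complexified Gaussian kernel, k(z, w) = exp(-sum_j (z_j - conj(w_j))^2 / sigma^2), in a plain kernel ridge fit on toy complex data. Whether the paper uses this particular kernel is an assumption, and the bilinear dMRI recovery task itself is not reproduced.

```python
import numpy as np

def complex_gauss(Z, W, sigma=1.0):
    """Complexified Gaussian kernel: a positive-definite kernel on C^d that
    reduces to the real Gaussian RBF kernel on real inputs."""
    diff = Z[:, None, :] - np.conj(W)[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=-1) / sigma ** 2)

rng = np.random.default_rng(0)
Z = rng.normal(size=(50, 2)) + 1j * rng.normal(size=(50, 2))   # toy "k-space" points
y = np.sin(Z[:, 0]) + 1j * Z[:, 1] ** 2                        # complex-valued targets

K = complex_gauss(Z, Z, sigma=3.0)
alpha = np.linalg.solve(K + 1e-6 * np.eye(len(Z)), y)          # kernel ridge fit
y_hat = K @ alpha
print(f"training residual: {np.linalg.norm(y_hat - y):.2e}")
```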
Abstract: This paper introduces a clustering framework for networks with nodes annotated with time-series data. The framework addresses all types of network-clustering problems: State clustering, node clustering within states (a.k.a. topology identification or community detection), and even subnetwork-state-sequence identification/tracking. Via a bottom-up approach, features are first extracted from the raw nodal time-series data by kernel autoregressive-moving-average modeling, to reveal non-linear dependencies and low-rank representations, and are then mapped onto the Grassmann manifold (Grassmannian). All clustering tasks are performed by leveraging the underlying Riemannian geometry of the Grassmannian in a novel way. To validate the proposed framework, brain-network clustering is considered, where extensive numerical tests on synthetic and real functional magnetic-resonance-imaging (fMRI) data demonstrate that the advocated learning framework compares favorably with several state-of-the-art clustering schemes.
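The Grassmannian machinery reduces, at its core, to comparing subspaces through principal angles. The sketch below represents each toy time-series window by its top singular subspace and computes the geodesic distance between subspaces; replacing plain SVD features with the paper's kernel ARMA features is the obvious gap, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def subspace(ts_window, r=2):
    """Orthonormal basis of the top-r left singular subspace of a window."""
    U, _, _ = np.linalg.svd(ts_window, full_matrices=False)
    return U[:, :r]

def grassmann_dist(U, V):
    """Geodesic distance on the Grassmannian: sqrt of the sum of squared
    principal angles between the two subspaces."""
    s = np.clip(np.linalg.svd(U.T @ V, compute_uv=False), -1.0, 1.0)
    return np.sqrt(np.sum(np.arccos(s) ** 2))

# Two windows from the same low-rank process, one from a different process.
basis = rng.normal(size=(20, 2))
w1 = basis @ rng.normal(size=(2, 30))
w2 = basis @ rng.normal(size=(2, 30))
w3 = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 30))
U1, U2, U3 = subspace(w1), subspace(w2), subspace(w3)
print(f"same process:      {grassmann_dist(U1, U2):.3f}")
print(f"different process: {grassmann_dist(U1, U3):.3f}")
```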
Abstract: This paper fortifies the recently introduced hierarchical-optimization recursive least squares (HO-RLS) against outliers that infrequently contaminate linear-regression models. Outliers are modeled as nuisance variables and are estimated together with the linear filter/system variables via a sparsity-inducing (non-)convexly regularized least-squares task. The proposed outlier-robust HO-RLS builds on steepest-descent directions with a constant step size (learning rate), needs no matrix inversion (lemma), accommodates colored nominal noise of known correlation matrix, exhibits a small computational footprint, and offers theoretical guarantees, in a probabilistic sense, for the convergence of the system estimates to the solutions of a hierarchical-optimization problem: Minimize a convex loss, which models a priori knowledge about the unknown system, over the minimizers of the classical ensemble-LS loss. Extensive numerical tests on synthetically generated data in both stationary and non-stationary scenarios showcase notable improvements of the proposed scheme over state-of-the-art techniques.
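The outliers-as-nuisance-variables idea can be sketched with a convex stand-in: jointly estimate the filter w and a sparse outlier vector o from y = Xw + o + noise by alternating a least-squares update for w with l1 soft-thresholding for o. The batch setting, the lambda value, and the convex (rather than non-convex) regularizer are simplifying assumptions; HO-RLS's recursive, hierarchical machinery is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.05 * rng.normal(size=n)
hits = rng.random(n) < 0.05                          # infrequent outliers
y[hits] += 10.0 * rng.normal(size=hits.sum())

lam, o = 1.0, np.zeros(n)
for _ in range(50):                                  # block-coordinate descent
    w, *_ = np.linalg.lstsq(X, y - o, rcond=None)    # filter update (LS)
    r = y - X @ w
    o = np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)  # outlier update (shrinkage)

print(f"error with outlier modeling: {np.linalg.norm(w - w_true):.4f}")
w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"plain LS error:              {np.linalg.norm(w_ls - w_true):.4f}")
```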