Abstract:I introduce Forecastable Component Analysis (ForeCA), a novel dimension reduction technique for temporally dependent signals. Based on a new forecastability measure, ForeCA finds an optimal transformation to separate a multivariate time series into a forecastable and an orthogonal white noise space. I present a converging algorithm with a fast eigenvector solution. Applications to financial and macro-economic time series show that ForeCA can successfully discover informative structure, which can be used for forecasting as well as classification. The R package ForeCA (http://cran.r-project.org/web/packages/ForeCA/index.html) accompanies this work and is publicly available on CRAN.
Abstract:We introduce 'mixed LICORS', an algorithm for learning nonlinear, high-dimensional dynamics from spatio-temporal data, suitable for both prediction and simulation. Mixed LICORS extends the recent LICORS algorithm (Goerg and Shalizi, 2012) from hard clustering of predictive distributions to a non-parametric, EM-like soft clustering. This retains the asymptotic predictive optimality of LICORS, but, as we show in simulations, greatly improves out-of-sample forecasts with limited data. The new method is implemented in the publicly-available R package "LICORS" (http://cran.r-project.org/web/packages/LICORS/).
Abstract:I propose a frequency domain adaptation of the Expectation Maximization (EM) algorithm to group a family of time series in classes of similar dynamic structure. It does this by viewing the magnitude of the discrete Fourier transform (DFT) of each signal (or power spectrum) as a probability density/mass function (pdf/pmf) on the unit circle: signals with similar dynamics have similar pdfs; distinct patterns have distinct pdfs. An advantage of this approach is that it does not rely on any parametric form of the dynamic structure, but can be used for non-parametric, robust and model-free classification. This new method works for non-stationary signals of similar shape as well as stationary signals with similar auto-correlation structure. Applications to neural spike sorting (non-stationary) and pattern-recognition in socio-economic time series (stationary) demonstrate the usefulness and wide applicability of the proposed method.