Abstract:The signature kernel is a kernel between time series of arbitrary length and comes with strong theoretical guarantees from stochastic analysis. It has found applications in machine learning such as covariance functions for Gaussian processes. A strength of the underlying signature features is that they provide a structured global description of a time series. However, this property can quickly become a curse when local information is essential and forgetting is required; so far this has only been addressed with ad-hoc methods such as slicing the time series into subsegments. To overcome this, we propose a principled, data-driven approach by introducing a novel forgetting mechanism for signatures. This allows the model to dynamically adapt its context length to focus on more recent information. To achieve this, we revisit the recently introduced Random Fourier Signature Features, and develop Random Fourier Decayed Signature Features (RFDSF) with Gaussian processes (GPs). This results in a Bayesian time series forecasting algorithm with variational inference, that offers a scalable probabilistic algorithm that processes and transforms a time series into a joint predictive distribution over time steps in one pass using recurrence. For example, processing a sequence of length $10^4$ steps in $\approx 10^{-2}$ seconds and in $< 1\text{GB}$ of GPU memory. We demonstrate that it outperforms other GP-based alternatives and competes with state-of-the-art probabilistic time series forecasting algorithms.