Abstract:Accurate time series prediction is challenging due to the inherent nonlinearity and sensitivity to initial conditions. We propose a novel approach that enhances neural network predictions through differential learning, which involves training models on both the original time series and its differential series. Specifically, we develop a differential long short-term memory (Diff-LSTM) network that uses a shared LSTM cell to simultaneously process both data streams, effectively capturing intrinsic patterns and temporal dynamics. Evaluated on the Mackey-Glass, Lorenz, and R\"ossler chaotic time series, as well as a real-world financial dataset from ACI Worldwide Inc., our results demonstrate that the Diff- LSTM network outperforms prevalent models such as recurrent neural networks, convolutional neural networks, and bidirectional and encoder-decoder LSTM networks in both short-term and long-term predictions. This framework offers a promising solution for enhancing time series prediction, even when comprehensive knowledge of the underlying dynamics of the time series is not fully available.
Abstract:We study a continuous-time approximation of the stochastic gradient descent process for minimizing the expected loss in learning problems. The main results establish general sufficient conditions for the convergence, extending the results of Chatterjee (2022) established for (nonstochastic) gradient descent. We show how the main result can be applied to the case of overparametrized linear neural network training.