Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Esther Xu Fei

Benchmarking optimality of time series classification methods in distinguishing diffusions

Feb 05, 2023

Zehong Zhang, Fei Lu, Esther Xu Fei, Terry Lyons, Yannis Kevrekidis, Tom Woolf

Figure 1 for Benchmarking optimality of time series classification methods in distinguishing diffusions

Figure 2 for Benchmarking optimality of time series classification methods in distinguishing diffusions

Figure 3 for Benchmarking optimality of time series classification methods in distinguishing diffusions

Figure 4 for Benchmarking optimality of time series classification methods in distinguishing diffusions

Abstract:Performance benchmarking is a crucial component of time series classification (TSC) algorithm design, and a fast-growing number of datasets have been established for empirical benchmarking. However, the empirical benchmarks are costly and do not guarantee statistical optimality. This study proposes to benchmark the optimality of TSC algorithms in distinguishing diffusion processes by the likelihood ratio test (LRT). The LRT is optimal in the sense of the Neyman-Pearson lemma: it has the smallest false positive rate among classifiers with a controlled level of false negative rate. The LRT requires the likelihood ratio of the time series to be computable. The diffusion processes from stochastic differential equations provide such time series and are flexible in design for generating linear or nonlinear time series. We demonstrate the benchmarking with three scalable state-of-the-art TSC algorithms: random forest, ResNet, and ROCKET. Test results show that they can achieve LRT optimality for univariate time series and multivariate Gaussian processes. However, these model-agnostic algorithms are suboptimal in classifying nonlinear multivariate time series from high-dimensional stochastic interacting particle systems. Additionally, the LRT benchmark provides tools to analyze the dependence of classification accuracy on the time length, dimension, temporal sampling frequency, and randomness of the time series. Thus, the LRT with diffusion processes can systematically and efficiently benchmark the optimality of TSC algorithms and may guide their future improvements.

* 21 pages, 8 figures

Via

Access Paper or Ask Questions