Abstract:aeon is a unified Python 3 library for all machine learning tasks involving time series. The package contains modules for time series forecasting, classification, extrinsic regression and clustering, as well as a variety of utilities, transformations and distance measures designed for time series data. aeon also has a number of experimental modules for tasks such as anomaly detection, similarity search and segmentation. aeon follows the scikit-learn API as much as possible to help new users and enable easy integration of aeon estimators with useful tools such as model selection and pipelines. It provides a broad library of time series algorithms, including efficient implementations of the very latest advances in research. Using a system of optional dependencies, aeon integrates a wide variety of packages into a single interface while keeping the core framework with minimal dependencies. The package is distributed under the 3-Clause BSD license and is available at https://github.com/ aeon-toolkit/aeon. This version was submitted to the JMLR journal on 02 Nov 2023 for v0.5.0 of aeon. At the time of this preprint aeon has released v0.9.0, and has had substantial changes.
Abstract:Shapelet-based algorithms are widely used for time series classification because of their ease of interpretation, but they are currently outperformed, notably by methods using convolutional kernels, capable of reaching state-of-the-art performance while being highly scalable. We present a new formulation of time series shapelets including the notion of dilation, and a shapelet extraction method based on convolutional kernels, which is able to target the discriminant information identified by convolutional kernels. Experiments performed on 108 datasets show that our method improves on the state-of-the-art for shapelet algorithms, and we show that it can be used to interpret results from convolutional kernels.
Abstract:Time series classification (TSC) gained a lot of attention in the past decade and number of methods for representing and classifying time series have been proposed. Nowadays, methods based on convolutional networks and ensemble techniques represent the state of the art for time series classification. Techniques transforming time series to image or text also provide reliable ways to extract meaningful features or representations of time series. We compare the state-of-the-art representation and classification methods on a specific application, that is predictive maintenance from sequences of event logs. The contributions of this paper are twofold: introducing a new data set for predictive maintenance on automated teller machines (ATMs) log data and comparing the performance of different representation methods for predicting the occurrence of a breakdown. The problem is difficult since unlike the classic case of predictive maintenance via signals from sensors, we have sequences of discrete event logs occurring at any time and the lengths of the sequences, corresponding to life cycles, vary a lot.