Abstract: Over the last decade, time series motif discovery has emerged as a useful primitive for many downstream analytical tasks, including clustering, classification, rule discovery, segmentation, and summarization. In parallel, there has been a growing understanding that Dynamic Time Warping (DTW) is the best time series similarity measure in a host of settings. Surprisingly, however, there has been virtually no work on using DTW to discover motifs. The most obvious explanation is that both motif discovery and DTW can be computationally challenging, and the current best mechanisms for speeding them up are mutually incompatible. In this work, we present the first scalable exact method to discover time series motifs under DTW. Our method automatically selects the best trade-off between time-to-compute and tightness-of-lower-bounds within a novel hierarchy of lower bounds that we introduce. We show that under realistic settings, our algorithm can admissibly prune up to 99.99% of the DTW computations.
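To make the idea of admissible pruning concrete, the following is a minimal sketch, not the method of the paper. It assumes a brute-force search over all non-overlapping subsequence pairs and swaps in a single classic lower bound (LB_Keogh) in place of the hierarchy of lower bounds the abstract refers to. The names `znorm`, `dtw`, `lb_keogh`, and `top_motif`, and the parameters `m` (motif length) and `w` (warping window half-width), are illustrative assumptions.

```python
import math

def znorm(x):
    """Z-normalize a subsequence; a constant subsequence maps to all zeros."""
    mean = sum(x) / len(x)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    return [(v - mean) / std if std > 0 else 0.0 for v in x]

def dtw(a, b, w):
    """DTW distance between equal-length a and b, warping window half-width w."""
    n = len(a)
    prev = [math.inf] * (n + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [math.inf] * (n + 1)
        for j in range(max(1, i - w), min(n, i + w) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            cur[j] = cost + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return math.sqrt(prev[n])

def lb_keogh(a, b, w):
    """LB_Keogh: a cheap lower bound on dtw(a, b, w) built from b's envelope."""
    total = 0.0
    for i, v in enumerate(a):
        window = b[max(0, i - w): i + w + 1]
        upper, lower = max(window), min(window)
        if v > upper:
            total += (v - upper) ** 2
        elif v < lower:
            total += (lower - v) ** 2
    return math.sqrt(total)

def top_motif(series, m, w, use_lower_bound=True):
    """Return (distance, (i, j), pruned_count) for the closest pair under DTW."""
    subs = [znorm(series[s:s + m]) for s in range(len(series) - m + 1)]
    best, pair, pruned = math.inf, None, 0
    for i in range(len(subs)):
        for j in range(i + m, len(subs)):  # skip overlapping (trivial) matches
            # Admissible pruning: if even the lower bound cannot beat the
            # best-so-far distance, the full DTW computation is unnecessary.
            if use_lower_bound and lb_keogh(subs[i], subs[j], w) >= best:
                pruned += 1
                continue
            d = dtw(subs[i], subs[j], w)
            if d < best:
                best, pair = d, (i, j)
    return best, pair, pruned
```

Because the lower bound never exceeds the true DTW distance, calling `top_motif(series, m, w)` should return the same distance and pair as `top_motif(series, m, w, use_lower_bound=False)`, while the third return value reports how many DTW computations were skipped. A scalable method would need far more than this quadratic loop; the sketch only shows why pruning with a lower bound keeps the result exact.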
Abstract: Time series classification is an important task in its own right, and it is often a precursor to further downstream analytics. To date, virtually all works in the literature have used either shape-based classification using a distance measure or feature-based classification after finding suitable features for the domain. It seems to be underappreciated that in many datasets some classes are best discriminated with features, while others are best discriminated with shape. Thus, committing to either shape or features alone will condemn us to poor results, at least for some classes. In this work, we propose a new model for classifying time series that allows the use of both shape-based and feature-based measures, when warranted. Our algorithm automatically decides which approach is best for which class, and at query time chooses which classifier to trust the most. We evaluate our idea on real-world datasets and demonstrate that it produces a statistically significant improvement in classification accuracy.
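One way to picture a per-class choice between shape and features is the toy sketch below. It is an illustration under stated assumptions, not the model proposed in the paper: both classifiers are 1-nearest-neighbor, the "shape" view uses Euclidean distance on the raw series, the "feature" view uses two hand-picked features, and trust is estimated as leave-one-out precision per predicted class. All names (`features`, `nn_label`, `fit`, `predict`) are hypothetical.

```python
import math
from collections import defaultdict

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def features(ts):
    """Two toy features: standard deviation and mean absolute first difference."""
    mean = sum(ts) / len(ts)
    std = math.sqrt(sum((x - mean) ** 2 for x in ts) / len(ts))
    rough = sum(abs(ts[i + 1] - ts[i]) for i in range(len(ts) - 1)) / (len(ts) - 1)
    return (std, rough)

def nn_label(query, items, labels, skip=None):
    """Label of the nearest item, optionally ignoring index `skip`."""
    best, best_label = math.inf, None
    for k, item in enumerate(items):
        if k == skip:
            continue
        d = euclid(query, item)
        if d < best:
            best, best_label = d, labels[k]
    return best_label

def fit(train, labels):
    """Store both views and estimate, per view and class, how often a
    leave-one-out prediction of that class was correct."""
    views = {"shape": list(train), "feature": [features(t) for t in train]}
    trust = {}
    for name, items in views.items():
        hits, total = defaultdict(int), defaultdict(int)
        for k, item in enumerate(items):
            pred = nn_label(item, items, labels, skip=k)
            total[pred] += 1
            hits[pred] += int(pred == labels[k])
        trust[name] = {c: hits[c] / total[c] for c in total}
    return {"views": views, "labels": list(labels), "trust": trust}

def predict(query, model):
    """Ask both classifiers, then return the answer of the more trusted one."""
    candidates = {"shape": query, "feature": features(query)}
    best_score, best_pred = -1.0, None
    for name, q in candidates.items():
        pred = nn_label(q, model["views"][name], model["labels"])
        score = model["trust"][name].get(pred, 0.0)
        if score > best_score:  # ties go to the first view, "shape"
            best_score, best_pred = score, pred
    return best_pred
```

A call such as `predict(query, fit(train, labels))` returns the label from whichever view has the higher estimated precision for the class it predicts. The features, the distance measure, and the trust estimate are placeholders chosen for brevity; any of them could be replaced without changing the overall structure.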