Abstract:The characterisation of time-series data via their most salient features is extremely important in a range of machine learning task, not least of all with regards to classification and clustering. While there exist many feature extraction techniques suitable for non-intermittent time-series data, these approaches are not always appropriate for intermittent time-series data, where intermittency is characterized by constant values for large periods of time punctuated by sharp and transient increases or decreases in value. Motivated by this, we present aggregation, mode decomposition and projection (AMP) a feature extraction technique particularly suited to intermittent time-series data which contain time-frequency patterns. For our method all individual time-series within a set are combined to form a non-intermittent aggregate. This is decomposed into a set of components which represent the intrinsic time-frequency signals within the data set. Individual time-series can then be fit to these components to obtain a set of numerical features that represent their intrinsic time-frequency patterns. To demonstrate the effectiveness of AMP, we evaluate against the real word task of clustering intermittent time-series data. Using synthetically generated data we show that a clustering approach which uses the features derived from AMP significantly outperforms traditional clustering methods. Our technique is further exemplified on a real world data set where AMP can be used to discover groupings of individuals which correspond to real world sub-populations.