Abstract: This paper presents a new filter method for unsupervised feature selection. The method is particularly effective on imbalanced multi-class datasets, such as those containing clusters of different anomaly types. Existing methods usually rely on the variance of the features, which is unsuitable when the different types of observations are not equally represented. Our method, based on Spearman's rank correlation between distances on the observations and on feature values, avoids this drawback. Its performance is measured on several clustering problems and compared with existing filter methods suitable for unsupervised data.
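To make the idea of correlating distances concrete, the following is a minimal sketch and not the paper's exact algorithm. It assumes that a feature is scored by Spearman's rank correlation between the pairwise Euclidean distances computed on all features and the pairwise absolute differences on that feature alone. The function names (`average_ranks`, `spearman`, `feature_scores`, `make_toy_data`) and the toy data are hypothetical and chosen for illustration only.

```python
import math
import random
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def feature_scores(X):
    """Assumed scoring rule: correlate per-feature gaps with full-space distances."""
    pairs = list(combinations(range(len(X)), 2))
    full = [math.dist(X[i], X[j]) for i, j in pairs]
    return [
        spearman(full, [abs(X[i][f] - X[j][f]) for i, j in pairs])
        for f in range(len(X[0]))
    ]


def make_toy_data(seed=0):
    """Imbalanced toy data: 40 majority points and 5 minority points.

    Only feature 0 separates the two groups; features 1 and 2 are noise
    with a larger within-group spread than feature 0.
    """
    rng = random.Random(seed)
    majority = [[rng.gauss(0, 0.5), rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)]
    minority = [[rng.gauss(10, 0.5), rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(5)]
    return majority + minority


X = make_toy_data()
scores = feature_scores(X)
for f, s in enumerate(scores):
    print(f"feature {f}: score = {s:.3f}")
```

Because the score depends only on the ranks of the distances, the small group contributes through its many cross-group pairs instead of through its share of the variance. The cost is quadratic in the number of observations, so a practical implementation would likely subsample pairs.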
Abstract: In this paper we describe an approach to anomaly detection and its explainability in multivariate functional data. The anomaly detection procedure consists of transforming the series into a vector of features and applying an Isolation Forest algorithm. The explanation procedure is based on the computation of SHAP coefficients and on the use of a supervised decision tree. We apply the approach to simulated data to measure its performance and to real industrial data.
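The pipeline described above can be sketched roughly as follows, assuming NumPy and scikit-learn are available. This is an illustration under stated assumptions, not the authors' implementation: the summary features (`mean`, `std`, `min`, `max`), the simulated curves, and every parameter value are hypothetical. The SHAP step is omitted here, and only the surrogate decision tree fitted to the Isolation Forest labels is shown.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# Simulated functional data (assumption): 200 noisy sine curves on 50 grid points.
# The last 10 curves receive a level shift and play the role of anomalies.
t = np.linspace(0, 1, 50)
curves = np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal((200, 50))
curves[-10:] += 2.0


def summarize(series):
    """Map each curve to a small feature vector (hypothetical feature choice)."""
    return np.column_stack([
        series.mean(axis=1),
        series.std(axis=1),
        series.min(axis=1),
        series.max(axis=1),
    ])


feature_names = ["mean", "std", "min", "max"]
features = summarize(curves)

# Step 1: unsupervised anomaly detection on the feature vectors.
forest = IsolationForest(n_estimators=100, contamination=0.05, random_state=0)
labels = forest.fit_predict(features)  # -1 = flagged as anomalous, 1 = normal
anomaly_scores = forest.score_samples(features)  # lower = more anomalous

# Step 2: a shallow supervised tree trained on the detector's own labels,
# giving readable rules that approximate what the forest flagged.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(features, labels)
print(export_text(tree, feature_names=feature_names))
```

A shallow tree keeps the extracted rules short enough to read, at the cost of approximating the detector only coarsely.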