Abstract:Causal machine learning (ML) promises to provide powerful tools for estimating individual treatment effects. Although causal ML methods are now well established, they still face the significant challenge of interpretability, which is crucial for medical applications. In this work, we propose a new algorithm based on the Conditional Permutation Importance (CPI) method for statistically rigorous variable importance assessment in the context of Conditional Average Treatment Effect (CATE) estimation. Our method termed PermuCATE is agnostic to both the meta-learner and the ML model used. Through theoretical analysis and empirical studies, we show that this approach provides a reliable measure of variable importance and exhibits lower variance compared to the standard Leave-One-Covariate-Out (LOCO) method. We illustrate how this property leads to increased statistical power, which is crucial for the application of explainable ML in small sample sizes or high-dimensional settings. We empirically demonstrate the benefits of our approach in various simulation scenarios, including previously proposed benchmarks as well as more complex settings with high-dimensional and correlated variables that require advanced CATE estimators.
Abstract:The use of deep learning for electroencephalography (EEG) classification tasks has been rapidly growing in the last years, yet its application has been limited by the relatively small size of EEG datasets. Data augmentation, which consists in artificially increasing the size of the dataset during training, has been a key ingredient to obtain state-of-the-art performances across applications such as computer vision or speech. While a few augmentation transformations for EEG data have been proposed in the literature, their positive impact on performance across tasks remains elusive. In this work, we propose a unified and exhaustive analysis of the main existing EEG augmentations, which are compared in a common experimental setting. Our results highlight the best data augmentations to consider for sleep stage classification and motor imagery brain computer interfaces, showing predictive power improvements greater than 10% in some cases.