The last decade has witnessed a notable surge in deep learning applications for the analysis of electroencephalography (EEG) data, driven by deep learning's demonstrated superiority over conventional statistical techniques. However, even deep learning models can underperform when trained on poorly processed data. While preprocessing is essential to the analysis of EEG data, little research has examined its precise impact on model performance, leaving uncertainty about whether, and to what extent, EEG data should be preprocessed in a deep learning setting. This study investigates the role of EEG preprocessing in deep learning applications and drafts guidelines for future research. It evaluates the impact of different levels of preprocessing, from raw and minimally filtered data to complex pipelines with automated artifact removal algorithms. Six classification tasks (eye blinking, motor imagery, Parkinson's disease, Alzheimer's disease, sleep deprivation, and first episode psychosis) and four architectures commonly used in the EEG domain were considered for the evaluation. The analysis of 4800 different trainings revealed statistical differences between the preprocessing pipelines at the intra-task level for each of the investigated models, and at the inter-task level for the largest model. Raw data generally leads to underperforming models, always ranking last in average score. In addition, models seem to benefit more from minimal pipelines without artifact handling methods, suggesting that EEG artifacts may contribute to the performance of deep neural networks.