Abstract: The AASM guidelines are the result of decades of effort to standardize the sleep scoring procedure and establish a commonly used methodology. The guidelines cover several aspects, from technical/digital specifications, e.g., recommended EEG derivations, to detailed sleep scoring rules according to age. In the context of sleep scoring automation, deep learning has demonstrated better performance than many other techniques. Usually, clinical expertise and official guidelines are fundamental to support automated sleep scoring algorithms in solving the task. In this paper we show that a deep learning based sleep scoring algorithm may not need to fully exploit clinical knowledge or to strictly follow the AASM guidelines. Specifically, we demonstrate that U-Sleep, a state-of-the-art sleep scoring algorithm, can be strong enough to solve the scoring task even when using clinically non-recommended or non-conventional derivations, and without exploiting information about the chronological age of the subjects. Finally, we strengthen the well-known finding that using data from multiple data centers always results in a better performing model than training on a single cohort. Indeed, we show that this statement remains valid even when the size and heterogeneity of the single data cohort are increased. In all our experiments we used 28528 polysomnography studies from 13 different clinical studies.