We applied neural networks to ~3,000 sleep recordings from over 10 locations to automate sleep stage scoring, producing a probability distribution called a hypnodensity graph. Accuracy was validated in 70 subjects scored by six technicians (gold standard). Our best model performed better than any individual scorer, reaching an accuracy of 0.87 (0.95 when predictions are weighted by scorer agreement). It also scores sleep stages at resolutions down to 5 seconds instead of the conventional 30-second scoring epochs. Accuracy did not vary by sleep disorder except for narcolepsy, suggesting that narcolepsy recordings are harder to score for machines, humans, or both. A narcolepsy biomarker was extracted and validated in 105 type-1 narcoleptics versus 331 controls, producing a specificity of 0.96 and a sensitivity of 0.91. Similar performance was obtained against a high-pretest-probability sample of type-2 narcolepsy and idiopathic hypersomnia patients. Adding HLA-DQB1*06:02 increased specificity to 0.99. Our method streamlines scoring and diagnoses narcolepsy accurately.
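
To make the hypnodensity idea concrete, the following minimal Python sketch shows how a per-epoch probability distribution over sleep stages could be reduced to a conventional hypnogram, and how an agreement-weighted accuracy might be computed. Everything here is an illustrative assumption rather than the authors' implementation: the five-stage label set, the toy probabilities, the `agreement` weights (taken as the fraction of scorers matching the consensus label), and the use of a simple weighted mean as the weighting scheme.

```python
import numpy as np

# Assumed stage labels (standard five-stage scheme); not stated in the abstract.
STAGES = ["W", "N1", "N2", "N3", "REM"]


def hypnogram_from_hypnodensity(hypnodensity):
    """Return the index of the most probable stage for each epoch."""
    return np.argmax(hypnodensity, axis=1)


def accuracy(predicted, consensus, weights=None):
    """Fraction of epochs where the prediction matches the consensus label.

    If `weights` is given, each epoch contributes in proportion to its weight
    (a hypothetical stand-in for weighting by scorer agreement).
    """
    correct = (predicted == consensus).astype(float)
    return float(np.average(correct, weights=weights))


# Toy hypnodensity (made-up values): 4 epochs x 5 stages, each row sums to 1.
hypnodensity = np.array([
    [0.90, 0.05, 0.03, 0.01, 0.01],
    [0.10, 0.40, 0.45, 0.03, 0.02],
    [0.02, 0.08, 0.70, 0.18, 0.02],
    [0.05, 0.30, 0.10, 0.05, 0.50],
])

# Made-up consensus labels and per-epoch scorer agreement (fraction of scorers).
consensus = np.array([0, 1, 2, 4])
agreement = np.array([1.0, 0.5, 1.0, 5 / 6])

predicted = hypnogram_from_hypnodensity(hypnodensity)
plain = accuracy(predicted, consensus)
weighted = accuracy(predicted, consensus, weights=agreement)

print([STAGES[i] for i in predicted])
print(round(plain, 2), round(weighted, 2))
```

In this toy case the only misclassified epoch is the one on which the hypothetical scorers disagreed most, so down-weighting it raises the score. This mirrors, in spirit only, why an agreement-weighted accuracy can exceed the unweighted one.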