The success of enhanced sampling molecular simulations that accelerate along collective variables (CVs) is predicated on the availability of variables coincident with the slow collective motions governing the long-time conformational dynamics of a system. It is challenging to intuit these slow CVs for all but the simplest molecular systems, and their data-driven discovery directly from molecular simulation trajectories has been a central focus of the molecular simulation community to both unveil the important physical mechanisms and to drive enhanced sampling. In this work, we introduce hierarchical dynamics encoder (HDE) as a deep learning architecture that learns nonlinear CV approximants to the leading slow eigenfunctions of the spectral decomposition of the transfer operator that evolves equilibrium-scaled probability distributions through time. Orthogonality of the learned CVs is naturally imposed within network training without added regularization. The CVs are inherently explicit and differentiable functions of the input coordinates making them well-suited to use in enhanced sampling calculations. We demonstrate the utility of HDEs in capturing parsimonious nonlinear representations of complex system dynamics in applications to 1D and 2D toy systems where the true eigenfunctions are exactly calculable and to molecular dynamics simulations of alanine dipeptide and the WW domain protein.