Automated model selection is an important application in science and engineering. In this work, we develop a learning approach for identifying structured dynamical systems from undersampled and noisy spatiotemporal data. The learning is performed by a sparse least-squares fitting over a large set of candidate functions via a nonconvex $\ell_1-\ell_2$ sparse optimization solved by the alternating direction method of multipliers. Using a Bernstein-like inequality with a coherence condition, we show that if the set of candidate functions forms a structured random sampling matrix of a bounded orthogonal system, the recovery is stable and the error is bounded. The learning approach is validated on synthetic data generated by the viscous Burgers' equation and two reaction-diffusion equations. The computational results demonstrate the theoretical guarantees of success and the efficiency with respect to the ambient dimension and the number of candidate functions.