The sparse identification of nonlinear dynamics (SINDy) is a regression framework for the discovery of parsimonious dynamic models and governing equations from time-series data. As with all system identification methods, noisy measurements compromise the accuracy and robustness of the model discovery procedure. In this work, we develop a variant of the SINDy algorithm that integrates automatic differentiation and recent time-stepping constrained motivated by Rudy et al. for simultaneously (i) denoising the data, (ii) learning and parametrizing the noise probability distribution, and (iii) identifying the underlying parsimonious dynamical system responsible for generating the time-series data. Thus within an integrated optimization framework, noise can be separated from signal, resulting in an architecture that is approximately twice as robust to noise as state-of-the-art methods, handling as much as 40% noise on a given time-series signal and explicitly parametrizing the noise probability distribution. We demonstrate this approach on several numerical examples, from Lotka-Volterra models to the spatio-temporal Lorenz 96 model. Further, we show the method can identify a diversity of probability distributions including Gaussian, uniform, Gamma, and Rayleigh.