We present a general system identification procedure capable of estimating a broad spectrum of state-space dynamical models, including linear time-invariant (LTI), linear parameter-varying (LPV), and nonlinear (NL) dynamics, along with rather general classes of noise models. Similar to the LTI case, we show that for this general class of model structures, including the NL case, the model dynamics can be separated into a deterministic process part and a stochastic noise part, which allows the complexity of the combined model to be tuned seamlessly in terms of both nonlinearity and noise modeling. We parameterize the involved nonlinear functional relations by means of artificial neural networks (ANNs), although alternative parametric nonlinear mappings can also be used. To estimate the resulting model structures, we optimize a prediction-error-based criterion using an efficient combination of a constrained quasi-Newton approach and automatic differentiation, achieving training times on the order of seconds, whereas existing state-of-the-art ANN methods may require hours for models of similar complexity. We formally establish consistency guarantees for the proposed approach and demonstrate its superior estimation accuracy and computational efficiency on several benchmark LTI, LPV, and NL system identification problems.