In this work, we propose the Sparse Multi-Family Deep Scattering Network (SMF-DSN), a novel architecture exploiting the interpretability of the Deep Scattering Network (DSN) and improving its expressive power. The DSN extracts salient and interpretable features in signals by cascading wavelet transforms, complex modulus and extract the representation of the data via a translation-invariant operator. First, leveraging the development of highly specialized wavelet filters over the last decades, we propose a multi-family approach to DSN. In particular, we propose to cross multiple wavelet transforms at each layer of the network, thus increasing the feature diversity and removing the need for an expert to select the appropriate filter. Secondly, we develop an optimal thresholding strategy adequate for the DSN that regularizes the network and controls possible instabilities induced by the signals, such as non-stationary noise. Our systematic and principled solution sparsifies the network's latent representation by acting as a local mask distinguishing between activity and noise. The SMF-DSN enhances the DSN by (i) increasing the diversity of the scattering coefficients and (ii) improves its robustness with respect to non-stationary noise.