Although considerable effort has been dedicated to improving the solution to the hyperspectral unmixing problem, non-idealities such as complex radiation scattering and endmember variability negatively impact the performance of most existing algorithms and can be very challenging to address. Recently, deep learning-based frameworks have been explored for hyperspectral umixing due to their flexibility and powerful representation capabilities. However, such techniques either do not address the non-idealities of the unmixing problem, or rely on black-box models which are not interpretable. In this paper, we propose a new interpretable deep learning method for hyperspectral unmixing that accounts for nonlinearity and endmember variability. The proposed method leverages a probabilistic variational deep-learning framework, where disentanglement learning is employed to properly separate the abundances and endmembers. The model is learned end-to-end using stochastic backpropagation, and trained using a self-supervised strategy which leverages benefits from semi-supervised learning techniques. Furthermore, the model is carefully designed to provide a high degree of interpretability. This includes modeling the abundances as a Dirichlet distribution, the endmembers using low-dimensional deep latent variable representations, and using two-stream neural networks composed of additive piecewise-linear/nonlinear components. Experimental results on synthetic and real datasets illustrate the performance of the proposed method compared to state-of-the-art algorithms.