Disentanglement, the identification of salient, statistically independent factors of the data, is of interest in many areas of machine learning and statistics, with relevance to synthetic data generation with controlled properties, robust classification of features, parsimonious encoding, and a greater understanding of the generative process underlying the data. Disentanglement arises in several generative paradigms, including Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and diffusion models. Particular progress has recently been made in understanding disentanglement in VAEs, where the choice of diagonal posterior covariance matrices has been shown to promote mutual orthogonality between columns of the decoder's Jacobian. We continue this thread to show how this linear independence translates to statistical independence, completing the chain of reasoning that explains how the VAE's objective identifies independent components of the data, that is, disentangles it.