Abstract: This paper presents a novel theoretical framework for understanding how diffusion models can learn disentangled representations. Within this framework, we establish identifiability conditions for general disentangled latent variable models, analyze training dynamics, and derive sample complexity bounds for disentangled latent subspace models. To validate our theory, we conduct disentanglement experiments across diverse tasks and modalities, including subspace recovery in latent subspace Gaussian mixture models, image colorization, image denoising, and voice conversion for speech classification. Additionally, our experiments show that training strategies inspired by our theory, such as style guidance regularization, consistently enhance disentanglement performance.
Abstract: Recent years have seen growing interest in exploiting dual- and multi-energy measurements in computed tomography (CT) to characterize material properties as well as object shape. Material characterization is performed by decomposing the scene into constitutive basis functions, such as Compton scatter and photoelectric absorption functions. While physically well motivated, the joint recovery of the spatial distributions of photoelectric and Compton properties is severely complicated by the fact that the data are several orders of magnitude more sensitive to Compton scatter coefficients than to photoelectric absorption, so small errors in the Compton estimates can create large artifacts in the photoelectric estimate. To address this issue, we propose a model-based iterative approach that uses patch-based regularization terms to stabilize the inversion of photoelectric coefficients, and we solve the resulting problem using the computationally attractive Alternating Direction Method of Multipliers (ADMM). Using simulations and experimental data acquired on a commercial scanner, we demonstrate that the proposed processing can lead to more stable material property estimates, which should aid materials characterization in future dual- and multi-energy CT systems.