Abstract: Deep models often struggle with out-of-distribution (OOD) generalization, which limits their real-world applicability beyond controlled laboratory settings. Invariant risk minimization (IRM) addresses this issue by learning invariant features and minimizing risk across different domains, thereby avoiding the pitfalls of pseudo-invariant features and spurious causality that affect empirical risk minimization (ERM). However, according to the support overlap theorem, both ERM and IRM may fail to solve the OOD problem when pseudo-invariant features have insufficient support overlap. To this end, we propose a novel method that enlarges feature support overlap for domain generalization. Specifically, we introduce Bayesian random semantic data augmentation to increase sample diversity and overcome this deficiency of IRM. Experiments on several challenging OOD generalization benchmarks demonstrate that our approach surpasses existing models, delivering superior performance and robustness. The code is available at \url{https://github.com/YaoyaoZhu19/BSDG}.
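The abstract does not specify how the semantic augmentation is implemented; the following is a minimal sketch of one plausible reading, assuming features are perturbed along class-conditional semantic directions (estimated as running per-class feature variances, in the spirit of implicit semantic data augmentation) with the augmentation strength drawn from a prior distribution. All class names, the diagonal-covariance estimate, and the Beta prior are illustrative assumptions, not the authors' implementation.

```python
import torch

class RandomSemanticAugment:
    """Feature-level semantic augmentation with a sampled strength (sketch)."""

    def __init__(self, feat_dim: int, num_classes: int, momentum: float = 0.9):
        # Diagonal covariance per class, updated as a running estimate.
        self.cov = torch.zeros(num_classes, feat_dim)
        self.momentum = momentum

    @torch.no_grad()
    def update(self, feats: torch.Tensor, labels: torch.Tensor) -> None:
        # Running estimate of per-class feature variance.
        self.cov = self.cov.to(feats.device)
        for c in labels.unique():
            v = feats[labels == c].var(dim=0, unbiased=False)
            self.cov[c] = self.momentum * self.cov[c] + (1 - self.momentum) * v

    def sample(self, feats: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Draw the augmentation strength from a prior (the "Bayesian" part
        # here is an assumption) and perturb features along class-specific
        # semantic directions to enlarge feature support.
        lam = torch.distributions.Beta(2.0, 5.0).sample().to(feats.device)
        noise = torch.randn_like(feats) * self.cov.to(feats.device)[labels].sqrt()
        return feats + lam * noise
```

In use, `update` would be called on each mini-batch of backbone features before `sample` produces the augmented features fed to the classifier head.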
Abstract: Cone Beam CT (CBCT) plays a crucial role in Adaptive Radiation Therapy (ART) by enabling accurate radiation delivery when organ anatomy changes. However, CBCT images suffer from scatter noise and artifacts, making it challenging to rely on CBCT alone for precise dose calculation and accurate tissue localization. There is therefore a need to improve CBCT image quality and Hounsfield Unit (HU) accuracy while preserving anatomical structures. To enhance the role and application value of CBCT in ART, we propose an energy-guided diffusion model (EGDiff) and conduct experiments on a chest tumor dataset to generate synthetic CT (sCT) from CBCT. The experimental results demonstrate impressive performance, with a mean absolute error of 26.87$\pm$6.14 HU, a structural similarity index measure of 0.850$\pm$0.03, a peak signal-to-noise ratio of 19.83$\pm$1.39 dB, and a normalized cross-correlation of 0.874$\pm$0.04 for the generated sCT. These results indicate that our method outperforms state-of-the-art unsupervised synthesis methods in accuracy and visual quality, producing superior sCT images.
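To make "energy-guided diffusion" concrete, here is a minimal sketch of one guided reverse DDPM step in the style of classifier guidance, where the posterior mean is shifted against the gradient of an energy that penalizes disagreement with the CBCT input. The noise-predictor signature `eps_model(x, t)`, the simple L2 energy, and the schedule tensors are all illustrative assumptions; EGDiff's actual energy and guidance rule may differ.

```python
import torch

def guided_reverse_step(x_t, t, cbct, eps_model, alpha, alpha_bar, sigma, scale=1.0):
    """One energy-guided reverse diffusion step (sketch).

    alpha, alpha_bar, sigma: 1-D schedule tensors indexed by the step t.
    """
    with torch.enable_grad():
        x_in = x_t.detach().requires_grad_(True)
        eps = eps_model(x_in, t)
        # Predicted clean image under the standard DDPM parameterization.
        x0_hat = (x_in - (1 - alpha_bar[t]).sqrt() * eps) / alpha_bar[t].sqrt()
        # Illustrative L2 energy: penalize disagreement with the CBCT input.
        energy = ((x0_hat - cbct) ** 2).sum()
        grad = torch.autograd.grad(energy, x_in)[0]
    eps = eps.detach()
    # Standard DDPM posterior mean ...
    mean = (x_t - (1 - alpha[t]) / (1 - alpha_bar[t]).sqrt() * eps) / alpha[t].sqrt()
    # ... shifted against the energy gradient (guidance), since low energy
    # corresponds to high conditional likelihood.
    mean = mean - scale * sigma[t] ** 2 * grad
    return mean + sigma[t] * torch.randn_like(x_t) if t > 0 else mean
```

Iterating this step from pure noise down to t = 0, conditioned on a CBCT scan, yields the synthetic CT.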
Abstract: Contrastive learning (CL) has shown great potential in image-to-image translation (I2I). Current CL-based I2I methods usually re-exploit the encoder of the generator to maximize the mutual information between the input and generated images, which exerts no active effect on the decoder. In addition, although negative samples play a crucial role in CL, most existing methods adopt a random sampling strategy, which may be less effective. In this paper, we rethink the CL paradigm for unpaired I2I tasks from these two perspectives and propose a new one-sided image translation framework called EnCo. First, we impose an explicit constraint on the multi-scale pairwise features between the encoder and decoder of the generator to guarantee the semantic consistency of the input and generated images. Second, we propose a discriminative attention-guided negative sampling strategy that replaces random negative sampling and significantly improves the performance of the generative model with almost negligible computational overhead. Compared with existing methods, EnCo is more effective and efficient. Extensive experiments on several popular I2I datasets demonstrate the effectiveness and advantages of our approach, and we achieve state-of-the-art results compared with previous methods.
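As an illustration of attention-guided negative sampling, the sketch below computes a PatchNCE-style patchwise contrastive loss in which patch locations are drawn in proportion to an attention map (e.g., derived from the discriminator) rather than uniformly at random. The function name, tensor shapes, and the softmax-weighted multinomial sampling are assumptions for illustration, not EnCo's exact procedure.

```python
import torch
import torch.nn.functional as F

def attn_guided_patch_nce(feat_src, feat_gen, attn, num_patches=256, tau=0.07):
    """Patchwise contrastive loss with attention-weighted location sampling.

    feat_src, feat_gen: (B, C, H, W) feature maps of input and generated images.
    attn: (B, 1, H, W) attention map scoring how discriminative each location is.
    Assumes num_patches <= H * W (sampling without replacement).
    """
    B, C, H, W = feat_src.shape
    src = feat_src.flatten(2).transpose(1, 2)  # (B, HW, C)
    gen = feat_gen.flatten(2).transpose(1, 2)  # (B, HW, C)
    scores = attn.flatten(1)                   # (B, HW)
    # Sample patch locations weighted by attention instead of uniformly.
    idx = torch.multinomial(scores.softmax(dim=1), num_patches)  # (B, P)
    gather_idx = idx.unsqueeze(-1).expand(-1, -1, C)
    q = F.normalize(gen.gather(1, gather_idx), dim=-1)  # queries from output
    k = F.normalize(src.gather(1, gather_idx), dim=-1)  # keys from input
    # Same-location pairs are positives; the other sampled (highly
    # discriminative) locations serve as negatives.
    logits = torch.bmm(q, k.transpose(1, 2)) / tau      # (B, P, P)
    target = torch.arange(num_patches, device=logits.device).expand(B, -1)
    return F.cross_entropy(logits.flatten(0, 1), target.flatten())
```

Applying this loss across the multi-scale encoder-decoder feature pairs would realize the paper's explicit consistency constraint while replacing random negatives with attention-selected ones.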