Abstract: Compared with existing function-based models in deep generative modeling, recently proposed diffusion models achieve outstanding performance using a stochastic-process-based approach. However, this approach requires a long sampling time because the process is discretized into many timesteps. Schrödinger bridge (SB)-based models attempt to address this problem by training bidirectional stochastic processes between distributions. They nevertheless remain slow to sample from compared with generative models such as generative adversarial networks, and training the bidirectional stochastic processes requires a relatively long training time. This study therefore aims to reduce both the number of timesteps and the training time, and proposes regularization terms for existing SB models that keep the bidirectional stochastic processes consistent and stable with a reduced number of timesteps. The regularization terms are integrated into a single term to enable more efficient training in terms of computation time and memory usage. Applying this regularized stochastic process to various generation tasks yielded the desired translations between different distributions, confirming the feasibility of stochastic-process-based generative modeling with faster sampling. The code is available at https://github.com/KiUngSong/RSB.
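To illustrate why the number of discretization timesteps drives sampling cost, the following is a minimal sketch of Euler-Maruyama sampling from a generic SDE. It is not the method of this paper: the function `euler_maruyama`, the toy drift `toy_drift`, and all parameter values are assumptions chosen for illustration. In a diffusion or SB model the drift would typically be a neural network, so each timestep costs one network evaluation.

```python
import numpy as np

def euler_maruyama(x0, drift, sigma, num_steps, t_end=1.0, seed=0):
    """Simulate dx = drift(x, t) dt + sigma dW with num_steps uniform steps.

    Returns the final state and the number of drift evaluations.
    """
    rng = np.random.default_rng(seed)
    dt = t_end / num_steps
    x = np.array(x0, dtype=np.float64)
    drift_calls = 0
    for i in range(num_steps):
        t = i * dt
        noise = rng.standard_normal(x.shape)
        # One drift evaluation per timestep (hypothetically, one network call).
        x = x + drift(x, t) * dt + sigma * np.sqrt(dt) * noise
        drift_calls += 1
    return x, drift_calls

def toy_drift(x, t):
    # Hypothetical stand-in for a learned drift: pulls samples toward zero.
    return -x

x0 = np.ones(4)
_, calls_coarse = euler_maruyama(x0, toy_drift, sigma=0.5, num_steps=10)
_, calls_fine = euler_maruyama(x0, toy_drift, sigma=0.5, num_steps=1000)
print(calls_coarse, calls_fine)
```

The drift is evaluated once per timestep, so cutting the number of timesteps cuts sampling cost proportionally, provided the coarser process stays stable.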
Abstract: Super-resolution is inherently ill-posed: a single low-resolution (LR) image can correspond to multiple high-resolution (HR) images. Recent flow-based algorithms address this ill-posedness by learning the super-resolution space and predicting diverse HR outputs. Unfortunately, the diversity of the super-resolution outputs is still unsatisfactory, and the outputs of flow-based models often suffer from undesired artifacts that lower output quality. In this paper, we propose FS-NCSR, which uses frequency separation and noise conditioning to produce more diverse and higher-quality super-resolution outputs than existing flow-based approaches. Because the sharpness and fine detail of an image rely on its high-frequency information, FS-NCSR estimates only the high-frequency information of the HR outputs, without redundant low-frequency components. As a result, FS-NCSR significantly improves the diversity score without significant image quality degradation compared with NCSR, the winner of the previous NTIRE 2021 challenge.
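The idea of frequency separation can be sketched as splitting an image into a low-pass component and a high-frequency residual. The example below is a minimal illustration, not the FS-NCSR implementation: the box-blur low-pass filter, the function names `box_blur` and `separate_frequencies`, and the `radius` parameter are all assumptions, and the abstract does not specify which filter the authors use.

```python
import numpy as np

def box_blur(image, radius=2):
    """Low-pass filter a 2-D image with a (2*radius+1)^2 mean window (edge padding)."""
    k = 2 * radius + 1
    padded = np.pad(image, radius, mode="edge")
    out = np.zeros_like(image, dtype=np.float64)
    height, width = image.shape
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + height, dx:dx + width]
    return out / (k * k)

def separate_frequencies(image, radius=2):
    """Split an image into low-frequency and high-frequency parts (assumed scheme)."""
    image = np.asarray(image, dtype=np.float64)
    low = box_blur(image, radius)
    high = image - low  # residual carrying edges and fine detail
    return low, high

rng = np.random.default_rng(0)
hr = rng.random((16, 16))           # stand-in for an HR image
low, high = separate_frequencies(hr)
reconstructed = low + high          # the two parts sum back to the input
```

Under this decomposition a model only needs to predict `high`, while the low-frequency part can be obtained by other means, for example from the LR input, and added back afterwards.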