Abstract: Diffusion probabilistic models (DPMs) are a key component in modern generative models. DPM-solvers have significantly reduced latency and enhanced quality, but finding their exact inverse (i.e., recovering the initial noise from a given image) remains challenging. Here we investigate exact inversion for DPM-solvers and propose algorithms to perform it when samples are generated by first-order as well as higher-order DPM-solvers. For each explicit denoising step in a DPM-solver, we formulate the inversion using implicit methods such as gradient descent or the forward step method, ensuring robustness to large classifier-free guidance, unlike the prior approach based on fixed-point iteration. Experimental results demonstrate that our proposed exact inversion methods significantly reduce the error of both image and noise reconstruction, greatly enhance the ability to distinguish invisible watermarks, and consistently prevent unintended background changes during image editing. Project page: \url{https://smhongok.github.io/inv-dpm.html}.
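As a rough illustration of the implicit-inversion idea (a minimal sketch, not the authors' exact algorithm), assume a differentiable function `denoise_step` that implements one explicit DPM-solver update. Inverting that step then amounts to solving `x_prev = denoise_step(x_t)` for `x_t`, e.g., by gradient descent on the reconstruction residual; the function name, warm start, and hyperparameters below are illustrative assumptions:

```python
import torch

def invert_step(denoise_step, x_prev, n_iters=100, lr=0.1):
    """Invert one explicit denoising update x_prev = denoise_step(x_t)
    by minimizing ||denoise_step(x_t) - x_prev||^2 with gradient descent.
    `denoise_step` stands in for a single differentiable DPM-solver step."""
    # Warm-start the search at x_prev itself (an assumption, not the paper's choice).
    x_t = x_prev.clone().requires_grad_(True)
    opt = torch.optim.SGD([x_t], lr=lr)
    for _ in range(n_iters):
        opt.zero_grad()
        loss = ((denoise_step(x_t) - x_prev) ** 2).mean()
        loss.backward()
        opt.step()
    return x_t.detach()

# Chaining invert_step backwards over every solver step would recover an
# estimate of the initial noise from a generated image.
```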
Abstract: The performance of automatic speech recognition (ASR) models can be greatly improved by proper beam-search decoding with an external language model (LM). There has been increasing interest in Korean speech recognition, but few studies have focused on the decoding procedure. In this paper, we propose a Korean tokenization method for the neural-network-based LM used in Korean ASR. Although the common approach is to use the same tokenization method for the external LM as for the ASR model, we show that it may not be the best choice for Korean. We propose a new tokenization method that inserts a special token, SkipTC, when a Korean syllable has no trailing consonant. With the proposed SkipTC token, the input sequence for the LM becomes very regularly patterned, so the LM can better learn the linguistic characteristics. Our experiments show that the proposed approach achieves a lower word error rate than the same LM without SkipTC. In addition, we are the first to report ASR performance on the recently introduced large-scale 7,600h Korean speech dataset.
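The abstract does not spell out the token inventory, but assuming a jamo-level decomposition in which every Hangul syllable expands to exactly three tokens (initial, medial, and final-or-SkipTC), the insertion rule could look like the sketch below. The token spelling `<SkipTC>` and the function name are hypothetical; the Unicode decomposition arithmetic is standard:

```python
# Unicode Hangul syllable block constants (standard decomposition).
S_BASE, L_COUNT, V_COUNT, T_COUNT = 0xAC00, 19, 21, 28

def tokenize_with_skiptc(text, skip_token="<SkipTC>"):
    """Decompose composed Hangul syllables into (initial, medial, final) jamo,
    emitting `skip_token` when a syllable has no trailing consonant, so every
    syllable maps to exactly three tokens and the sequence stays regular."""
    CHO = [chr(0x1100 + i) for i in range(L_COUNT)]              # initial consonants
    JUNG = [chr(0x1161 + i) for i in range(V_COUNT)]             # medial vowels
    JONG = [""] + [chr(0x11A8 + i) for i in range(T_COUNT - 1)]  # trailing consonants
    tokens = []
    for ch in text:
        code = ord(ch) - S_BASE
        if 0 <= code < L_COUNT * V_COUNT * T_COUNT:  # composed Hangul syllable
            l = code // (V_COUNT * T_COUNT)
            v = (code % (V_COUNT * T_COUNT)) // T_COUNT
            t = code % T_COUNT
            tokens += [CHO[l], JUNG[v], JONG[t] if t else skip_token]
        else:
            tokens.append(ch)  # pass non-Hangul characters through unchanged
    return tokens

# Example: a syllable without a trailing consonant yields <SkipTC> as its
# third token, e.g. tokenize_with_skiptc("하늘")[2] == "<SkipTC>".
```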