Alexander Korotin

Sampling from Energy distributions with Target Concrete Score Identity

Oct 27, 2025

Sergei Kholkin, Francisco Vargas, Alexander Korotin

Abstract:We introduce the Target Concrete Score Identity Sampler (TCSIS), a method for sampling from unnormalized densities on discrete state spaces by learning the reverse dynamics of a Continuous-Time Markov Chain (CTMC). Our approach builds on a forward-in-time CTMC with a uniform noising kernel and relies on the proposed Target Concrete Score Identity, which relates the concrete score, the ratio of marginal probabilities of two states, to a ratio of expectations of Boltzmann factors under the forward uniform diffusion kernel. This formulation enables Monte Carlo estimation of the concrete score without requiring samples from the target distribution or computation of the partition function. We approximate the concrete score with a neural network and propose two algorithms: Self-Normalized TCSIS and Unbiased TCSIS. Finally, we demonstrate the effectiveness of TCSIS on problems from statistical physics.
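The identity described in this abstract can be checked numerically on a toy problem. The sketch below is an illustration under stated assumptions, not the authors' implementation: it uses a single variable with `K` states, a hypothetical quadratic `energy`, and a uniform kernel that keeps the current state with probability `ALPHA` and otherwise resamples it uniformly. Because this kernel is symmetric in its two arguments, the marginal at a state is proportional to the expectation of the Boltzmann factor under the kernel centred at that state, so the partition function cancels in the ratio. All names and constants are made up for the example.

```python
import math
import random

K = 5        # number of states (hypothetical toy size)
ALPHA = 0.6  # probability that the kernel keeps the state; stands in for a fixed time t


def energy(x):
    # Hypothetical toy energy; the target is p_0(x) proportional to exp(-energy(x)).
    return 0.5 * (x - 1) ** 2


def kernel(x_t, x_0):
    # Uniform noising kernel q_t(x_t | x_0); symmetric in its two arguments.
    return ALPHA * (x_t == x_0) + (1.0 - ALPHA) / K


def exact_marginal(x_t):
    # p_t(x_t) = sum over x_0 of q_t(x_t | x_0) p_0(x_0); this route needs Z.
    z = sum(math.exp(-energy(x)) for x in range(K))
    return sum(kernel(x_t, x0) * math.exp(-energy(x0)) / z for x0 in range(K))


def boltzmann_expectation(x_t):
    # E over x_0 ~ q_t(. | x_t) of exp(-energy(x_0)), computed exactly; no Z involved.
    return sum(kernel(x0, x_t) * math.exp(-energy(x0)) for x0 in range(K))


def mc_boltzmann_expectation(x_t, n, rng):
    # Monte Carlo version: draw x_0 from the kernel centred at x_t.
    total = 0.0
    for _ in range(n):
        x0 = x_t if rng.random() < ALPHA else rng.randrange(K)
        total += math.exp(-energy(x0))
    return total / n


def concrete_score_exact(y, x):
    # Ratio of marginals p_t(y) / p_t(x), using the partition function.
    return exact_marginal(y) / exact_marginal(x)


def concrete_score_identity(y, x):
    # Same ratio written as a ratio of expectations of Boltzmann factors.
    return boltzmann_expectation(y) / boltzmann_expectation(x)


def concrete_score_mc(y, x, n=20000, seed=0):
    # Ratio of two Monte Carlo estimates; needs neither target samples nor Z.
    rng = random.Random(seed)
    return mc_boltzmann_expectation(y, n, rng) / mc_boltzmann_expectation(x, n, rng)
```

In this toy setting `concrete_score_exact(3, 1)` and `concrete_score_identity(3, 1)` agree up to floating-point error, and `concrete_score_mc(3, 1)` approaches the same value as `n` grows. The paper's neural-network approximation and its two training algorithms are not reproduced here.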

Time-Correlated Video Bridge Matching

Oct 14, 2025

Viacheslav Vasilev, Arseny Ivanov, Nikita Gushchin, Maria Kovaleva, Alexander Korotin

Abstract:Diffusion models excel in noise-to-data generation tasks, providing a mapping from a Gaussian distribution to a more complex data distribution. However, they struggle to model translations between complex distributions, limiting their effectiveness in data-to-data tasks. While Bridge Matching (BM) models address this by finding the translation between data distributions, their application to time-correlated data sequences remains unexplored. This is a critical limitation for video generation and manipulation tasks, where maintaining temporal coherence is particularly important. To address this gap, we propose Time-Correlated Video Bridge Matching (TCVBM), a framework that extends BM to time-correlated data sequences in the video domain. TCVBM explicitly models inter-sequence dependencies within the diffusion bridge, directly incorporating temporal correlations into the sampling process. We compare our approach to classical methods based on bridge matching and diffusion models for three video-related tasks: frame interpolation, image-to-video generation, and video super-resolution. TCVBM achieves superior performance across multiple quantitative metrics, demonstrating enhanced generation quality and reconstruction fidelity.

Overclocking Electrostatic Generative Models

Sep 26, 2025

Daniil Shlenskii, Alexander Korotin

Abstract:Electrostatic generative models such as PFGM++ have recently emerged as a powerful framework, achieving state-of-the-art performance in image synthesis. PFGM++ operates in an extended data space with auxiliary dimensionality $D$, recovering the diffusion model framework as $D\to\infty$, while yielding superior empirical results for finite $D$. Like diffusion models, PFGM++ relies on expensive ODE simulations to generate samples, making it computationally costly. To address this, we propose Inverse Poisson Flow Matching (IPFM), a novel distillation framework that accelerates electrostatic generative models across all values of $D$. Our IPFM reformulates distillation as an inverse problem: learning a generator whose induced electrostatic field matches that of the teacher. We derive a tractable training objective for this problem and show that, as $D \to \infty$, our IPFM closely recovers Score Identity Distillation (SiD), a recent method for distilling diffusion models. Empirically, our IPFM produces distilled generators that achieve near-teacher or even superior sample quality using only a few function evaluations. Moreover, we observe that distillation converges faster for finite $D$ than in the $D \to \infty$ (diffusion) limit, which is consistent with prior findings that finite-$D$ PFGM++ models exhibit more favorable optimization and sampling properties.

Universal Inverse Distillation for Matching Models with Real-Data Supervision (No GANs)

Sep 26, 2025

Nikita Kornilov, David Li, Tikhon Mavrin, Aleksei Leonov, Nikita Gushchin, Evgeny Burnaev, Iaroslav Koshelev, Alexander Korotin

Abstract:While achieving exceptional generative quality, modern diffusion, flow, and other matching models suffer from slow inference, as they require many steps of iterative generation. Recent distillation methods address this by training efficient one-step generators under the guidance of a pre-trained teacher model. However, these methods are often constrained to a single framework, e.g., only diffusion or only flow models. Furthermore, these methods are naturally data-free, and benefiting from real data requires additional, complex adversarial training with an extra discriminator model. In this paper, we present RealUID, a universal distillation framework for all matching models that seamlessly incorporates real data into the distillation procedure without GANs. Our RealUID approach offers a simple theoretical foundation that covers previous distillation methods for Flow Matching and Diffusion models, and also extends to their modifications, such as Bridge Matching and Stochastic Interpolants.

Risk-Averse Reinforcement Learning with Itakura-Saito Loss

May 22, 2025

Igor Udovichenko, Olivier Croissant, Anita Toleutaeva, Evgeny Burnaev, Alexander Korotin

Abstract:Risk-averse reinforcement learning finds application in various high-stakes fields. Unlike classical reinforcement learning, which aims to maximize expected returns, risk-averse agents choose policies that minimize risk, occasionally sacrificing expected value. These preferences can be framed through utility theory. We focus on the specific case of the exponential utility function, where we can derive the Bellman equations and employ various reinforcement learning algorithms with few modifications. However, these methods suffer from numerical instability due to the need for exponent computation throughout the process. To address this, we introduce a numerically stable and mathematically sound loss function based on the Itakura-Saito divergence for learning state-value and action-value functions. We evaluate our proposed loss function against established alternatives, both theoretically and empirically. In the experimental section, we explore multiple financial scenarios, some with known analytical solutions, and show that our loss function outperforms the alternatives.
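For readers unfamiliar with the divergence named in this abstract, the sketch below evaluates the standard Itakura-Saito divergence, D_IS(p, q) = p/q - log(p/q) - 1 for positive p and q, in two ways: directly, and from logarithms of its arguments. It illustrates only the divergence itself. How the paper turns it into a loss for state-value and action-value functions is not stated in the abstract, and the link drawn here between the log-domain form and numerical stability is an assumption of this example. The function names are hypothetical.

```python
import math


def itakura_saito(p, q):
    # D_IS(p, q) = p/q - log(p/q) - 1, for p, q > 0; it is zero when p == q.
    ratio = p / q
    return ratio - math.log(ratio) - 1.0


def itakura_saito_from_logs(log_p, log_q):
    # Same divergence evaluated from log p and log q: only the difference
    # of the logarithms is exponentiated, never the individual values.
    d = log_p - log_q
    return math.expm1(d) - d  # equals exp(d) - 1 - d
```

For example, `itakura_saito_from_logs(800.0, 799.5)` returns a small finite number, whereas forming `math.exp(800.0)` first raises `OverflowError` in Python.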

One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation

Mar 17, 2025

Daniil Selikhanovych, David Li, Aleksei Leonov, Nikita Gushchin, Sergei Kushneriuk, Alexander Filippov, Evgeny Burnaev, Iaroslav Koshelev, Alexander Korotin

Abstract:Diffusion models for super-resolution (SR) produce high-quality visual results but incur high computational costs. Despite the development of several methods to accelerate diffusion-based SR models, some (e.g., SinSR) fail to produce realistic perceptual details, while others (e.g., OSEDiff) may hallucinate non-existent structures. To overcome these issues, we present RSD, a new distillation method for ResShift, one of the top diffusion-based SR models. Our method trains the student network to produce images such that a new fake ResShift model trained on them coincides with the teacher model. RSD achieves single-step restoration and outperforms the teacher by a large margin. We show that our distillation method can surpass the other distillation-based method for ResShift, SinSR, making it on par with state-of-the-art diffusion-based SR distillation methods. Compared to SR methods based on pre-trained text-to-image models, RSD produces competitive perceptual quality, provides images with better alignment to degraded input images, and requires fewer parameters and less GPU memory. We provide experimental results on various real-world and synthetic datasets, including RealSR, RealSet65, DRealSR, ImageNet, and DIV2K.

Field Matching: an Electrostatic Paradigm to Generate and Transfer Data

Feb 04, 2025

Alexander Kolesov, Manukhov Stepan, Vladimir V. Palyulin, Alexander Korotin

Abstract:We propose Electrostatic Field Matching (EFM), a novel method that is suitable for both generative modeling and distribution transfer tasks. Our approach is inspired by the physics of an electrical capacitor. We place source and target distributions on the capacitor plates and assign them positive and negative charges, respectively. We then learn the electrostatic field of the capacitor using a neural network approximator. To map the distributions to each other, we start at one plate of the capacitor and move the samples along the learned electrostatic field lines until they reach the other plate. We theoretically justify that this approach provably yields the distribution transfer. In practice, we demonstrate the performance of our EFM in toy and image data experiments.
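The transport step described in this abstract, moving a sample along field lines from one plate to the other, can be pictured with a deliberately reduced toy. The sketch below is not the EFM method: it replaces the two charged distributions with a single positive and a single negative point charge in 2D, uses the exact Coulomb field (magnitude proportional to 1/r in two dimensions) instead of a learned neural approximator, and follows the normalized field with fixed-size Euler steps. All names, coordinates, and step sizes are assumptions made for illustration.

```python
import math

POSITIVE = (0.0, 0.0)  # stand-in for a source sample on the first plate
NEGATIVE = (0.0, 1.0)  # stand-in for a target sample on the second plate


def field(x, z):
    # Exact 2D Coulomb field of the +/- pair: it points away from the
    # positive charge and toward the negative one.
    px, pz = x - POSITIVE[0], z - POSITIVE[1]
    nx, nz = x - NEGATIVE[0], z - NEGATIVE[1]
    p2 = px * px + pz * pz
    n2 = nx * nx + nz * nz
    return px / p2 - nx / n2, pz / p2 - nz / n2


def follow_field_line(x, z, step=0.005, tol=0.02, max_steps=20000):
    # Euler integration along the normalized field direction; stops once
    # the point is within `tol` of the negative charge.
    for _ in range(max_steps):
        if math.hypot(x - NEGATIVE[0], z - NEGATIVE[1]) < tol:
            break
        ex, ez = field(x, z)
        norm = math.hypot(ex, ez)
        x, z = x + step * ex / norm, z + step * ez / norm
    return x, z
```

Calling `follow_field_line(0.2, 0.1)` starts near the positive charge and ends next to the negative one, which is the point-charge analogue of starting at one plate and arriving at the other.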

Categorical Schrödinger Bridge Matching

Feb 03, 2025

Grigoriy Ksenofontov, Alexander Korotin

Abstract:The Schrödinger Bridge (SB) is a powerful framework for solving generative modeling tasks such as unpaired domain translation. Most SB-related research focuses on continuous data space $\mathbb{R}^{D}$ and leaves open theoretical and algorithmic questions about applying SB methods to discrete data, e.g., on finite spaces $\mathbb{S}^{D}$. Notable examples of such sets $\mathbb{S}$ are codebooks of vector-quantized (VQ) representations of modern autoencoders, tokens in texts, categories of atoms in molecules, etc. In this paper, we provide a theoretical and algorithmic foundation for solving SB in discrete spaces using the recently introduced Iterative Markovian Fitting (IMF) procedure. Specifically, we theoretically justify the convergence of discrete-time IMF (D-IMF) to SB in discrete spaces. This enables us to develop a practical computational algorithm for SB which we call Categorical Schrödinger Bridge Matching (CSBM). We show the performance of CSBM via a series of experiments with synthetic data and VQ representations of images.

InfoBridge: Mutual Information estimation via Bridge Matching

Feb 03, 2025

Sergei Kholkin, Ivan Butakov, Evgeny Burnaev, Nikita Gushchin, Alexander Korotin

Abstract:Diffusion bridge models have recently become a powerful tool in the field of generative modeling. In this work, we leverage their power to address another important problem in machine learning and information theory: the estimation of the mutual information (MI) between two random variables. We show that by using the theory of diffusion bridges, one can construct an unbiased estimator for data posing difficulties for conventional MI estimators. We showcase the performance of our estimator on a series of standard MI estimation benchmarks.

Inverse Bridge Matching Distillation

Feb 03, 2025

Nikita Gushchin, David Li, Daniil Selikhanovych, Evgeny Burnaev, Dmitry Baranchuk, Alexander Korotin

Abstract:Learning diffusion bridge models is easy; making them fast and practical is an art. Diffusion bridge models (DBMs) are a promising extension of diffusion models for applications in image-to-image translation. However, like many modern diffusion and flow models, DBMs suffer from the problem of slow inference. To address it, we propose a novel distillation technique based on the inverse bridge matching formulation and derive a tractable objective to solve it in practice. Unlike previously developed DBM distillation techniques, the proposed method can distill both conditional and unconditional types of DBMs, distill models into a one-step generator, and use only the corrupted images for training. We evaluate our approach for both conditional and unconditional types of bridge matching on a wide set of setups, including super-resolution, JPEG restoration, sketch-to-image, and other tasks, and show that our distillation technique allows us to accelerate the inference of DBMs by 4x to 100x and even provide better generation quality than the teacher model used, depending on the particular setup.
