Title: Towards Theoretical Understandings of Self-Consuming Generative Models

Feb 19, 2024

Shi Fu, Sen Zhang, Yingjie Wang, Xinmei Tian, Dacheng Tao

Figure 1 for Towards Theoretical Understandings of Self-Consuming Generative Models

Abstract: This paper tackles the emerging challenge of training generative models within a self-consuming loop, wherein successive generations of models are recursively trained on mixtures of real and synthetic data from previous generations. We construct a theoretical framework to rigorously evaluate how this training regimen impacts the data distributions learned by future models. Specifically, we derive bounds on the total variation (TV) distance between the synthetic data distributions produced by future models and the original real data distribution under various mixed training scenarios. Our analysis demonstrates that this distance can be effectively controlled under the condition that mixed training dataset sizes or proportions of real data are large enough. Interestingly, we further unveil a phase transition induced by expanding synthetic data amounts, proving theoretically that while the TV distance exhibits an initial ascent, it declines beyond a threshold point. Finally, we specialize our general results to diffusion models, delivering nuanced insights such as the efficacy of optimal early stopping within the self-consuming loop.
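
As a rough illustration of the setup described in the abstract, the toy sketch below runs a self-consuming loop on a small categorical distribution and records the TV distance to the real distribution at each generation. It is not the paper's method: the "generative model" here is assumed to be a plain empirical-frequency estimator, and the names `tv_distance`, `fit_empirical`, `self_consuming_loop`, and the parameters `n` and `real_fraction` are hypothetical choices made for this example.

```python
import random
from collections import Counter


def tv_distance(p, q):
    """Total variation distance between two distributions on the same finite support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def fit_empirical(samples, k):
    """Toy stand-in for training: empirical frequency of each of k categories."""
    counts = Counter(samples)
    n = len(samples)
    return [counts[i] / n for i in range(k)]


def self_consuming_loop(real, n, real_fraction, generations, seed=0):
    """Retrain a toy model on mixtures of real data and its own samples.

    real          -- probabilities of the real categorical distribution
    n             -- size of the mixed training set in each generation
    real_fraction -- proportion of each training set drawn from real data
    Returns the TV distance to `real` after each generation (index 0 is
    the initial model, trained on real data only).
    """
    rng = random.Random(seed)
    k = len(real)
    support = range(k)
    n_real = round(real_fraction * n)

    # Generation 0: trained purely on real data.
    model = fit_empirical(rng.choices(support, weights=real, k=n), k)
    distances = [tv_distance(model, real)]

    for _ in range(generations):
        # Mixed training set: fresh real samples plus samples from the previous model.
        real_part = rng.choices(support, weights=real, k=n_real)
        synth_part = rng.choices(support, weights=model, k=n - n_real)
        model = fit_empirical(real_part + synth_part, k)
        distances.append(tv_distance(model, real))

    return distances


if __name__ == "__main__":
    real = [0.5, 0.3, 0.2]
    for frac in (0.1, 0.5, 0.9):
        d = self_consuming_loop(real, n=500, real_fraction=frac, generations=20)
        print(f"real_fraction={frac}: final TV distance = {d[-1]:.3f}")
```

Varying `n` and `real_fraction` in this sketch gives an informal feel for the two quantities the abstract names as controlling the TV distance; any numbers it prints describe only this toy estimator and say nothing about the paper's bounds.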
