Title: On Inductive Biases That Enable Generalization of Diffusion Transformers

Oct 28, 2024

Jie An, De Wang, Pengsheng Guo, Jiebo Luo, Alexander Schwing

Abstract: Recent work studying the generalization of diffusion models with UNet-based denoisers reveals inductive biases that can be expressed via geometry-adaptive harmonic bases. However, in practice, more recent denoising networks are often based on transformers, e.g., the diffusion transformer (DiT). This raises the question: do transformer-based denoising networks exhibit inductive biases that can also be expressed via geometry-adaptive harmonic bases? To our surprise, we find that this is not the case. This discrepancy motivates our search for the inductive bias that can lead to good generalization in DiT models. Investigating the pivotal attention modules of a DiT, we find that the locality of attention maps is closely associated with generalization. To verify this finding, we modify the generalization of a DiT by restricting its attention windows. We inject local attention windows into a DiT and observe an improvement in generalization. Furthermore, we empirically find that both the placement and the effective attention size of these local attention windows are crucial factors. Experimental results on the CelebA, ImageNet, and LSUN datasets show that strengthening the inductive bias of a DiT can improve both generalization and generation quality when less training data is available. Source code will be released publicly upon paper publication. Project page: dit-generalization.github.io/.
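
To make the idea of restricting attention windows concrete, here is a minimal sketch of a local attention mask over a 2D grid of image patches. It is not the authors' implementation: the abstract does not specify the windowing scheme, so the square (Chebyshev-distance) window, the function names `local_attention_mask` and `masked_softmax`, and the `radius` parameter are all assumptions made for illustration.

```python
import math


def local_attention_mask(grid_h, grid_w, radius):
    """Boolean mask over patch tokens laid out row-major on a grid.

    mask[i][j] is True when token j lies within `radius` patches of
    token i in both the row and column direction (an assumed square
    window; the paper's exact window shape may differ).
    """
    n = grid_h * grid_w
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        ri, ci = divmod(i, grid_w)
        for j in range(n):
            rj, cj = divmod(j, grid_w)
            mask[i][j] = abs(ri - rj) <= radius and abs(ci - cj) <= radius
    return mask


def masked_softmax(scores, allowed):
    """Softmax over one row of attention scores.

    Positions where `allowed` is False receive exactly zero weight.
    Assumes at least one position is allowed, which always holds for
    rows of local_attention_mask because a token can attend to itself.
    """
    kept = [s for s, a in zip(scores, allowed) if a]
    peak = max(kept)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) if a else 0.0 for s, a in zip(scores, allowed)]
    total = sum(exps)
    return [e / total for e in exps]
```

With a 4x4 patch grid and `radius=1`, an interior token attends to its 3x3 neighbourhood while a corner token attends to only four tokens; setting `radius` to at least `max(grid_h, grid_w) - 1` recovers ordinary global attention. Which layers receive such a mask and how large the window is are the "placement" and "effective attention size" factors the abstract refers to.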

* Project page: https://dit-generalization.github.io; Code repository: https://github.com/DiT-Generalization/DiT-Generalization
