Kanggeon Lee

StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control

Apr 01, 2024

Jaerin Lee, Daniel Sungho Jung, Kanggeon Lee, Kyoung Mu Lee

Abstract: The enormous success of diffusion models in text-to-image synthesis has made them promising candidates for the next generation of end-user applications for image generation and editing. Previous works have focused on improving the usability of diffusion models either by reducing inference time or by increasing user interactivity through new, fine-grained controls such as region-based text prompts. However, we empirically find that integrating these two lines of work is nontrivial, limiting the potential of diffusion models. To solve this incompatibility, we present StreamMultiDiffusion, the first real-time region-based text-to-image generation framework. By stabilizing fast inference techniques and restructuring the model into a newly proposed multi-prompt stream batch architecture, we achieve $10\times$ faster panorama generation than existing solutions, and a generation speed of 1.57 FPS in region-based text-to-image synthesis on a single RTX 2080 Ti GPU. Our solution opens up a new paradigm for interactive image generation named semantic palette, where high-quality images are generated in real time from multiple hand-drawn regions, each encoding a prescribed semantic meaning (e.g., eagle, girl). Our code and demo application are available at https://github.com/ironjr/StreamMultiDiffusion.

* 29 pages, 16 figures. v2: typos corrected, references added. Project page: https://jaerinlee.com/research/StreamMultiDiffusion

MEIL-NeRF: Memory-Efficient Incremental Learning of Neural Radiance Fields

Dec 31, 2022

Jaeyoung Chung, Kanggeon Lee, Sungyong Baik, Kyoung Mu Lee

Abstract: Hinged on the representation power of neural networks, neural radiance fields (NeRF) have recently emerged as one of the promising and widely applicable methods for 3D object and scene representation. However, NeRF faces challenges in practical applications, such as large-scale scenes and edge devices with a limited amount of memory, where data needs to be processed sequentially. Under such incremental learning scenarios, neural networks are known to suffer from catastrophic forgetting: they easily forget previously seen data after training on new data. We observe that previous incremental learning algorithms are limited by either low performance or memory scalability issues. As such, we develop a Memory-Efficient Incremental Learning algorithm for NeRF (MEIL-NeRF). MEIL-NeRF takes inspiration from NeRF itself in that a neural network can serve as a memory that provides pixel RGB values, given rays as queries. Motivated by this, our framework learns which rays to query NeRF with in order to extract previous pixel values. The extracted pixel values are then used to train NeRF in a self-distillation manner to prevent catastrophic forgetting. As a result, MEIL-NeRF demonstrates constant memory consumption and competitive performance.

* 18 pages. For the project page, see https://robot0321.github.io/meil-nerf/index.html
