Hyun Hee Park

Restore Anything with Masks: Leveraging Mask Image Modeling for Blind All-in-One Image Restoration

Sep 28, 2024

Chu-Jie Qin, Rui-Qi Wu, Zikun Liu, Xin Lin, Chun-Le Guo, Hyun Hee Park, Chongyi Li

Abstract: All-in-one image restoration aims to handle multiple degradation types using one model. This paper proposes a simple pipeline for all-in-one blind image restoration to Restore Anything with Masks (RAM). Rather than distinguishing degradation types as other methods do, we focus on the image content, using Mask Image Modeling to extract intrinsic image information. Our pipeline consists of two stages: masked image pre-training and fine-tuning with mask attribute conductance. We design a straightforward masking pre-training approach specifically tailored for all-in-one image restoration. This approach encourages networks to prioritize extracting image content priors from various degradations, resulting in more balanced performance across different restoration tasks and stronger overall results. To bridge the gap in input integrity while preserving learned image priors as much as possible, we selectively fine-tune a small portion of the layers. Specifically, the importance of each layer is ranked by the proposed Mask Attribute Conductance (MAC), and the layers with higher contributions are selected for fine-tuning. Extensive experiments demonstrate that our method achieves state-of-the-art performance. Our code and model will be released at https://github.com/Dragonisss/RAM.
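
The masking step behind masked image pre-training can be illustrated with a small sketch. The abstract does not specify the masking granularity or ratio, so the patch-wise scheme below, the function name random_patch_mask, and the parameters patch_size, mask_ratio, and seed are all assumptions made for illustration; this is not the authors' implementation.

```python
import numpy as np


def random_patch_mask(image, patch_size=8, mask_ratio=0.5, seed=0):
    """Zero out a random subset of non-overlapping patches (illustrative only).

    patch_size, mask_ratio, and seed are hypothetical parameters;
    the abstract does not state the values or scheme used by RAM.
    """
    h, w = image.shape[:2]
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be multiples of patch_size")

    grid_h, grid_w = h // patch_size, w // patch_size
    num_patches = grid_h * grid_w
    num_masked = int(round(mask_ratio * num_patches))

    # Pick which patches to hide, without replacement.
    rng = np.random.default_rng(seed)
    flat = np.zeros(num_patches, dtype=bool)
    flat[rng.choice(num_patches, size=num_masked, replace=False)] = True
    patch_mask = flat.reshape(grid_h, grid_w)

    # Expand the patch-level mask to pixel resolution.
    pixel_mask = patch_mask.repeat(patch_size, axis=0).repeat(patch_size, axis=1)

    masked = image.copy()
    masked[pixel_mask] = 0  # hidden pixels are set to zero
    return masked, pixel_mask


image = np.ones((32, 32, 3), dtype=np.float32)
masked, mask = random_patch_mask(image, patch_size=8, mask_ratio=0.5)
```

In a pre-training loop of this kind, a network would typically receive the masked image and be trained to reconstruct the clean target, which pushes it toward content priors rather than degradation-specific cues.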

* Accepted by ECCV 2024

Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance

Mar 26, 2024

Donghoon Ahn, Hyoungwon Cho, Jaewon Min, Wooseok Jang, Jungwoo Kim, SeonHwa Kim, Hyun Hee Park, Kyong Hwan Jin, Seungryong Kim

Abstract: Recent studies have demonstrated that diffusion models are capable of generating high-quality samples, but their quality heavily depends on sampling guidance techniques, such as classifier guidance (CG) and classifier-free guidance (CFG). These techniques are often not applicable in unconditional generation or in various downstream tasks such as image restoration. In this paper, we propose a novel sampling guidance, called Perturbed-Attention Guidance (PAG), which improves diffusion sample quality across both unconditional and conditional settings without requiring additional training or the integration of external modules. PAG is designed to progressively enhance the structure of samples throughout the denoising process. Exploiting the ability of self-attention to capture structural information, it generates intermediate samples with degraded structure by substituting selected self-attention maps in the diffusion U-Net with an identity matrix, and then guides the denoising process away from these degraded samples. In both ADM and Stable Diffusion, PAG surprisingly improves sample quality in conditional and even unconditional scenarios. Moreover, PAG significantly improves the baseline performance in various downstream tasks where existing guidance techniques such as CG or CFG cannot be fully utilized, including ControlNet with empty prompts and image restoration such as inpainting and deblurring.
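
The two ingredients described in the abstract, replacing a self-attention map with an identity matrix and steering away from the resulting degraded prediction, can be sketched on toy arrays. The function names, the scale parameter, and the linear extrapolation form (borrowed by analogy from classifier-free guidance) are assumptions for illustration; the abstract does not give the exact update rule, and this is not the authors' implementation.

```python
import numpy as np


def self_attention(q, k, v, perturb=False):
    """Single-head self-attention over n tokens of dimension d.

    With perturb=True the attention map is replaced by an identity
    matrix, so every token attends only to itself and the output is v.
    """
    n, d = q.shape
    if perturb:
        attn = np.eye(n)
    else:
        scores = q @ k.T / np.sqrt(d)
        scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
        weights = np.exp(scores)
        attn = weights / weights.sum(axis=-1, keepdims=True)
    return attn @ v


def guide_away(pred, pred_perturbed, scale=3.0):
    """Extrapolate away from the degraded prediction.

    The linear form and the default scale are assumptions made by
    analogy with classifier-free guidance.
    """
    return pred + scale * (pred - pred_perturbed)


rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((4, 8)) for _ in range(3))

normal_out = self_attention(q, k, v)
perturbed_out = self_attention(q, k, v, perturb=True)
guided = guide_away(normal_out, perturbed_out, scale=3.0)
```

In an actual diffusion sampler the two predictions would come from two forward passes of the denoising network at each step, one with the selected attention maps left intact and one with them perturbed; here they are stand-in attention outputs.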

* Project page is available at https://ku-cvlab.github.io/Perturbed-Attention-Guidance
