Haram Choi

RAMiT: Reciprocal Attention Mixing Transformer for Lightweight Image Restoration

May 22, 2023

Haram Choi, Cheolwoong Na, Jihyeon Oh, Seungjae Lee, Jinseop Kim, Subeen Choe, Jeongmin Lee, Taehoon Kim, Jihoon Yang

Abstract: Although many recent works have made advancements in the image restoration (IR) field, they often suffer from an excessive number of parameters. Another issue is that most Transformer-based IR methods focus only on either local or global features, leading to limited receptive fields or deficient parameter issues. To address these problems, we propose a lightweight IR network, Reciprocal Attention Mixing Transformer (RAMiT). It employs our proposed dimensional reciprocal attention mixing Transformer (D-RAMiT) blocks, which compute bi-dimensional (spatial and channel) self-attentions in parallel with different numbers of multi-heads. The bi-dimensional attentions help each other to complement their counterpart's drawbacks and are then mixed. Additionally, we introduce a hierarchical reciprocal attention mixing (H-RAMi) layer that compensates for pixel-level information losses and utilizes semantic information while maintaining an efficient hierarchical structure. Furthermore, we revisit and modify MobileNet V1 and V2 to attach efficient convolutions to our proposed components. The experimental results demonstrate that RAMiT achieves state-of-the-art performance on multiple lightweight IR tasks, including super-resolution, color denoising, grayscale denoising, low-light enhancement, and deraining. Codes will be available soon.

* Technical report. 9 pages for main contents + 14 pages for appendix + 6 pages for references
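
The abstract above mentions spatial and channel self-attentions computed in parallel and then mixed. A minimal sketch of that general idea follows; it is not the authors' D-RAMiT block. It assumes a single head, uses the input itself as query, key, and value (no learned projections), and mixes the two branches by a plain average. All function names here are hypothetical.

```python
import math


def softmax(row):
    # Numerically stable softmax over one list of floats.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    # Plain nested-list matrix product: (p x q) times (q x r) gives (p x r).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def spatial_attention(x):
    # x has N tokens (rows) and C channels (columns).
    # The attention map is N x N: every token attends to every token.
    scale = math.sqrt(len(x[0]))
    scores = matmul(x, transpose(x))
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, x)  # N x C


def channel_attention(x):
    # The attention map is C x C: every channel attends to every channel.
    xt = transpose(x)  # C x N
    scale = math.sqrt(len(x))
    scores = matmul(xt, x)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return transpose(matmul(weights, xt))  # back to N x C


def mix(x):
    # Assumption: average the two branches; the paper's mixing may differ.
    s = spatial_attention(x)
    c = channel_attention(x)
    return [[(a + b) / 2 for a, b in zip(rs, rc)] for rs, rc in zip(s, c)]


features = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]  # 2 tokens, 3 channels
mixed = mix(features)
```

The two branches differ only in which axis the softmax-normalized map is built over, which is why the spatial map grows with the number of tokens while the channel map grows with the number of channels.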

Exploration of Lightweight Single Image Denoising with Transformers and Truly Fair Training

Apr 04, 2023

Haram Choi, Cheolwoong Na, Jinseop Kim, Jihoon Yang

Abstract: As multimedia content often contains noise from intrinsic defects of digital devices, image denoising is an important step for high-level vision recognition tasks. Although several studies have developed the denoising field employing advanced Transformers, these networks are too memory-intensive for real-world applications. Additionally, there is a lack of research on lightweight denoising (LWDN) with Transformers. To handle this, this work provides seven comparative baseline Transformers for LWDN, serving as a foundation for future research. We also demonstrate that the parts of randomly cropped patches significantly affect the denoising performances during training. While previous studies have overlooked this aspect, we aim to train our baseline Transformers in a truly fair manner. Furthermore, we conduct empirical analyses of various components to determine the key considerations for constructing LWDN Transformers. Codes are available at https://github.com/rami0205/LWDN.

* Technical report. Will be further revised. Codes are available at https://github.com/rami0205/LWDN
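
The abstract above notes that which parts of an image are randomly cropped affects denoising performance during training. The abstract does not say how the authors make training fair, so the following is only one possible sketch: crop positions are drawn from a seeded generator so that every compared model sees the same patches. The helper names `crop_coords` and `crop` are hypothetical and are not taken from the LWDN repository.

```python
import random


def crop_coords(height, width, patch, seed, count):
    # Draw `count` top-left corners from a private, seeded generator.
    # The same seed always yields the same sequence of corners.
    rng = random.Random(seed)
    coords = []
    for _ in range(count):
        top = rng.randint(0, height - patch)    # inclusive bounds
        left = rng.randint(0, width - patch)
        coords.append((top, left))
    return coords


def crop(image, top, left, patch):
    # `image` is a list of rows; returns a patch x patch sub-image.
    return [row[left:left + patch] for row in image[top:top + patch]]


# Two models trained with the same seed receive identical patches.
image = [[r * 32 + c for c in range(32)] for r in range(32)]
coords_model_a = crop_coords(32, 32, 8, seed=0, count=5)
coords_model_b = crop_coords(32, 32, 8, seed=0, count=5)
patches = [crop(image, top, left, 8) for top, left in coords_model_a]
```

Using a dedicated `random.Random` instance rather than the module-level functions keeps the crop sequence independent of any other random calls made during training.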

N-Gram in Swin Transformers for Efficient Lightweight Image Super-Resolution

Nov 21, 2022

Haram Choi, Jeongmin Lee, Jihoon Yang

Abstract: While some studies have proven that Swin Transformer (SwinT) with window self-attention (WSA) is suitable for single image super-resolution (SR), SwinT ignores the broad regions for reconstructing high-resolution images due to window and shift size. In addition, many deep learning SR methods suffer from intensive computations. To address these problems, we introduce the N-Gram context to the image domain for the first time in history. We define N-Gram as neighboring local windows in SwinT, which differs from text analysis that views N-Gram as consecutive characters or words. N-Grams interact with each other by sliding-WSA, expanding the regions seen to restore degraded pixels. Using the N-Gram context, we propose NGswin, an efficient SR network with SCDP bottleneck taking all outputs of the hierarchical encoder. Experimental results show that NGswin achieves competitive performance while keeping an efficient structure, compared with previous leading methods. Moreover, we also improve other SwinT-based SR methods with the N-Gram context, thereby building an enhanced model: SwinIR-NG. Our improved SwinIR-NG outperforms the current best lightweight SR approaches and establishes state-of-the-art results. Codes will be available soon.

* 8 pages (main content) + 14 pages (supplementary content)
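
The abstract above defines an N-Gram as neighboring local windows. As a rough illustration of that grouping only (not the NGswin implementation), the sketch below splits an image into non-overlapping windows and lists, for each window, the windows in the n x n block that starts at it. The direction of the block and the clipping at the image border are assumptions, and both function names are hypothetical.

```python
def window_grid(height, width, window):
    # Number of non-overlapping windows along each axis.
    if height % window or width % window:
        raise ValueError("image size must be divisible by the window size")
    return height // window, width // window


def ngram_neighbors(rows, cols, n):
    # Map each window index (r, c) to the window indices in the n x n block
    # whose top-left corner is (r, c). Blocks are clipped at the border
    # (assumption: no padding is applied).
    groups = {}
    for r in range(rows):
        for c in range(cols):
            groups[(r, c)] = [
                (rr, cc)
                for rr in range(r, min(r + n, rows))
                for cc in range(c, min(c + n, cols))
            ]
    return groups


rows, cols = window_grid(32, 32, 8)          # a 4 x 4 grid of windows
bigrams = ngram_neighbors(rows, cols, n=2)   # 2 x 2 neighborhoods
```

Each group covers a larger region than a single window, which is the sense in which neighboring windows can expand the area considered when restoring a pixel.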
