Abstract: In single image super-resolution (SISR), transformer-based models have demonstrated significant advancements. However, their potential and efficiency in applied settings such as real-world image super-resolution have received less attention, and substantial opportunities for improvement remain. Recently, the composite fusion attention transformer (CFAT) outperformed previous state-of-the-art (SOTA) models in classic image super-resolution. This paper extends CFAT to an improved GAN-based model, IG-CFAT, to exploit the performance of transformers in real-world image super-resolution. IG-CFAT incorporates a semantic-aware discriminator to reconstruct image details more accurately, significantly improving perceptual quality. Moreover, our model uses an adaptive degradation model to better simulate real-world degradations. Our methodology adds wavelet losses to the conventional loss functions of GAN-based super-resolution models to reconstruct high-frequency details more efficiently. Empirical results demonstrate that IG-CFAT sets new benchmarks in real-world image super-resolution, outperforming SOTA models both quantitatively and qualitatively.
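The abstract above does not specify how the wavelet losses are defined, so the following is only a minimal sketch of the general idea, assuming a one-level Haar transform and an L1 penalty on each subband. The function names `haar_dwt2` and `wavelet_l1_loss` and the `hf_weight` parameter are hypothetical and are not taken from IG-CFAT.

```python
import numpy as np


def haar_dwt2(img):
    """One-level 2D Haar transform of a 2D array with even height and width.

    Returns four subbands (approximation plus three detail bands),
    each with half the height and half the width of the input.
    """
    img = np.asarray(img, dtype=np.float64)
    if img.ndim != 2 or img.shape[0] % 2 or img.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0    # low-frequency content
    col_diff = (a - b + c - d) / 2.0  # differences across columns
    row_diff = (a + b - c - d) / 2.0  # differences across rows
    diag = (a - b - c + d) / 2.0      # diagonal differences
    return approx, col_diff, row_diff, diag


def wavelet_l1_loss(sr, hr, hf_weight=1.0):
    """Mean absolute error between Haar subbands of two images.

    hf_weight (an assumed knob) scales the three detail bands so that
    high-frequency errors can be emphasised relative to the approximation.
    """
    sr_bands = haar_dwt2(sr)
    hr_bands = haar_dwt2(hr)
    errors = [float(np.mean(np.abs(s - h))) for s, h in zip(sr_bands, hr_bands)]
    return errors[0] + hf_weight * sum(errors[1:])
```

In a GAN-based training loop, a term of this kind would typically be added to the pixel, perceptual, and adversarial losses with its own weight. A real implementation would operate on batched, differentiable tensors rather than NumPy arrays.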
Abstract: In medical image analysis, low-resolution images degrade the performance of medical image interpretation and may cause misdiagnosis. Single image super-resolution (SISR) methods can improve the resolution and quality of medical images. Currently, super-resolution models based on Generative Adversarial Networks (GANs) have shown very good performance. Real-Enhanced Super-Resolution Generative Adversarial Network (Real-ESRGAN) is a practical GAN-based model that is widely used for general image super-resolution. One of the challenges in medical image super-resolution is that, unlike natural images, medical images do not have high spatial resolution. To address this problem, we can use transfer learning and fine-tune a model that has been trained on external datasets (often natural image datasets). In our proposed approach, the pre-trained generator and discriminator networks of the Real-ESRGAN model are fine-tuned using medical image datasets. In this paper, we work on chest X-ray and retinal images, using the STARE dataset of retinal images and the Tuberculosis Chest X-rays (Shenzhen) dataset for fine-tuning. The proposed model produces more accurate and natural textures, and its outputs have better detail and resolution than the original Real-ESRGAN outputs.
Abstract: Single image super-resolution (SISR) methods can enhance the resolution and quality of underwater images, which in turn leads to better performance of autonomous underwater vehicles. In this work, we fine-tune the Real-Enhanced Super-Resolution Generative Adversarial Network (Real-ESRGAN) model to increase the resolution of underwater images. In our proposed approach, the pre-trained generator and discriminator networks of the Real-ESRGAN model are fine-tuned on the USR-248 and UFO-120 underwater image datasets. Our fine-tuned model produces images with better resolution and quality than the original model.