Jianxun Lou

MMP-2K: A Benchmark Multi-Labeled Macro Photography Image Quality Assessment Database

May 25, 2025

Jiashuo Chang, Zhengyi Li, Jianxun Lou, Zhen Qiu, Hanhe Lin

Abstract: Macro photography (MP) is a specialized field of photography that captures objects at an extremely close range, revealing tiny details. An accurate macro photography image quality assessment (MPIQA) metric can benefit macro photograph capturing, which is vital in domains such as scientific research and medical applications, but the lack of MPIQA data limits the development of such metrics. To address this limitation, we conducted a large-scale MPIQA study. Specifically, to ensure diversity in both content and quality, we sampled 2,000 MP images from 15,700 MP images collected from three public image websites. For each MP image, 17 (out of 21 after outlier removal) quality ratings and a detailed quality report of distortion magnitudes, types, and positions were gathered in a lab study. The images, quality ratings, and quality reports form our novel multi-labeled MPIQA database, MMP-2k. Experimental results showed that state-of-the-art generic IQA metrics underperform on MP images. The database and supplementary materials are available at https://github.com/Future-IQA/MMP-2k.

* Accepted to the IEEE International Conference on Image Processing (IEEE ICIP 2025)

TranSalNet: Visual saliency prediction using transformers

Oct 07, 2021

Jianxun Lou, Hanhe Lin, David Marshall, Dietmar Saupe, Hantao Liu

Abstract: Convolutional neural networks (CNNs) have significantly advanced computational modeling for saliency prediction. However, the inherent inductive biases of convolutional architectures limit their capacity to encode long-range context, which potentially makes a saliency model less humanlike. Transformers have shown great potential in encoding long-range information by leveraging the self-attention mechanism. In this paper, we propose a novel saliency model that integrates transformer components into CNNs to capture long-range contextual information. Experimental results show that the new components improve performance, and the proposed model achieves promising results in predicting saliency.
