Title: Grounding-IQA: Multimodal Language Grounding Model for Image Quality Assessment

Nov 26, 2024

Zheng Chen, Xun Zhang, Wenbo Li, Renjing Pei, Fenglong Song, Xiongkuo Min, Xiaohong Liu, Xin Yuan, Yong Guo, Yulun Zhang

Abstract: The development of multimodal large language models (MLLMs) enables the evaluation of image quality through natural language descriptions, which allows for more detailed assessments. However, these MLLM-based IQA methods primarily rely on general contextual descriptions, which can limit fine-grained quality assessment. To address this limitation, we introduce a new image quality assessment (IQA) task paradigm, grounding-IQA. This paradigm integrates multimodal referring and grounding with IQA to realize more fine-grained quality perception. Specifically, grounding-IQA comprises two subtasks: grounding-IQA-description (GIQA-DES) and visual question answering (GIQA-VQA). GIQA-DES involves detailed descriptions with precise locations (e.g., bounding boxes), while GIQA-VQA focuses on quality QA for local regions. To realize grounding-IQA, we construct a corresponding dataset, GIQA-160K, through our proposed automated annotation pipeline. Furthermore, we develop a well-designed benchmark, GIQA-Bench. The benchmark comprehensively evaluates a model's grounding-IQA performance from three perspectives: description quality, VQA accuracy, and grounding precision. Experiments demonstrate that our proposed task paradigm, dataset, and benchmark facilitate more fine-grained IQA applications. Code: https://github.com/zhengchen1999/Grounding-IQA.
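The abstract does not say how grounding precision is scored. As an illustration only, a common way to compare a predicted bounding box with a reference box is intersection over union (IoU). The sketch below assumes boxes given as `(x_min, y_min, x_max, y_max)` tuples; the function name `box_iou` and the box format are hypothetical and are not taken from the paper or its repository.

```python
from typing import Tuple

# Assumed box format (not from the paper): (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Return the intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if one exists.
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero so disjoint boxes give an empty intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter

    # Degenerate boxes with zero total area are scored as no overlap.
    return inter / union if union > 0 else 0.0
```

For example, `box_iou((0, 0, 2, 2), (1, 1, 3, 3))` compares two 2×2 boxes that overlap in a 1×1 square, giving 1/7. Whether GIQA-Bench uses this measure, a thresholded variant, or something else should be checked against the paper and code.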

* Code is available at: https://github.com/zhengchen1999/Grounding-IQA
