Abstract:Image quality assessment (IQA) is an active research area in the field of image processing. Most prior works focus on visual quality of natural images captured by cameras. In this paper, we explore visual quality of scanned documents, focusing on raster image areas. Different from many existing works which aim to estimate a visual quality score, we propose a machine learning based classification method to determine whether the visual quality of a scanned raster image at a given resolution setting is acceptable. We conduct a psychophysical study to determine the acceptability at different image resolutions based on human subject ratings and use them as the ground truth to train our machine learning model. However, this dataset is unbalanced as most images were rated as visually acceptable. To address the data imbalance problem, we introduce several noise models to simulate the degradation of image quality during the scanning process. Our results show that by including augmented data in training, we can significantly improve the performance of the classifier to determine whether the visual quality of raster images in a scanned document is acceptable or not for a given resolution setting.