Image compression has been applied in the fields of image storage and video broadcasting. However, it's formidably tough to distinguish the subtle quality differences between those distorted images generated by different algorithms. In this paper, we propose a new image quality assessment framework to decide which image is better in an image group. To capture the subtle differences, a fine-grained network is adopted to acquire multi-scale features. Subsequently, we design a cross subtract block for separating and gathering the information within positive and negative image pairs. Enabling image comparison in feature space. After that, a progressive feature fusion block is designed, which fuses multi-scale features in a novel progressive way. Hierarchical spatial 2D features can thus be processed gradually. Experimental results show that compared with the current mainstream image quality assessment methods, the proposed network can achieve more accurate image quality assessment and ranks second in the benchmark of CLIC in the image perceptual model track.