In this paper, we address the well-known image quality assessment problem but in contrast from existing approaches that predict image quality independently for every images, we propose to jointly model different images depicting the same content to improve the precision of quality estimation. This proposal is motivated by the idea that multiple distorted images can provide information to disambiguate image features related to content and quality. To this aim, we combine the feature representations from the different images to estimate a pseudo-reference that we use to enhance score prediction. Our experiments show that at test-time, our method successfully combines the features from multiple images depicting the same new content, improving estimation quality.