Abstract: We propose an adversarial attack for facial class-specific Single Image Super-Resolution (SISR) methods. Existing attacks, such as the Fast Gradient Sign Method (FGSM) or the Projected Gradient Descent (PGD) method, are either fast but ineffective, or effective but prohibitively slow on these networks. By closely inspecting the surface that the MSE loss, used to train such networks, traces under varying degradations, we identify that this surface is parameterizable. We leverage this property to propose an adversarial attack that locates the optimum degradation (effective) without needing multiple gradient-ascent steps (fast). Our experiments show that the proposed method achieves a better speed vs. effectiveness trade-off than state-of-the-art adversarial attacks, such as FGSM and PGD, for the task of unpaired facial as well as class-specific SISR.
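For context, the two baseline attacks named in the abstract above can be sketched on a toy problem. This is a minimal illustration, not the proposed attack and not the authors' setup: the "network" is assumed to be a single linear map `W`, the loss is the MSE between `W @ x` and a target `y`, and every name (`mse_loss`, `mse_grad_x`, `fgsm`, `pgd`) and every value (`eps`, `alpha`, `steps`) is hypothetical. FGSM takes one signed-gradient step, while PGD takes several smaller steps and projects back into the L-infinity ball of radius `eps` after each one.

```python
import numpy as np

def mse_loss(W, x, y):
    """MSE between the toy network output W @ x and the target y."""
    return float(np.mean((W @ x - y) ** 2))

def mse_grad_x(W, x, y):
    """Gradient of the MSE loss with respect to the input x."""
    residual = W @ x - y
    return (2.0 / residual.size) * (W.T @ residual)

def fgsm(W, x, y, eps):
    """Single gradient-ascent step of size eps along the gradient sign."""
    return x + eps * np.sign(mse_grad_x(W, x, y))

def pgd(W, x, y, eps, alpha, steps):
    """Repeated signed-gradient steps, each projected onto the eps-ball around x."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(mse_grad_x(W, x_adv, y))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # L-infinity projection
    return x_adv

# Assumed toy data: 4 "LR pixels" mapped to 16 "HR pixels" by a random linear map.
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 4))
x = rng.uniform(size=4)
y = rng.uniform(size=16)
eps = 0.05

x_fgsm = fgsm(W, x, y, eps)
x_pgd = pgd(W, x, y, eps, alpha=0.01, steps=20)

print("clean loss:", mse_loss(W, x, y))
print("FGSM loss: ", mse_loss(W, x_fgsm, y))
print("PGD loss:  ", mse_loss(W, x_pgd, y))
```

The cost difference is visible in the structure: `fgsm` needs one gradient evaluation, whereas `pgd` needs one per step, which is the speed vs. effectiveness trade-off the abstract refers to. The sketch omits clipping to a valid pixel range for brevity.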
Abstract: Real low-resolution (LR) face images contain degradations that are too varied and complex to be captured by known downsampling kernels and signal-independent noise. To successfully super-resolve real faces, a method therefore needs to be robust to a wide range of noise, blur, compression artifacts, etc. Some recent works attempt to model these degradations from a dataset of real images using a Generative Adversarial Network (GAN). They generate synthetically degraded LR images and use them with the corresponding real high-resolution (HR) images to train a super-resolution (SR) network using a combination of a pixel-wise loss and an adversarial loss. In this paper, we propose a two-module super-resolution network in which the feature extractor module extracts robust features from the LR image, and the SR module generates an HR estimate using only these robust features. We train a degradation GAN to convert bicubically downsampled clean images to real degraded images, and interpolate between the obtained degraded LR image and its clean LR counterpart. This interpolated LR image is then used along with its corresponding HR counterpart to train the super-resolution network end to end. Entropy Regularized Wasserstein Divergence is used to force the encoded features learnt from the clean and degraded images to closely resemble those extracted from the interpolated image, ensuring robustness.
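The interpolation step described in the abstract above can be illustrated with a short sketch. The abstract does not say how the interpolation is performed or how its weight is chosen, so this example assumes a pixel-wise convex combination with a weight `alpha` drawn uniformly from [0, 1]. The function name `interpolate_lr` is hypothetical, and the "degraded" image here is clean data plus Gaussian noise, standing in for the output of the degradation GAN.

```python
import numpy as np

def interpolate_lr(clean_lr, degraded_lr, alpha):
    """Pixel-wise convex combination of a clean and a degraded LR image.

    alpha = 0 returns the clean image, alpha = 1 the degraded one.
    (Assumed form; the abstract does not specify the interpolation.)
    """
    if clean_lr.shape != degraded_lr.shape:
        raise ValueError("clean and degraded LR images must have the same shape")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * clean_lr + alpha * degraded_lr

rng = np.random.default_rng(0)

# Stand-in for a bicubically downsampled clean face crop (16x16 RGB in [0, 1]).
clean_lr = rng.uniform(size=(16, 16, 3))

# Stand-in for the degradation GAN output: clean image plus Gaussian noise.
degraded_lr = np.clip(clean_lr + rng.normal(scale=0.1, size=clean_lr.shape), 0.0, 1.0)

alpha = rng.uniform()  # assumed sampling scheme for the interpolation weight
mixed_lr = interpolate_lr(clean_lr, degraded_lr, alpha)
print(mixed_lr.shape, float(mixed_lr.min()), float(mixed_lr.max()))
```

In a training loop, `mixed_lr` would be paired with the HR image from which `clean_lr` was downsampled, so the network sees a continuum of degradation strengths rather than only the two extremes.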