Abstract: This paper develops fast graph Fourier transform (GFT) algorithms with O(n log n) runtime complexity for rank-one updates of the path graph. We first show that several commonly used audio and video coding transforms belong to this class of GFTs, which we denote by DCT+. Next, starting from an arbitrary generalized graph Laplacian and using rank-one perturbation theory, we provide a factorization for the GFT after perturbation. This factorization is our central result and reveals a progressive structure: we first apply the unperturbed Laplacian's GFT and then multiply the result by a Cauchy matrix. By specializing this decomposition to path graphs and exploiting the properties of Cauchy matrices, we show that Fast DCT+ algorithms exist. We also demonstrate that progressivity can speed up computations in applications involving multiple transforms related by rank-one perturbations (e.g., video coding) when combined with pruning strategies. Our results can be extended to other graphs and rank-k perturbations. Runtime analyses show that Fast DCT+ provides computational gains over the naive method for graph sizes larger than 64, with runtime approximately equal to that of 8 DCTs.
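The progressive structure described in this abstract can be illustrated with a small numerical sketch. It relies on two standard facts: the DCT-II diagonalizes the unweighted path-graph Laplacian, and the eigenvectors of a rank-one update follow from the secular equation. Everything specific here is an assumption made for illustration: the update is a self-loop of weight `rho` on the first node, the function names are hypothetical, and the matrix products are dense O(n^2) operations rather than the fast O(n log n) algorithm the paper develops.

```python
import math

def dct2_basis(n):
    """Orthonormal DCT-II basis: column k is the k-th eigenvector of the path-graph Laplacian."""
    U = [[0.0] * n for _ in range(n)]
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        for i in range(n):
            U[i][k] = scale * math.cos(math.pi * k * (i + 0.5) / n)
    return U

def path_laplacian(n):
    """Combinatorial Laplacian of the unweighted path graph on n nodes."""
    L = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        L[i][i] += 1.0
        L[i + 1][i + 1] += 1.0
        L[i][i + 1] -= 1.0
        L[i + 1][i] -= 1.0
    return L

def secular_roots(lam, z, rho, iters=200):
    """Eigenvalues of diag(lam) + rho * z z^T by bisection on the secular equation
    1 + rho * sum_i z_i^2 / (lam_i - mu) = 0.
    Assumes rho > 0, lam strictly increasing, and every z_i nonzero."""
    def f(mu):
        return 1.0 + rho * sum(zi * zi / (li - mu) for li, zi in zip(lam, z))
    upper = lam[-1] + rho * sum(zi * zi for zi in z)
    roots = []
    for j in range(len(lam)):
        lo = lam[j]
        hi = lam[j + 1] if j + 1 < len(lam) else upper
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if mid <= lo or mid >= hi:   # interval collapsed to adjacent floats
                break
            if f(mid) < 0.0:             # f is increasing between consecutive poles
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots

def perturbed_gft(n, rho):
    """Basis of L_path + rho * e_0 e_0^T written as (DCT-II basis) x (Cauchy-like factor)."""
    U = dct2_basis(n)
    lam = [2.0 - 2.0 * math.cos(math.pi * k / n) for k in range(n)]
    z = U[0][:]                          # z = U^T e_0 is the first row of U
    mu = secular_roots(lam, z, rho)
    # Cauchy-like factor: C[i][j] = z_i / (lam_i - mu_j), columns scaled to unit norm
    C = [[z[i] / (lam[i] - mu[j]) for j in range(n)] for i in range(n)]
    for j in range(n):
        norm = math.sqrt(sum(C[i][j] ** 2 for i in range(n)))
        for i in range(n):
            C[i][j] /= norm
    U_new = [[sum(U[i][k] * C[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    return U_new, mu

def max_residual(n, rho):
    """Largest entry of |L' u_j - mu_j u_j| over all columns of the perturbed basis."""
    L = path_laplacian(n)
    L[0][0] += rho                       # rank-one update rho * e_0 e_0^T
    U_new, mu = perturbed_gft(n, rho)
    worst = 0.0
    for j in range(n):
        for i in range(n):
            lhs = sum(L[i][k] * U_new[k][j] for k in range(n))
            worst = max(worst, abs(lhs - mu[j] * U_new[i][j]))
    return worst

if __name__ == "__main__":
    print(max_residual(8, 1.0))
```

Applying the transpose of the returned basis to a signal amounts to taking a DCT first and then multiplying by the transposed Cauchy-like factor, which mirrors the two-stage structure stated in the abstract.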
Abstract: Most codec designs rely on the mean squared error (MSE) as a fidelity metric in rate-distortion optimization, which allows optimal parameters to be chosen in the transform domain but may fail to reflect perceptual quality. Alternative distortion metrics, such as the structural similarity index (SSIM), can be computed only pixel-wise, so they cannot be used directly for transform-domain bit allocation. Recently, the irregularity-aware graph Fourier transform (IAGFT) emerged as a means to include pixel-wise perceptual information in the transform design. This paper extends this idea by also learning a graph (and corresponding transform) for each set of blocks that share similar perceptual characteristics; these sets are observed to differ statistically, which leads to different learned graphs. We demonstrate the effectiveness of our method with both SSIM- and saliency-based criteria. We also propose a framework to derive separable transforms, including separable IAGFTs. An empirical evaluation based on the 5th CLIC dataset shows that our approach achieves improvements in terms of MS-SSIM with respect to existing methods.
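To make the role of the IAGFT concrete, the following sketch builds the transform as it is commonly defined: the basis U solves the generalized eigenproblem L U = Q U Λ with U^T Q U = I, and the forward transform is U^T Q x. With a diagonal Q of per-pixel weights, the energy of the transform coefficients equals the weighted squared error x^T Q x, which is what lets pixel-wise weights enter transform-domain decisions. The path graph, the weights `q`, and the function names are assumptions chosen for illustration; the graph-learning and separable constructions of the paper are not reproduced, and a small Jacobi eigensolver stands in for a library routine.

```python
import math

def jacobi_eigh(A, sweeps=100, tol=1e-24):
    """Eigen-decomposition of a small symmetric matrix by cyclic Jacobi rotations."""
    n = len(A)
    A = [row[:] for row in A]
    V = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                b = A[p][q]
                if b == 0.0:
                    continue
                d = A[q][q] - A[p][p]
                # Rotation angle with |theta| <= pi/4 that zeroes A[p][q]
                theta = math.copysign(math.pi / 4, b) if d == 0.0 else 0.5 * math.atan(2.0 * b / d)
                c, s = math.cos(theta), math.sin(theta)
                for M in (A, V):         # right-multiply by the rotation J
                    for k in range(n):
                        mp, mq = M[k][p], M[k][q]
                        M[k][p], M[k][q] = c * mp - s * mq, s * mp + c * mq
                for k in range(n):       # left-multiply A by J^T
                    ap, aq = A[p][k], A[q][k]
                    A[p][k], A[q][k] = c * ap - s * aq, s * ap + c * aq
    return [A[i][i] for i in range(n)], V

def iagft(L, q):
    """(L, Q)-GFT for diagonal Q = diag(q) with q_i > 0.
    Returns ascending eigenvalues lam and a basis U with L U = Q U diag(lam), U^T Q U = I."""
    n = len(q)
    r = [1.0 / math.sqrt(qi) for qi in q]
    # Symmetric form Q^{-1/2} L Q^{-1/2} shares its eigenvalues with Q^{-1} L
    S = [[r[i] * L[i][j] * r[j] for j in range(n)] for i in range(n)]
    lam, V = jacobi_eigh(S)
    order = sorted(range(n), key=lambda k: lam[k])
    U = [[r[i] * V[i][k] for k in order] for i in range(n)]
    return [lam[k] for k in order], U

def forward(U, q, x):
    """Transform coefficients x_hat = U^T Q x."""
    n = len(x)
    return [sum(U[i][k] * q[i] * x[i] for i in range(n)) for k in range(n)]

if __name__ == "__main__":
    # 4-node path graph; q holds made-up perceptual weights, x a made-up error signal
    L = [[1.0, -1.0, 0.0, 0.0], [-1.0, 2.0, -1.0, 0.0],
         [0.0, -1.0, 2.0, -1.0], [0.0, 0.0, -1.0, 1.0]]
    q = [1.0, 4.0, 0.25, 2.0]
    x = [1.0, -2.0, 0.5, 3.0]
    lam, U = iagft(L, q)
    x_hat = forward(U, q, x)
    print(sum(c * c for c in x_hat), sum(qi * xi * xi for qi, xi in zip(q, x)))
```

The two printed numbers should agree up to rounding, since U^T Q U = I implies that the coefficient energy equals x^T Q x.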
Abstract: The photoresponse non-uniformity (PRNU) is a camera-specific pattern, widely adopted to solve multimedia forensics problems such as device identification or forgery detection. The theoretical analysis of this fingerprint customarily relies on a multiplicative model for the denoising residuals. This setup assumes that the nonlinear mapping from the scene irradiance to the preprocessed luminance, that is, the composition of the Camera Response Function (CRF) with the optical and digital preprocessing pipelines, is a gamma correction. Yet, this assumption seldom holds in practice. In this letter, we improve the multiplicative model by including the influence of this nonlinear mapping on the denoising residuals. We also propose a method to estimate this effect. Results show that the response of typical cameras deviates from a gamma correction. Experimental device identification with our model increases the TPR by $4.93\,\%$ on average for a fixed FPR of $0.01$.
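A first-order expansion suggests why the multiplicative model is tied to a gamma correction. The notation is assumed here for illustration and is not taken from the letter: $Y$ is the noise-free irradiance, $K$ the PRNU factor with $|K| \ll 1$, $f$ the nonlinear mapping, and $I_0 = f(Y)$ the PRNU-free output. Additive noise and the denoising filter are ignored, so this is a sketch rather than the authors' derivation.

```latex
\begin{align}
I &= f\bigl(Y(1+K)\bigr) \approx f(Y) + Y f'(Y)\,K
   = I_0\,\bigl(1 + \eta(Y)\,K\bigr),
   \qquad \eta(Y) = \frac{Y f'(Y)}{f(Y)}, \\
f(Y) &= a\,Y^{\gamma} \;\Longrightarrow\; \eta(Y) = \gamma
   \quad\text{(constant, so } I \approx I_0 + \gamma\, I_0 K\text{)}.
\end{align}
```

Conversely, $\eta(Y)$ is constant only when $f$ is a power law, so for any other mapping the PRNU term is scaled by a factor that varies with scene content.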