Abstract:Blind detection of the forged regions in digital images is an effective authentication means to counter the malicious use of local image editing techniques. Existing encoder-decoder forensic networks overlook the fact that detecting complex and subtle tampered regions typically requires more feedback information. In this paper, we propose a Progressive FeedbACk-enhanced Transformer (ProFact) network to achieve coarse-to-fine image forgery localization. Specifically, the coarse localization map generated by an initial branch network is adaptively fed back to the early transformer encoder layers for enhancing the representation of positive features while suppressing interference factors. The cascaded transformer network, combined with a contextual spatial pyramid module, is designed to refine discriminative forensic features for improving the forgery localization accuracy and reliability. Furthermore, we present an effective strategy to automatically generate large-scale forged image samples close to real-world forensic scenarios, especially in realistic and coherent processing. Leveraging on such samples, a progressive and cost-effective two-stage training protocol is applied to the ProFact network. The extensive experimental results on nine public forensic datasets show that our proposed localizer greatly outperforms the state-of-the-art on the generalization ability and robustness of image forgery localization. Code will be publicly available at https://github.com/multimediaFor/ProFact.
Abstract:At present, attention mechanism has been widely applied to the fields of deep learning models. Structural models that based on attention mechanism can not only record the relationships between features position, but also can measure the importance of different features based on their weights. By establishing dynamically weighted parameters for choosing relevant and irrelevant features, the key information can be strengthened, and the irrelevant information can be weakened. Therefore, the efficiency of deep learning algorithms can be significantly elevated and improved. Although transformers have been performed very well in many fields including reinforcement learning, there are still many problems and applications can be solved and made with transformers within this area. MARL (known as Multi-Agent Reinforcement Learning) can be recognized as a set of independent agents trying to adapt and learn through their way to reach the goal. In order to emphasize the relationship between each MDP decision in a certain time period, we applied the hierarchical coding method and validated the effectiveness of this method. This paper proposed a hierarchical transformers MADDPG based on RNN which we call it Hierarchical RNNs-Based Transformers MADDPG(HRTMADDPG). It consists of a lower level encoder based on RNNs that encodes multiple step sizes in each time sequence, and it also consists of an upper sequence level encoder based on transformer for learning the correlations between multiple sequences so that we can capture the causal relationship between sub-time sequences and make HRTMADDPG more efficient.
Abstract:Resampling detection plays an important role in identifying image tampering, such as image splicing. Currently, the resampling detection is still difficult in recompressed images, which are yielded by applying resampling and post-JPEG compression to primary JPEG images. Although low quality primary compression benefits the detection, it remains rather challenging due to the widespread use of middle/high quality compression in imaging devices. In this paper, we propose a novel deep learning approach to learn resampling features directly from the recompressed images. To this end, a noise extraction layer based on low-order high pass filters is deployed to yield the image noise residual domain, which is more beneficial to extract manipulation trail features. A dual-stream convolutional neural network (CNN) is presented to capture the resampling trails along different directions, where the horizontal and vertical streams are interleaved and concatenated. Lastly, the learned features are fed into Sigmoid/Softmax layer, which is used as a binary/multiple classifier for achieving the blind detection or parameter estimation of resampling operations, respectively. Extensive experimental results demonstrate that our proposed method could detect resampling effectively in recompressed images and outperform the state-of-the-art detectors.
Abstract:In this paper, we propose a general framework to accelerate the universal histogram-based image contrast enhancement (CE) algorithms. Both spatial and gray-level selective down- sampling of digital images are adopted to decrease computational cost, while the visual quality of enhanced images is still preserved and without apparent degradation. Mapping function calibration is novelly proposed to reconstruct the pixel mapping on the gray levels missed by downsampling. As two case studies, accelerations of histogram equalization (HE) and the state-of-the-art global CE algorithm, i.e., spatial mutual information and PageRank (SMIRANK), are presented detailedly. Both quantitative and qualitative assessment results have verified the effectiveness of our proposed CE acceleration framework. In typical tests, computational efficiencies of HE and SMIRANK have been speeded up by about 3.9 and 13.5 times, respectively.
Abstract:As an efficient image contrast enhancement (CE) tool, adaptive gamma correction (AGC) was previously proposed by relating gamma parameter with cumulative distribution function (CDF) of the pixel gray levels within an image. ACG deals well with most dimmed images, but fails for globally bright images and the dimmed images with local bright regions. Such two categories of brightness-distorted images are universal in real scenarios, such as improper exposure and white object regions. In order to attenuate such deficiencies, here we propose an improved AGC algorithm. The novel strategy of negative images is used to realize CE of the bright images, and the gamma correction modulated by truncated CDF is employed to enhance the dimmed ones. As such, local over-enhancement and structure distortion can be alleviated. Both qualitative and quantitative experimental results show that our proposed method yields consistently good CE results.