Shixiong Feng

Audio Tampering Detection Based on Shallow and Deep Feature Representation Learning

Oct 19, 2022

Zhifeng Wang, Yao Yang, Chunyan Zeng, Shuai Kong, Shixiong Feng, Nan Zhao

Abstract: Digital audio tampering detection can be used to verify the authenticity of digital audio. However, most current methods either rely on standard electric network frequency (ENF) databases to visually compare the ENF continuity of digital audio, or extract features for classification by machine learning methods. ENF databases are usually difficult to obtain, visual methods have weak feature representation, and machine learning methods lose more information in their features, all of which results in low detection accuracy. This paper proposes a method that fuses shallow and deep features to make full use of ENF information, exploiting the complementary nature of features at different levels to describe more accurately the inconsistencies that tampering operations introduce into raw digital audio. The method achieves 97.03% accuracy on three classic databases: Carioca 1, Carioca 2, and New Spanish. In addition, it achieves an accuracy of 88.31% on the newly constructed database GAUDI-DI. Experimental results show that the proposed method is superior to the state-of-the-art method.

* Audio tampering detection, 21 pages, 4 figures
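
The ENF continuity idea mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's shallow and deep feature fusion method; it only shows, under stated assumptions, how a splice can break the phase continuity of a mains hum. The function names `frame_phases` and `phase_jumps`, the 50 Hz nominal frequency, the 1000 Hz sample rate, and the 0.5 rad threshold are all hypothetical choices made for this illustration.

```python
import cmath
import math


def frame_phases(signal, sample_rate, enf_hz, frame_len):
    """Estimate the phase (radians) of the enf_hz component in each
    non-overlapping frame using a single-frequency DFT."""
    phases = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        acc = 0j
        for n in range(frame_len):
            # Absolute time keeps phases comparable across frames.
            t = (start + n) / sample_rate
            acc += signal[start + n] * cmath.exp(-2j * math.pi * enf_hz * t)
        phases.append(cmath.phase(acc))
    return phases


def phase_jumps(phases, threshold):
    """Return indices of frames whose phase differs from the previous
    frame by more than threshold, after wrapping to [-pi, pi)."""
    jumps = []
    for i in range(1, len(phases)):
        diff = phases[i] - phases[i - 1]
        diff = (diff + math.pi) % (2 * math.pi) - math.pi
        if abs(diff) > threshold:
            jumps.append(i)
    return jumps


# Toy "recording" (assumption): a pure 50 Hz hum sampled at 1000 Hz,
# with a simulated splice at sample 1000 that shifts the phase by pi/2.
SAMPLE_RATE = 1000
ENF_HZ = 50.0
SPLICE_AT = 1000
signal = [
    math.sin(2 * math.pi * ENF_HZ * n / SAMPLE_RATE
             + (math.pi / 2 if n >= SPLICE_AT else 0.0))
    for n in range(2000)
]

phases = frame_phases(signal, SAMPLE_RATE, ENF_HZ, frame_len=200)
jumps = phase_jumps(phases, threshold=0.5)
print(jumps)  # frames where the hum's phase is discontinuous
```

In this toy setting the only flagged frame is the one that starts at the splice. Real recordings have a drifting ENF, noise, and edits that do not align with frame boundaries, which is part of why hand-crafted continuity checks alone are considered weak and learned features are of interest.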

Spatio-Temporal Representation Learning Enhanced Source Cell-phone Recognition from Speech Recordings

Aug 25, 2022

Chunyan Zeng, Shixiong Feng, Zhifeng Wang, Xiangkui Wan, Yunfan Chen, Nan Zhao

Abstract: Existing source cell-phone recognition methods lack long-term feature characterization of the source device, so the features related to the source cell-phone are represented inaccurately and recognition accuracy is insufficient. In this paper, we propose a source cell-phone recognition method based on spatio-temporal representation learning, which includes two main parts: extraction of sequential Gaussian mean matrix features and construction of a recognition model based on spatio-temporal representation learning. In the feature extraction part, based on an analysis of the time-series representation of recording source signals, we extract a sequential Gaussian mean matrix with both long-term and short-term representation ability by using the sensitivity of the Gaussian mixture model to data distribution. In the model construction part, we design a structured spatio-temporal representation learning network, C3D-BiLSTM, to fully characterize the spatio-temporal information. It combines a 3D convolutional network and a bidirectional long short-term memory network to learn representations of short-term spectral information and long-term fluctuation information, and achieves accurate recognition of cell-phones by fusing the spatio-temporal feature information of recording source signals. The method achieves an average accuracy of 99.03% for closed-set recognition of 45 cell-phones on the CCNU_Mobile dataset, and 98.18% in small-sample-size experiments, outperforming existing state-of-the-art methods. The experimental results show that the method exhibits excellent recognition performance in multi-class cell-phone recognition.

* 29 pages, 4 figures
