Title: Spatio-Temporal Representation Learning Enhanced Source Cell-phone Recognition from Speech Recordings

Aug 25, 2022

Chunyan Zeng, Shixiong Feng, Zhifeng Wang, Xiangkui Wan, Yunfan Chen, Nan Zhao

Abstract: Existing source cell-phone recognition methods do not characterize the long-term features of the source device, so device-related features are represented inaccurately and recognition accuracy suffers. In this paper, we propose a source cell-phone recognition method based on spatio-temporal representation learning. It has two main parts: extraction of sequential Gaussian mean matrix features, and construction of a recognition model based on spatio-temporal representation learning. For feature extraction, based on an analysis of the time-series representation of recording source signals, we exploit the sensitivity of the Gaussian mixture model to data distribution to extract a sequential Gaussian mean matrix that captures both long-term and short-term characteristics. For model construction, we design a structured spatio-temporal representation learning network, C3D-BiLSTM, which combines a 3D convolutional network and a bidirectional long short-term memory network to learn representations of short-term spectral information and long-term fluctuation information, and recognizes cell-phones by fusing the spatio-temporal features of the recording source signals. The method achieves an average accuracy of 99.03% for closed-set recognition of 45 cell-phones on the CCNU_Mobile dataset, and 98.18% in small-sample-size experiments, outperforming existing state-of-the-art methods. The experimental results show that the method exhibits excellent recognition performance in multi-class cell-phone recognition.
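To make the feature-extraction idea concrete, the following is a minimal sketch of how a sequence of Gaussian mean matrices might be built from frame-level features. It is not the authors' implementation. The abstract only states that a Gaussian mixture model is used; the choice of MAP mean adaptation from a shared prior GMM, the fixed-length segmentation, and the names `segment_frames`, `adapt_means`, `sequential_mean_matrices`, `seg_len`, and `relevance` are all assumptions made for illustration.

```python
import numpy as np


def segment_frames(features, seg_len):
    """Split a (T, D) array of frame-level features into (S, seg_len, D) segments.

    Trailing frames that do not fill a whole segment are dropped.
    """
    n_seg = features.shape[0] // seg_len
    return features[: n_seg * seg_len].reshape(n_seg, seg_len, features.shape[1])


def adapt_means(segment, weights, means, variances, relevance=16.0):
    """MAP-adapt the means of a diagonal-covariance GMM to one segment.

    ASSUMPTION: MAP adaptation is an illustrative choice, not confirmed by the source.
    segment: (N, D); weights: (K,); means, variances: (K, D). Returns (K, D).
    """
    diff = segment[:, None, :] - means[None, :, :]                # (N, K, D)
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)    # (K,)
    log_prob = -0.5 * (np.sum(diff ** 2 / variances[None], axis=2) + log_norm[None, :])
    log_post = np.log(weights)[None, :] + log_prob                # (N, K)
    log_post -= log_post.max(axis=1, keepdims=True)               # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)                       # responsibilities

    n_k = post.sum(axis=0)                                        # soft counts, (K,)
    seg_mean = (post.T @ segment) / np.maximum(n_k, 1e-10)[:, None]
    alpha = (n_k / (n_k + relevance))[:, None]                    # adaptation weight
    return alpha * seg_mean + (1.0 - alpha) * means


def sequential_mean_matrices(features, seg_len, weights, means, variances):
    """Return an (S, K, D) array: one adapted mean matrix per temporal segment."""
    segments = segment_frames(features, seg_len)
    return np.stack([adapt_means(s, weights, means, variances) for s in segments])
```

Under these assumptions, each (K, D) matrix summarizes the short-term spectral distribution of one segment, while the ordering of the S matrices preserves the long-term evolution; an (S, K, D) array of this kind is the sort of input a 3D-convolution plus BiLSTM network could consume. For example, 103 frames of 13-dimensional features with `seg_len=25` and a 4-component prior GMM yield an array of shape (4, 4, 13).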

* 29 pages, 4 figures
