Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Mark Thomas

Detection of Glottal Closure Instants from Speech Signals: a Quantitative Review

Dec 28, 2019

Thomas Drugman, Mark Thomas, Jon Gudnason, Patrick Naylor, Thierry Dutoit

Figure 1 for Detection of Glottal Closure Instants from Speech Signals: a Quantitative Review

Figure 2 for Detection of Glottal Closure Instants from Speech Signals: a Quantitative Review

Figure 3 for Detection of Glottal Closure Instants from Speech Signals: a Quantitative Review

Figure 4 for Detection of Glottal Closure Instants from Speech Signals: a Quantitative Review

Abstract:The pseudo-periodicity of voiced speech can be exploited in several speech processing applications. This requires however that the precise locations of the Glottal Closure Instants (GCIs) are available. The focus of this paper is the evaluation of automatic methods for the detection of GCIs directly from the speech waveform. Five state-of-the-art GCI detection algorithms are compared using six different databases with contemporaneous electroglottographic recordings as ground truth, and containing many hours of speech by multiple speakers. The five techniques compared are the Hilbert Envelope-based detection (HE), the Zero Frequency Resonator-based method (ZFR), the Dynamic Programming Phase Slope Algorithm (DYPSA), the Speech Event Detection using the Residual Excitation And a Mean-based Signal (SEDREAMS) and the Yet Another GCI Algorithm (YAGA). The efficacy of these methods is first evaluated on clean speech, both in terms of reliabililty and accuracy. Their robustness to additive noise and to reverberation is also assessed. A further contribution of the paper is the evaluation of their performance on a concrete application of speech processing: the causal-anticausal decomposition of speech. It is shown that for clean speech, SEDREAMS and YAGA are the best performing techniques, both in terms of identification rate and accuracy. ZFR and SEDREAMS also show a superior robustness to additive noise and reverberation.

Via

Access Paper or Ask Questions

Marine Mammal Species Classification using Convolutional Neural Networks and a Novel Acoustic Representation

Jul 30, 2019

Mark Thomas, Bruce Martin, Katie Kowarski, Briand Gaudet, Stan Matwin

Figure 1 for Marine Mammal Species Classification using Convolutional Neural Networks and a Novel Acoustic Representation

Figure 2 for Marine Mammal Species Classification using Convolutional Neural Networks and a Novel Acoustic Representation

Figure 3 for Marine Mammal Species Classification using Convolutional Neural Networks and a Novel Acoustic Representation

Figure 4 for Marine Mammal Species Classification using Convolutional Neural Networks and a Novel Acoustic Representation

Abstract:Research into automated systems for detecting and classifying marine mammals in acoustic recordings is expanding internationally due to the necessity to analyze large collections of data for conservation purposes. In this work, we present a Convolutional Neural Network that is capable of classifying the vocalizations of three species of whales, non-biological sources of noise, and a fifth class pertaining to ambient noise. In this way, the classifier is capable of detecting the presence and absence of whale vocalizations in an acoustic recording. Through transfer learning, we show that the classifier is capable of learning high-level representations and can generalize to additional species. We also propose a novel representation of acoustic signals that builds upon the commonly used spectrogram representation by way of interpolating and stacking multiple spectrograms produced using different Short-time Fourier Transform (STFT) parameters. The proposed representation is particularly effective for the task of marine mammal species classification where the acoustic events we are attempting to classify are sensitive to the parameters of the STFT.

* 16 pages, To appear in ECML-PKDD 2019

Via

Access Paper or Ask Questions