Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Hideki Kawahara

Proposal of protocols for speech materials acquisition and presentation assisted by tools based on structured test signals

Sep 30, 2024

Hideki Kawahara, Ken-Ichi Sakakibara, Mitsunori Mizumachi, Kohei Yatabe

Abstract:We propose protocols for acquiring speech materials, making them reusable for future investigations, and presenting them for subjective experiments. We also provide means to evaluate existing speech materials' compatibility with target applications. We built these protocols and tools based on structured test signals and analysis methods, including a new family of the Time-Stretched Pulse (TSP). Over a billion times more powerful computational (including software development) resources than a half-century ago enabled these protocols and tools to be accessible to under-resourced environments.

* 6 pages 6 figures, accepted ORIENTAL COCOSDA 2024

Via

Access Paper or Ask Questions

Interactive tools for making temporally variable, multiple-attributes, and multiple-instances morphing accessible: Flexible manipulation of divergent speech instances for explorational research and education

Apr 20, 2024

Hideki Kawahara, Masanori Morise

Abstract:We generalized a voice morphing algorithm capable of handling temporally variable, multiple-attributes, and multiple instances. The generalized morphing provides a new strategy for investigating speech diversity. However, excessive complexity and the difficulty of preparation have prevented researchers and students from enjoying its benefits. To address this issue, we introduced a set of interactive tools to make preparation and tests less cumbersome. These tools are integrated into our previously reported interactive tools as extensions. The introduction of the extended tools in lessons in graduate education was successful. Finally, we outline further extensions to explore excessively complex morphing parameter settings.

* 5 pages, 7 figures, submitted to Acoustical Science and Technology of Acoustical Society of Japan

Via

Access Paper or Ask Questions

Simultaneous Measurement of Multiple Acoustic Attributes Using Structured Periodic Test Signals Including Music and Other Sound Materials

Sep 06, 2023

Hideki Kawahara, Kohei Yatabe, Ken-Ichi Sakakibara, Mitsunori Mizumachi, Tatsuya Kitamura

Figure 1 for Simultaneous Measurement of Multiple Acoustic Attributes Using Structured Periodic Test Signals Including Music and Other Sound Materials

Figure 2 for Simultaneous Measurement of Multiple Acoustic Attributes Using Structured Periodic Test Signals Including Music and Other Sound Materials

Figure 3 for Simultaneous Measurement of Multiple Acoustic Attributes Using Structured Periodic Test Signals Including Music and Other Sound Materials

Figure 4 for Simultaneous Measurement of Multiple Acoustic Attributes Using Structured Periodic Test Signals Including Music and Other Sound Materials

Abstract:We introduce a general framework for measuring acoustic properties such as liner time-invariant (LTI) response, signal-dependent time-invariant (SDTI) component, and random and time-varying (RTV) component simultaneously using structured periodic test signals. The framework also enables music pieces and other sound materials as test signals by "safeguarding" them by adding slight deterministic "noise." Measurement using swept-sin, MLS (Maxim Length Sequence), and their variants are special cases of the proposed framework. We implemented interactive and real-time measuring tools based on this framework and made them open-source. Furthermore, we applied this framework to assess pitch extractors objectively.

* 8 pages, 17 figures, accepted for APSIPA ASC 2023

Via

Access Paper or Ask Questions

Measuring pitch extractors' response to frequency-modulated multi-component signals

Apr 02, 2022

Hideki Kawahara, Kohei Yatabe, Ken-Ichi Sakakibara, Tatsuya Kitamura, Hideki Banno, Masanori Morise

Figure 1 for Measuring pitch extractors' response to frequency-modulated multi-component signals

Figure 2 for Measuring pitch extractors' response to frequency-modulated multi-component signals

Figure 3 for Measuring pitch extractors' response to frequency-modulated multi-component signals

Figure 4 for Measuring pitch extractors' response to frequency-modulated multi-component signals

Abstract:This article focuses on the research tool for investigating the fundamental frequencies of voiced sounds. We introduce an objective and informative measurement method of pitch extractors' response to frequency-modulated tones. The method uses a new test signal for acoustic system analysis. The test signal enables simultaneous measurement of the extractors' responses. They are the modulation frequency response and the total distortion, including intermodulation distortions. We applied this method to various pitch extractors and placed them on several performance maps. We used the proposed method to fine-tune one of the extractors to make it the best fit tool for scientific research of voice fundamental frequencies.

* 11 pages, 9 figures, The following article has been submitted to/accepted by The Acoustical Society of America. After it is published, it will be found at http://asa.scitation.org/journal/jas

Via

Access Paper or Ask Questions

An objective test tool for pitch extractors' response attributes

Apr 02, 2022

Hideki Kawahara, Kohei Yatabe, Ken-Ichi Sakakibara, Tatsuya Kitamura, Hideki Banno, Masanori Morise

Figure 1 for An objective test tool for pitch extractors' response attributes

Figure 2 for An objective test tool for pitch extractors' response attributes

Figure 3 for An objective test tool for pitch extractors' response attributes

Figure 4 for An objective test tool for pitch extractors' response attributes

Abstract:We propose an objective measurement method for pitch extractors' responses to frequency-modulated signals. It enables us to evaluate different pitch extractors with unified criteria. The method uses extended time-stretched pulses combined by binary orthogonal sequences. It provides simultaneous measurement results consisting of the linear and the non-linear time-invariant responses and random and time-varying responses. We tested representative pitch extractors using fundamental frequencies spanning 80~Hz to 400~Hz with 1/48 octave steps and produced more than 1000 modulation frequency response plots. We found that making scientific visualization by animating these plots enables us to understand different pitch extractors' behavior at once. Such efficient and effortless inspection is impossible by inspecting all individual plots. The proposed measurement method with visualization leads to further improvement of the performance of one of the extractors mentioned above. In other words, our procedure turns the specific pitch extractor into the best reliable measuring equipment that is crucial for scientific research. We open-sourced MATLAB codes of the proposed objective measurement method and visualization procedure.

* 5 pages, 9 figures, submitted to Interspeech2022. arXiv admin note: text overlap with arXiv:2111.03629

Via

Access Paper or Ask Questions

Safeguarding test signals for acoustic measurement using arbitrary sounds

Dec 21, 2021

Hideki Kawahara, Kohei Yatabe

Figure 1 for Safeguarding test signals for acoustic measurement using arbitrary sounds

Figure 2 for Safeguarding test signals for acoustic measurement using arbitrary sounds

Figure 3 for Safeguarding test signals for acoustic measurement using arbitrary sounds

Figure 4 for Safeguarding test signals for acoustic measurement using arbitrary sounds

Abstract:We propose a simple method to measure acoustic responses using any sounds by converting them suitable for measurement. This method enables us to use music pieces for measuring acoustic conditions. It is advantageous to measure such conditions without annoying test sounds to listeners. In addition, applying the underlying idea of simultaneous measurement of multiple paths provides practically valuable features. For example, it is possible to measure deviations (temporally stable, random, and time-varying) and the impulse response while reproducing slightly modified contents under target conditions. The key idea of the proposed method is to add relatively small deterministic signals that sound like noise to the original sounds. We call the converted sounds safeguarded test signals.

* 4 pages, 10 figures, submitted to Acoustical Science and Technology

Via

Access Paper or Ask Questions

Objective measurement of pitch extractors' responses to frequency modulated sounds and two reference pitch extraction methods for analyzing voice pitch responses to auditory stimulation

Nov 05, 2021

Hideki Kawahara, Kohei Yatabe, Ken-Ichi Sakakibara, Tatsuya Kitamura, Hideki Banno, Masanori Morise

Figure 1 for Objective measurement of pitch extractors' responses to frequency modulated sounds and two reference pitch extraction methods for analyzing voice pitch responses to auditory stimulation

Figure 2 for Objective measurement of pitch extractors' responses to frequency modulated sounds and two reference pitch extraction methods for analyzing voice pitch responses to auditory stimulation

Figure 3 for Objective measurement of pitch extractors' responses to frequency modulated sounds and two reference pitch extraction methods for analyzing voice pitch responses to auditory stimulation

Figure 4 for Objective measurement of pitch extractors' responses to frequency modulated sounds and two reference pitch extraction methods for analyzing voice pitch responses to auditory stimulation

Abstract:We propose an objective measurement method for pitch extractors' responses to frequency-modulated signals. The method simultaneously measures the linear and the non-linear time-invariant responses and random and time-varying responses. It uses extended time-stretched pulses combined by binary orthogonal sequences. Our recent finding of involuntary voice pitch response to auditory stimulation while voicing motivated this proposal. The involuntary voice pitch response provides means to investigate voice chain subsystems individually and objectively. This response analysis requires reliable and precise pitch extraction. We found that existing pitch extractors failed to correctly analyze signals used for auditory stimulation by using the proposed method. Therefore, we propose two reference pitch extractors based on the instantaneous frequency analysis and multi-resolution power spectrum analysis. The proposed extractors correctly analyze the test signals. We open-sourced MATLAB codes to measure pitch extractors and codes for conducting the voice pitch response experiment on our GitHub repository.

* 5 pages, 5 figures, submitted ICASSP2022

Via

Access Paper or Ask Questions

Implementation of interactive tools for investigating fundamental frequency response of voiced sounds to auditory stimulation

Sep 23, 2021

Hideki Kawahara, Toshie Matsui Kohei, Yatabe Ken-Ichi Sakakibara Minoru Tsuzaki Masanori Morise Toshio Irino

Figure 1 for Implementation of interactive tools for investigating fundamental frequency response of voiced sounds to auditory stimulation

Figure 2 for Implementation of interactive tools for investigating fundamental frequency response of voiced sounds to auditory stimulation

Figure 3 for Implementation of interactive tools for investigating fundamental frequency response of voiced sounds to auditory stimulation

Figure 4 for Implementation of interactive tools for investigating fundamental frequency response of voiced sounds to auditory stimulation

Abstract:We introduced a measurement procedure for the involuntary response of voice fundamental-frequency to frequency modulated auditory stimulation. This involuntary response plays an essential role in voice fundamental frequency control while less investigated due to technical difficulties. This article introduces an interactive and real-time tool for investigating this response and supporting tools adopting our new measurement method. The method enables simultaneous measurement of multiple system properties based on a novel set of extended time-stretched pulses combined with orthogonalization. We made MATLAB implementation of these tools available as an open-source repository. This article also provides the detailed measurement procedure using the interactive tool followed by offline measurement tools for conducting subjective experiments and statistical analyses. It also provides technical descriptions of constituent signal processing subsystems as appendices. This application serves as an example for adopting our method to biological system analysis.

* Accepted for APSIPA ASC 2021

Via

Access Paper or Ask Questions

Mixture of orthogonal sequences made from extended time-stretched pulses enables measurement of involuntary voice fundamental frequency response to pitch perturbation

Apr 03, 2021

Hideki Kawahara, Toshie Matsui, Kohei Yatabe, Ken-Ichi Sakakibara, Minoru Tsuzaki, Masanori Morise, Toshio Irino

Figure 1 for Mixture of orthogonal sequences made from extended time-stretched pulses enables measurement of involuntary voice fundamental frequency response to pitch perturbation

Figure 2 for Mixture of orthogonal sequences made from extended time-stretched pulses enables measurement of involuntary voice fundamental frequency response to pitch perturbation

Figure 3 for Mixture of orthogonal sequences made from extended time-stretched pulses enables measurement of involuntary voice fundamental frequency response to pitch perturbation

Figure 4 for Mixture of orthogonal sequences made from extended time-stretched pulses enables measurement of involuntary voice fundamental frequency response to pitch perturbation

Abstract:Auditory feedback plays an essential role in the regulation of the fundamental frequency of voiced sounds. The fundamental frequency also responds to auditory stimulation other than the speaker's voice. We propose to use this response of the fundamental frequency of sustained vowels to frequency-modulated test signals for investigating involuntary control of voice pitch. This involuntary response is difficult to identify and isolate by the conventional paradigm, which uses step-shaped pitch perturbation. We recently developed a versatile measurement method using a mixture of orthogonal sequences made from a set of extended time-stretched pulses (TSP). In this article, we extended our approach and designed a set of test signals using the mixture to modulate the fundamental frequency of artificial signals. For testing the response, the experimenter presents the modulated signal aurally while the subject is voicing sustained vowels. We developed a tool for conducting this test quickly and interactively. We make the tool available as an open-source and also provide executable GUI-based applications. Preliminary tests revealed that the proposed method consistently provides compensatory responses with about 100 ms latency, representing involuntary control. Finally, we discuss future applications of the proposed method for objective and non-invasive auditory response measurements.

* 5 pages, 9 figures, submitted to Interspeech2021

Via

Access Paper or Ask Questions