Ira Gerson

Text-To-Speech Conversion with Neural Networks: A Recurrent TDNN Approach

Nov 24, 1998

Orhan Karaali, Gerald Corrigan, Ira Gerson, Noel Massey

Abstract: This paper describes the design of a neural network that performs the phonetic-to-acoustic mapping in a speech synthesis system. The use of a time-domain neural network architecture limits discontinuities that occur at phone boundaries. Recurrent data input also helps smooth the output parameter tracks. Independent testing has demonstrated that the voice quality produced by this system compares favorably with speech from existing commercial text-to-speech systems.

* Proceedings of Eurospeech (1997) 561-564. Rhodes, Greece
* 4 pages, PostScript

Speech Synthesis with Neural Networks

Nov 24, 1998

Orhan Karaali, Gerald Corrigan, Ira Gerson

Abstract: Text-to-speech conversion has traditionally been performed either by concatenating short samples of speech or by using rule-based systems to convert a phonetic representation of speech into an acoustic representation, which is then converted into speech. This paper describes a system that uses a time-delay neural network (TDNN) to perform this phonetic-to-acoustic mapping, with another neural network to control the timing of the generated speech. The neural network system requires less memory than a concatenation system, and performed well in tests comparing it to commercial systems using other technologies.

* World Congress on Neural Networks (1996) 45-50. San Diego
* 6 pages, PostScript
