The ESCA Workshop on Speech Synthesis
September 25-28, 1990
This paper presents the design and performance of two components of our multi-language text-to-speech system. A syntactic boundary neural network is trained on many five-word sequences and is used to determine the boundaries that exist before the middle word of a given word sequence. A letter-to-phoneme conversion neural network converts input letters to phonemes. To ensure reliability, we employed multiple networks and a unification layer. In an evaluation on English, the syntactic boundary neural network located syntactic boundaries with 96% accuracy (trained on 500 sentences and tested on another 500 sentences), and the letter-to-phoneme conversion neural network converted letters to phonemes with 85% accuracy (trained on 1000 words and tested on another 1000 words).
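As an illustration of the five-word input described in the abstract, the following is a minimal sketch of how a sentence might be cut into five-word sequences, one per word, with that word in the middle position. The function name `five_word_windows`, the `<pad>` token, and the edge-padding strategy are assumptions made for this example; the abstract does not say how sentence edges or word encodings are handled.

```python
from typing import List, Tuple

# Hypothetical padding token: the abstract does not specify edge handling.
PAD = "<pad>"


def five_word_windows(words: List[str], size: int = 5) -> List[Tuple[str, ...]]:
    """Return one window per input word, with that word in the middle slot."""
    if size % 2 == 0:
        raise ValueError("size must be odd so that a middle word exists")
    half = size // 2
    # Pad both ends so the first and last words can also sit in the middle.
    padded = [PAD] * half + list(words) + [PAD] * half
    return [tuple(padded[i:i + size]) for i in range(len(words))]


sentence = "the old man sat down".split()
windows = five_word_windows(sentence)
for window in windows:
    # A boundary classifier would decide whether a boundary precedes window[2].
    print(window)
```

Each window would then be encoded and passed to a classifier that predicts the boundary before the middle word; the encoding and the network itself are outside the scope of this sketch.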
Bibliographic reference. Matsumoto, Tatsuro / Yamaguchi, Yukiko (1990): "A multi-language text-to-speech system using neural networks", In SSW1-1990, 269-272.