Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

Speech Editor Based on Enhanced User-System Interaction for High Quality Text-To-Speech Synthesis

Kazuo Hakoda, Tomohisa Hirokawa, Kenzo Itoh

NTT Human Interface Laboratories, Kanagawa, Japan

This paper describes a new speech editor based on enhanced user-system interaction that produces high-quality synthesized speech using an advanced text-to-speech synthesis method. A prototype system was built on a workstation running the Open Window system. With the prototype, the operator can correct the faults of the text-to-speech synthesis method and produce high-quality synthesized speech from input Japanese text. System operation is optimized by adopting a real-time synthesizer and a GUI designed around mouse operations. A system evaluation confirms that character-level correction is very effective for improving synthesized speech quality. The proposed system can be used to provide voice messages for a conventional digital audio response unit at low cost.

