Interactive Voice Technology for Telecommunications Applications (IVTTA'98)

Torino, Italy
September 29-30, 1998

A Highly Intelligible Speech Synthesis for Banking Services in Financial Network System ANSER

Takao Koyama, Takashi Horie, Takashi Yoshioka, Fuminori Yoshitani, Jun-ichi Takahashi

NTT DATA CORPORATION, Laboratory for Information Technology, Kowa Kawasaki, Kawasaki-shi, Kanagawa, Japan

This paper describes a waveform-based Japanese speech synthesis method that has been successfully added to the ANSER (Automatic answer Network System for Electronic Request) system, which is widely used for banking services in Japan. The method produces highly intelligible speech comparable to a natural voice. Its key features include a waveform dictionary containing specific waveforms for efficient pitch control, waveform-CV units based on Japanese syllables, accurate accent control, and efficient waveform concatenation based on signal interpolation. For 500 Japanese family names used in actual service, the method attained an intelligibility of 90%, compared with 79% for the current LSP-CVC method.

Bibliographic reference. Koyama, Takao / Horie, Takashi / Yoshioka, Takashi / Yoshitani, Fuminori / Takahashi, Jun-ichi (1998): "A highly intelligible speech synthesis for banking services in financial network system ANSER", In IVTTA'98, 87-90.