Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Analysis of Major Factors of Naturalness Degradation in Concatenative Synthesis

Toshio Hirai (1), Hisashi Kawai (2), Minoru Tsuzaki (3), Nobuyuki Nishizawa (1)

(1) ATR-SLT, Japan; (2) KDDI R&D Laboratories Inc., Japan; (3) Kyoto City University of Arts, Japan

To improve a speech synthesis system effectively, it is important to identify and focus on the modules that contribute most to naturalness degradation in synthesized speech. In this paper, we describe the design of a perception experiment that measures the effect of each module separately. The experiment uses synthesized speech stimuli whose intermediate information is modified during the synthesis process. A perception experiment evaluating a Japanese concatenative speech synthesis system revealed that the text processing module and a part of the feature prediction module (the part for the fundamental frequency) were the major factors in degrading naturalness.

Bibliographic reference. Hirai, Toshio / Kawai, Hisashi / Tsuzaki, Minoru / Nishizawa, Nobuyuki (2005): "Analysis of major factors of naturalness degradation in concatenative synthesis", In INTERSPEECH-2005, 1925-1928.