7th International Conference on Spoken Language Processing
September 16-20, 2002
Corpus-based text-to-speech (CB-TTS) has recently been studied actively throughout the world. Statistical training methods are generally applied to derive prosodic rules in CB-TTS, and the classification and regression tree (CART) is one of the most widely used. In this paper, we present an efficient CART training approach based on z-score phonetic normalization. The idea comes from the fact that the three most important parameters in CART training for segmental prosody are the phone itself and its left and right neighboring phones, especially in Korean. Our approach effectively reduces the number of CART terminal nodes: the reduction ratios are approximately 14-94% for segmental duration estimation and 45-70% for intensity estimation. The experimental results also show that phonetic normalization slightly reduces the estimation errors.
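As an illustration of the general idea, the following minimal sketch applies a per-phone z-score, z = (x - mean) / std, to segmental values such as durations before they would be passed to a regression tree, and inverts the transform afterwards. The abstract does not specify the exact procedure, so the function names, the use of the population standard deviation, the fallback for zero variance, and the sample values are all assumptions made for this example, not the authors' implementation.

```python
from collections import defaultdict
from statistics import mean, pstdev


def phone_stats(samples):
    """Compute (mean, std) per phone from (phone, value) pairs.

    Assumption: population standard deviation is used; a phone whose
    values are all identical falls back to std = 1.0 to avoid division
    by zero.
    """
    by_phone = defaultdict(list)
    for phone, value in samples:
        by_phone[phone].append(value)
    stats = {}
    for phone, values in by_phone.items():
        sigma = pstdev(values)
        stats[phone] = (mean(values), sigma if sigma > 0 else 1.0)
    return stats


def normalize(phone, value, stats):
    """Map a raw value to its z-score within the phone's distribution."""
    mu, sigma = stats[phone]
    return (value - mu) / sigma


def denormalize(phone, z, stats):
    """Map a predicted z-score back to the raw scale for that phone."""
    mu, sigma = stats[phone]
    return z * sigma + mu


# Hypothetical durations in milliseconds, for illustration only.
samples = [("a", 80.0), ("a", 100.0), ("a", 120.0), ("s", 60.0), ("s", 70.0)]
stats = phone_stats(samples)
z = normalize("a", 120.0, stats)
restored = denormalize("a", z, stats)
```

Because each phone's values are centered and scaled separately, a tree trained on the z-scores would no longer need splits that only separate phones by their typical duration or intensity, which is one plausible reading of why the number of terminal nodes drops.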
Bibliographic reference. Song, Hoeun / Kim, Jaein / Lee, Kyongrok / Kim, Jinyoung (2002): "Phonetic normalization using z-score in segmental prosody estimation for corpus-based TTS system", In ICSLP-2002, 2393-2396.