9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Intonation Modeling of Mandarin Chinese Using a Superpositional Approach

Pablo Daniel Aguero (1), Antonio Bonafonte (2), Lu Yu (2), Juan Carlos Tulli (1)

(1) University of Mar del Plata, Argentina
(2) Universitat Politècnica de Catalunya, Spain

The intonation model is an important component of text-to-speech systems for obtaining natural and expressive synthetic speech. In this paper we propose a superpositional intonation model for Mandarin Chinese. The model is composed of a syllable component and a phrase component. Its parameters are estimated using JEMA, a training approach that offers advantages in robustness and precision. Parameter estimation and model training are combined in a loop that progressively refines both the parameterization and the model. The high correlation (0.82) between synthetic and original contours on the test data shows the suitability of this approach for modeling Mandarin. Furthermore, the high score obtained in the subjective evaluation (MOS = 4.06) confirms the objective results.
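
As an illustration of the two ideas named above, the following minimal sketch adds a phrase component and a syllable component to form a contour and then computes the Pearson correlation between that contour and a reference. It is not the authors' implementation and does not reproduce JEMA. The function names (superpose, pearson), the sample-by-sample addition, and all numeric values are assumptions chosen for illustration only.

```python
import math


def superpose(phrase, syllable):
    """Add two component contours sample by sample (assumed additive model)."""
    if len(phrase) != len(syllable):
        raise ValueError("components must have the same length")
    return [p + s for p, s in zip(phrase, syllable)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy values, invented for this sketch (not data from the paper).
phrase = [5.0, 4.8, 4.6, 4.4, 4.2, 4.0]      # slowly declining phrase component
syllable = [0.0, 0.3, 0.1, -0.2, 0.2, 0.0]   # local syllable-level movements
predicted = superpose(phrase, syllable)
reference = [5.1, 5.0, 4.8, 4.1, 4.5, 3.9]   # stand-in for an original contour

r = pearson(predicted, reference)
print(f"correlation between predicted and reference contour: {r:.2f}")
```

In an evaluation like the one summarized above, the reference would be the original contour of a test utterance and the prediction would come from the trained model; here both are placeholders.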

