8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Acoustic Model Selection and Voice Quality Assessment for HMM-Based Mandarin Speech Synthesis

Wentao Gu, Keikichi Hirose

University of Tokyo, Japan

This paper presents a preliminary study on implementing an HMM-based Mandarin speech synthesis system, whose main advantage lies in its ability to generate a variety of voices. Several acoustic unit representations for Mandarin are compared in order to select an optimal acoustic model set. Syllabic vs. sub-syllabic, context-independent vs. context-dependent, toneless vs. tonal, and initial-final vs. preme-toneme models, as well as models with various numbers of states, are each investigated. To take full advantage of HMM-based speech synthesis, some factors affecting the quality of speaker adaptation, especially the amount of adaptation data, are also studied.


Bibliographic reference. Gu, Wentao / Hirose, Keikichi (2003): "Acoustic model selection and voice quality assessment for HMM-based Mandarin speech synthesis", in EUROSPEECH-2003, 2457-2460.