EUROSPEECH 2003 - INTERSPEECH 2003
From the perspective of speech technology development for unlimited Mandarin Chinese TTS, two issues appear most impedimental: (1.) how to predict prosody from text, and (2.) how to achieve better naturalness for speech output. These impediments somewhat brought out the major pitfalls in related research, i.e., characteristics of Chinese connected speech and the overall rhythmic structure of speech flow. This paper discusses where the problems stem from and how some solutions could be found. We propose that for Mandarin, prosody research needs to include the following: (1.) characteristics of Mandarin connected speech that constitute the prosodic properties in speech flow, i.e., units and boundaries, (2.) scope and type of speech data collected, i.e., text other than isolated sentences, (3.) prosody in relation to speech planning, i.e., information other than lexical, syntactic and semantic, and (4.) an overall organization of prosody for speech flow, i.e., a framework that accommodate the above mentioned features.
Bibliographic reference. Tseng, Chiu-yu (2003): "Mandarin speech prosody: issues, pitfalls and directions", In EUROSPEECH-2003, 2341-2344.