7th International Conference on Spoken Language Processing
September 16-20, 2002
This paper presents a multi-tier algorithm for extracting a sentence set from a large raw text corpus for Mandarin speech synthesis, taking into account varied prosodic and spectral characteristics. These characteristics are statistically analyzed from the text corpus and transcribed as syllable-sized unit candidates in a multi-tier way. The unit candidates cover all syllables, the typical phonetic and tone contexts of each syllable, and the effects of phrase construction and sentence intonation on the syllable. The algorithm seeks to maximize the coverage of unit candidates in the extracted sentence set. Experiments were run on a 580k-sentence corpus including dialog and news text, from which a set of 9,479 sentences was selected. The set covers 87.7% of the primary prosodic and spectral characteristics in statements and 61.0% of those in questions. The paper also discusses the selection of the raw text corpus.
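The coverage-maximization idea can be illustrated with a generic greedy selection. This is a minimal sketch and not the authors' multi-tier algorithm, whose details the abstract does not give. It assumes each sentence has already been reduced to a set of unit labels; the names `greedy_select` and `coverage` and the label format are hypothetical.

```python
from typing import Dict, Hashable, List, Set


def greedy_select(sentence_units: Dict[str, Set[Hashable]],
                  max_sentences: int) -> List[str]:
    """Repeatedly pick the sentence that adds the most uncovered units.

    Generic greedy set-cover heuristic (an assumption for illustration),
    not the multi-tier procedure described in the paper.
    """
    covered: Set[Hashable] = set()
    selected: List[str] = []
    remaining = dict(sentence_units)  # shallow copy; caller's dict is untouched
    while remaining and len(selected) < max_sentences:
        # Sentence with the largest number of not-yet-covered units.
        best = max(remaining, key=lambda s: len(remaining[s] - covered))
        gain = remaining[best] - covered
        if not gain:
            break  # no sentence adds anything new
        selected.append(best)
        covered |= gain
        del remaining[best]
    return selected


def coverage(sentence_units: Dict[str, Set[Hashable]],
             selected: List[str]) -> float:
    """Fraction of all distinct units in the corpus that the selection covers."""
    all_units = set().union(*sentence_units.values())
    if not all_units:
        return 0.0
    got = set().union(*(sentence_units[s] for s in selected))
    return len(got) / len(all_units)
```

In a multi-tier setting, one plausible extension is to run such a selection once per tier (syllables, then contexts, then phrase and intonation effects), carrying the selected sentences forward; that arrangement is likewise an assumption rather than something stated in the abstract.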
Bibliographic reference. Ni, Jinfu / Kawai, Hisashi (2002): "Design of a Mandarin sentence set for corpus-based speech synthesis by use of a multi-tier algorithm taking account of the varied prosodic and spectral characteristics", In ICSLP-2002, 2361-2364.