September 22-25, 1997
This paper describes the processing of 2465 sentences (utterances) selected by phonetic rules from a large corpus of recent newspaper text, including "People's Daily", as material for speech recognition and speech synthesis databases. The sentence set covers both phonetic phenomena and sentence patterns. We first consider the phonetic distribution across syllable boundaries: inter-syllabic diphones, inter-syllabic triphones and final-initial structures. Syllabic balance ensures that intra-syllabic phenomena such as phonemes, initials/finals and consonants/vowels are also covered. Roughly 17 kinds of sentence patterns appear in the sentence set. We have also created a set of phonetically balanced phrases of 2 to 4 syllables that includes all of the tone structures.
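As a rough illustration of selecting sentences for coverage of final-initial structures, the sketch below uses a greedy set-cover heuristic. The abstract does not state which selection procedure was actually used, so the greedy strategy, the `(initial, final)` syllable representation, and the names `boundary_units` and `greedy_select` are all assumptions made for this example.

```python
from typing import List, Set, Tuple

# Hypothetical representation: a syllable is (initial, final),
# e.g. ("zh", "ong"); use "" for a zero initial.
Syllable = Tuple[str, str]
Unit = Tuple[str, str]


def boundary_units(sentence: List[Syllable]) -> Set[Unit]:
    """Return the (final, next initial) pairs found at syllable boundaries."""
    return {(a[1], b[0]) for a, b in zip(sentence, sentence[1:])}


def greedy_select(corpus: List[List[Syllable]]) -> List[int]:
    """Pick sentence indices greedily until no sentence adds a new unit.

    This is a generic greedy set-cover heuristic, assumed here for
    illustration; it is not taken from the paper.
    """
    units = [boundary_units(s) for s in corpus]
    covered: Set[Unit] = set()
    chosen: List[int] = []
    while True:
        # Sentence that contributes the most not-yet-covered units.
        best = max(
            range(len(corpus)),
            key=lambda i: len(units[i] - covered),
            default=None,
        )
        if best is None or not (units[best] - covered):
            break
        chosen.append(best)
        covered |= units[best]
    return chosen
```

The same loop could be run over inter-syllabic diphones or triphones by swapping in a different unit-extraction function; a balanced (rather than merely covering) set would need frequency targets, which this sketch does not model.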
Bibliographic reference. Zu, Yiqing (1997): "Sentence design for speech synthesis and speech recognition database by phonetic rules", In EUROSPEECH-1997, 743-746.