ISCA Archive

International Symposium on Chinese Spoken Language Processing (ISCSLP 2000)

Fragrant Hill Hotel, Beijing
October 13-15, 2000

Session Oral 1


Context-Independent Chinese Initial-Final Acoustic Modeling

Authors: Jing LI, Fang ZHENG, Wenhu WU
Affiliation: Center of Speech Technology, State Key Laboratory of Intelligent Technology and Systems,
Department of Computer Science & Technology, Tsinghua University, Beijing
Mailto: lijing@sp.cs.tsinghua.edu.cn

ABSTRACT

In this paper, a method for Context-Independent (CI) Chinese Initial-Final acoustic modeling for continuous speech recognition is proposed. The initial-final (I/F) structure is a characteristic of the Chinese language. Initials and finals are smaller units than syllables, and using them helps to reduce the number of speech recognition units (SRUs). Furthermore, they make it possible to build context-dependent (CD) models. In our experiments, we use knowledge-based criteria to define the CI initial-final units. Four kinds of CI initial-final units are considered in this paper. The experimental results show that the accuracy of the CI initial-final models is close to or lower than that of the CI syllable model, but the model size is significantly reduced.

Page 23


Frequency Analysis of The Vowels in Cantonese

Authors: Eric ZEE
Affiliation: Phonetics Lab, Dept. of CTL, City University of HONG KONG, HONG KONG
Mailto: ctlzee@cityu.edu.hk

ABSTRACT

The study investigates the spectral characteristics of the vowels in Cantonese. Results show that (1) the vowels in the (C)V:S syllables undershoot in the formant frequencies relative to the canonical target formant pattern associated with the same vowels in the (C)V: syllables; (2) the center formant frequency values for the vowels in the (C)VS syllables are not representative of the quality of the vowels due to short vowel duration; and (3) the center formant frequencies for the vowels in the (C)V: and (C)V:S syllables can be useful in terms of vowel transcription.

Page 27


Annotation and Use of Speech Production Corpus for Building Language-Universal Speech Recognizers

Authors: Jiping SUN, Xing JING, Li DENG
Affiliation: Department of Electrical and Computer Engineering, University of Waterloo, Waterloo
Mailto: Jsun@crg3.uwaterloo.ca
xjing@crg3.uwaterloo.ca
deng@crg3.uwaterloo.ca

ABSTRACT

A corpus linguistic study is reported in this paper, guided by articulatory phonology and by general phonetic principles of speech production. A direct application of this study is the construction of Hidden Markov Model topologies for automatic speech recognition, taking into account integrated multilingualism with the consideration of the common physiological organs and processes involved in the production of speech sounds from the world’s languages. We demonstrate in this study that incorporation of speech production principles can provide effective constraints on pronunciation modeling for the purpose of building language-universal speech recognizers.

Page 31


Rule-based Word Pronunciation Networks Generation For Mandarin Speech Recognition

Authors: Yi LIU, Pascale FUNG
Affiliation: Human Language Technology Center, Department of Electrical and Electronic Engineering, University of Science and Technology, Hong Kong
Mailto: eelyx@ee.ust.hk
pascale@ee.ust.hk

ABSTRACT

Modeling pronunciation variation in spontaneous speech is very important for improving recognition accuracy. One limitation of current recognition systems is that their dictionaries contain only one standard pronunciation for each entry, so the amount of variability that can be modeled is very limited. In this paper, we propose to generate rule-based pronunciation networks to replace the traditional dictionary in the decoder. The networks consider the special structure of Chinese and incorporate acceptable variants of each Chinese syllable. In addition, an automatic learning algorithm is designed to obtain the variation rules. The proposed method was evaluated on the Hub4NE 1997 Mandarin Broadcast News Corpus with the HLTC stack decoder. The syllable recognition error rate was reduced by 3.20% absolute when both intra- and inter-syllable variations were modeled.

Page 35


Prosodic Structure and Hierarchical Stress in Utterance of Standard Chinese --- One of Cues to Chinese Intonation

Authors: Maocan LIN, Jingzhu YAN
Affiliation: Phonetics Laboratory, Institute of Linguistics, CASS, Beijing
Mailto: Maocanlin@263.net

ABSTRACT

The prosodic word, its prominence, and the prosodic phrase are examined in this experiment. Prominence within the prosodic word is related to stress. It seems to us that hierarchical stress in spoken sentences is one of the intonational cues in Chinese. Tone and intonation in Chinese are two different phonological events in the spoken sentence.

Page 39


Speech Corpus Collection and Annotation

Authors: Aijun LI, Xiaoxia CHEN, Guohua SUN, Wu HUA, Zhigang YIN, Yiqing ZU
Affiliation: Phonetic laboratory, Institute of Linguistics, Chinese Academy of Social Sciences
Mailto: Liaj@linguistics.cass.net.cn

ABSTRACT

This paper introduces a read speech corpus and a spontaneous speech corpus to show how task-dependent speech corpora are collected and annotated. Additionally, the segmental labeling convention SAMPA-C and the prosodic labeling convention C-ToBI are described. Finally, known and new results are given or compared for these two annotated corpora.

Page 45