ISCA Archive

International Symposium on Chinese Spoken Language Processing (ISCSLP 2000)

Fragrant Hill Hotel, Beijing
October 13-15, 2000

Session Poster B2

Improved Strategies For Intelligent Sentence Input Method Engine System

Authors: Ling JIN, Genqing WU, Fang ZHENG, Wenhu WU
Affiliation: Center of Speech Technology, State Key Laboratory of Intelligent Technology and Systems,
Department of Computer Science & Technology, Tsinghua University, Beijing


This paper describes an intelligent full-sentence keyboard input method system for Chinese, based on a tri-gram language model. The system uses efficient algorithms to reduce the size of the language model, accelerate the search, and improve accuracy. The n-gram model is stored in a novel structure that shrinks the model and supports look-ahead and buffering techniques, which reduce the number of disk accesses needed to fetch n-gram units and allow the language model to adapt quickly to the user's domain. In addition, we have designed an efficient dynamic programming algorithm that segments the input alphabetic sequence into syllabic cells, so that the system accommodates different input styles.
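
As a rough illustration of segmenting an unbroken alphabetic string into syllables by dynamic programming, the sketch below picks the split with the fewest syllables. The syllable set, the cost function, and the name `segment` are assumptions made for illustration; the abstract does not give the paper's actual inventory or scoring.

```python
from typing import List, Optional

# Hypothetical, tiny syllable inventory used only for this illustration.
SYLLABLES = {"xi", "an", "xian", "zhong", "guo", "ren", "min", "fang"}
MAX_LEN = max(len(s) for s in SYLLABLES)


def segment(letters: str) -> Optional[List[str]]:
    """Split `letters` into the fewest valid syllables, or None if impossible."""
    n = len(letters)
    inf = float("inf")
    best = [inf] * (n + 1)   # best[i]: fewest syllables covering letters[:i]
    back = [-1] * (n + 1)    # back[i]: start index of the last syllable
    best[0] = 0
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_LEN), end):
            if best[start] + 1 < best[end] and letters[start:end] in SYLLABLES:
                best[end] = best[start] + 1
                back[end] = start
    if best[n] == inf:
        return None
    # Follow the back-pointers from the end of the string to recover the split.
    pieces = []
    pos = n
    while pos > 0:
        pieces.append(letters[back[pos]:pos])
        pos = back[pos]
    return pieces[::-1]
```

With this toy inventory, `segment("xian")` prefers the single syllable `xian` over `xi` + `an`. A real input method would more likely score the alternatives with the language model than by syllable count alone.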

Page 247

An Enhanced RASTA Processing for Speaker Identification

Authors: Bin ZHEN, Xihong WU, Zhimin LIU, Huisheng CHI
Affiliation: Center for Information Science, Peking University, Beijing


In this paper, we propose an Enhanced RASTA (E_RASTA) technique for speaker identification. The new method consists of classical RASTA filtering in the logarithmic spectrum domain followed by a second RASTA processing step in the spectrum domain. In this manner, both channel distortion and additive noise are removed effectively. In an isolated-digit speaker identification experiment on the TI46 database, E_RASTA performed as well as or better than the J_RASTA method. Unlike J_RASTA, the new method requires neither an estimate of the speech SNR to determine the optimal value of J, nor multiple templates, nor information about how the speech is degraded.
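
For orientation, classical RASTA band-pass filters the time trajectory of each frequency band in the log-spectral domain, so a fixed channel, which appears there as a constant offset, is suppressed. The sketch below applies a causal form of the commonly cited RASTA transfer function to one band. The coefficient values (including the pole at 0.98) are assumed from the general RASTA literature rather than taken from this paper, and the second, spectrum-domain stage of E_RASTA is not shown because the abstract does not specify it.

```python
from typing import List


def rasta_filter(trajectory: List[float], pole: float = 0.98) -> List[float]:
    """Filter one band's log-spectral trajectory with
    H(z) = 0.1 * (2 + z^-1 - z^-3 - 2 z^-4) / (1 - pole * z^-1)."""
    numerator = [0.2, 0.1, 0.0, -0.1, -0.2]  # sums to zero: no DC gain
    output = []
    previous = 0.0
    for n in range(len(trajectory)):
        acc = 0.0
        for k, b in enumerate(numerator):
            if n - k >= 0:
                acc += b * trajectory[n - k]
        previous = acc + pole * previous  # single-pole recursive part
        output.append(previous)
    return output
```

Because the numerator coefficients sum to zero, a constant offset only produces a start-up transient that decays at the rate set by the pole.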

Page 251

Coarticulation and Application of Lateral in Standard Chinese in Speaker Identification

Authors: Cuiling ZHANG, Xiaoli LIU, Tiejun TAN, Jingxu CUI
Affiliation: Department of Criminal Technology, Chinese Criminal Police College, Shenyang


The lateral is one of the four voiced consonants in Standard Chinese, and its pronunciation often shows many variants depending on the following vowel. Its formant frequency distribution changes greatly with different vowels and exhibits strong coarticulation. It is suggested that this coarticulation differs from person to person, whereas for a given speaker the lateral is relatively stable and its formant frequency values remain relatively constant. Therefore, individual features of the coarticulation may be expected in a speaker’s voice. The aim of this article is to study the coarticulation of the lateral with different vowels, its behavior across speakers, and its application in speaker identification.

Page 255

An Interlingua for Dialogue Translation

Authors: Hua WU, Taiyi HUANG, Bo XU
Affiliation: National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing


An interchange format (IF) suitable for spoken language translation is introduced in this paper. It is a semantic representation of languages and is used as a kind of interlingua among different languages. Its most obvious characteristics are its independence from the peculiarities of any language and its underspecification. The whole semantic representation has up to four components: speaker tag, speech act, topic and arguments. The development of the interchange format is guided by the corpus of our hotel reservation domain, and the IF has been applied to two languages: Chinese and English. This paper also discusses the role of the interchange format in our spoken language translation system.
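
The four components named above could be held in a small record such as the following sketch. The class name, field names, example values, and the choice of which components are optional are all assumptions for illustration; the abstract does not describe the concrete IF notation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InterchangeFormat:
    """Hypothetical container for the four IF components."""
    speaker: str                      # speaker tag
    speech_act: str                   # speech act
    topic: str = ""                   # assumed optional
    arguments: Dict[str, str] = field(default_factory=dict)  # assumed optional

    def present_components(self) -> List[str]:
        """Names of the components that are actually filled in."""
        names = ["speaker", "speech_act"]
        if self.topic:
            names.append("topic")
        if self.arguments:
            names.append("arguments")
        return names


# Made-up hotel-reservation example; the values are not from the paper.
example = InterchangeFormat(
    speaker="client",
    speech_act="request",
    topic="room",
    arguments={"room_type": "single"},
)
```

Leaving fields empty is one simple way to model underspecification: a greeting, for instance, may carry only a speaker tag and a speech act.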

Page 259

On Use of GMM for Multilingual Speaker Verification: An Empirical Study

Authors: Xi-Ke QING, Ke CHEN
Affiliation: National Laboratory of Machine Perception and
The Center for Information Science, Peking University, Beijing


This paper presents an empirical study on multilingual speaker verification based on a sophisticated statistical model, the Gaussian Mixture Model (GMM). The languages used include Mandarin, Cantonese, and English. Comparative results of speaker verification are presented in terms of different databases associated with different languages. Our simulation results indicate that GMM can be used as a unified model in multilingual speaker verification, which provides an easy-to-use way for building a multilingual speaker verification system.
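
A minimal sketch of how a GMM can score a verification trial is given below: each frame is scored under the claimed speaker's model, and the average log-likelihood is compared with that of a reference model. The diagonal covariances, the background model, the threshold, and all function names are assumptions for illustration; the abstract does not state the decision rule used in the paper.

```python
import math
from typing import List, Sequence, Tuple

# One mixture component: (weight, mean vector, diagonal variance vector).
Component = Tuple[float, Sequence[float], Sequence[float]]


def log_gauss_diag(x: Sequence[float], mean: Sequence[float],
                   var: Sequence[float]) -> float:
    """Log density of a Gaussian with diagonal covariance."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def gmm_log_likelihood(x: Sequence[float], gmm: List[Component]) -> float:
    """Log of the weighted sum of component densities (log-sum-exp)."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def verify(frames: List[Sequence[float]], speaker_gmm: List[Component],
           background_gmm: List[Component], threshold: float = 0.0) -> bool:
    """Accept if the average per-frame log-likelihood ratio exceeds threshold."""
    score = sum(gmm_log_likelihood(x, speaker_gmm)
                - gmm_log_likelihood(x, background_gmm)
                for x in frames) / len(frames)
    return score > threshold
```

Nothing in this scoring depends on the language being spoken, which is consistent with the abstract's point that one model type can serve several languages.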

Page 263

Comparison of Several Smoothing Methods in Statistical Language Model

Authors: Yang LIU, Jiasong SUN, Zuoying WANG
Affiliation: Department of Electronic Engineering, Tsinghua University, Beijing


With the development of computer technology and the availability of huge training text corpora, the performance of language models has improved considerably in recent years. However, the intrinsic sparse data problem remains. This paper investigates several smoothing methods in the context of Chinese continuous speech recognition. We compare the performance of the different methods, particularly for pruned language models, and conclude that the Kneser-Ney strategy is better for models without pruning, while its performance degrades for pruned language models.
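
To make the Kneser-Ney idea concrete, here is a sketch of an interpolated Kneser-Ney bigram estimate with a single fixed discount. The model order, the interpolated (rather than back-off) form, the discount of 0.75, and the function name are assumptions; the abstract does not say which variant the paper evaluates.

```python
from collections import Counter, defaultdict
from typing import Callable, List


def train_kn_bigram(sentences: List[List[str]],
                    discount: float = 0.75) -> Callable[[str, str], float]:
    """Return prob(w, v), an interpolated Kneser-Ney estimate of P(w | v)."""
    bigrams = Counter()
    for sent in sentences:
        bigrams.update(zip(sent, sent[1:]))

    history_count = Counter()        # c(v): how often v occurs as a history
    followers = defaultdict(set)     # distinct words seen after v
    predecessors = defaultdict(set)  # distinct words seen before w
    for (v, w), c in bigrams.items():
        history_count[v] += c
        followers[v].add(w)
        predecessors[w].add(v)
    n_bigram_types = len(bigrams)

    def prob(w: str, v: str) -> float:
        # Continuation probability: in how many distinct contexts w appears.
        p_cont = len(predecessors.get(w, ())) / n_bigram_types
        total = history_count.get(v, 0)
        if total == 0:
            return p_cont  # unseen history: fall back to the lower order
        backoff_weight = discount * len(followers[v]) / total
        discounted = max(bigrams.get((v, w), 0) - discount, 0.0) / total
        return discounted + backoff_weight * p_cont

    return prob
```

The lower-order distribution depends on how many distinct bigram types survive in the model, which may be one reason pruning interacts badly with this method.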

Page 267

Statistical Approach to Chinese-English Spoken-language Translation in Hotel Reservation Domain

Authors: Wei CHENG, Bo XU
Affiliation: National Laboratory of Pattern Recognition,
Institute of Automation, Chinese Academy of Sciences, Beijing


This paper investigates a preliminary Chinese-to-English translation system based on the statistical approach and tests its performance on a limited-domain spoken-language task: hotel reservation. A bilingual corpus is available for the task, which exhibits some typical phenomena of spontaneous speech. The experiments are performed on both the text transcriptions and the speech recognizer output. The word error rate is about 14%. Our analyses indicate great potential for improving the translation quality. The results and analyses show broad prospects for the statistical approach to spoken-language translation.
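
Word error rate is conventionally the word-level edit distance between a reference and a hypothesis, divided by the reference length. The sketch below computes it in that standard way; the function name and the example sentences are illustrative, and the abstract does not state exactly how the 14% figure was obtained.

```python
from typing import Sequence


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """(substitutions + deletions + insertions) / number of reference words."""
    if not reference:
        raise ValueError("reference must not be empty")
    # prev[j]: edit distance between the reference prefix so far and hypothesis[:j]
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)
```

For example, scoring the hypothesis "i want single rooms" against the reference "i want a single room" involves one deletion and one substitution over five reference words.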

Page 271

Unsupervised Word Induction Using MDL Criterion

Authors: Hua YU
Affiliation: Interactive Systems Lab, Carnegie Mellon University, Pittsburgh


Unsupervised learning of units (phonemes, words, phrases, etc.) is important to the design of statistical speech and NLP systems. This paper presents a general source-coding framework for inducing words from natural language text without word boundaries. An efficient search algorithm is developed to optimize the minimum description length (MDL) induction criterion. Despite some seemingly over-simplified modeling assumptions, we achieved good results on several word induction problems.

Page 275

Block Analysis of Bilingual Corpus for Chinese-English Statistical Machine Translation

Authors: Hairong XIA, Bo XU, Taiyi HUANG
Affiliation: National Laboratory of Pattern Recognition
Institute of Automation, Chinese Academy of Sciences


In this paper, we describe a bilingual corpus processing strategy, block analysis, from a new point of view. With this analysis strategy, we aim to extract more information from a bilingual corpus for future statistical machine translation. First, we define some block types and give some statistics from a Chinese-English bilingual corpus under this framework. Then a block-based alignment algorithm is presented, by which the corresponding bilingual blocks can be extracted and aligned automatically. Experimental results show that block analysis is practical and more informative than word-based approaches.

Page 279

Semi-class-based N-gram Language Modeling for Chinese Dictation

Authors: Min ZHANG, Engsiong CHNG, Haizhou LI
Affiliation: Lernout & Hauspie Asia Pacific, 29 International
Business Park, #08-05 Acer Tower B, Singapore


In this paper, we propose a novel semi-class-based n-gram language modeling approach. The proposed model estimates the n-gram probability from the observed frequencies of word-class n-tuples, each consisting of the classes of the (n-1) preceding words of the utterance and the current word itself. Three kinds of language models (word-based, class-based and semi-class-based n-gram models) are implemented to build bi-gram and tri-gram models for a vocabulary of 50k words over a corpus of more than 200 million Chinese words. The numbers of parameters and the LM perplexities of the three models are studied and compared. Our experiments show that the proposed semi-class language modeling offers a good tradeoff between the number of parameters and LM perplexity.
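
For the bigram case, the description above amounts to conditioning the current word on the class of the preceding word. The sketch below estimates that probability by plain relative frequency; the class map, the absence of smoothing, and the function name are assumptions for illustration, not details from the paper.

```python
from collections import Counter
from typing import Callable, Dict, List


def train_semi_class_bigram(sentences: List[List[str]],
                            word_class: Dict[str, str]) -> Callable[[str, str], float]:
    """Return prob(curr, prev) = count(class(prev), curr) / count(class(prev))."""
    pair_count = Counter()   # (class of previous word, current word)
    class_count = Counter()  # class of previous word, counted as a history
    for sent in sentences:
        for prev, curr in zip(sent, sent[1:]):
            cls = word_class[prev]
            pair_count[(cls, curr)] += 1
            class_count[cls] += 1

    def prob(curr: str, prev: str) -> float:
        cls = word_class[prev]
        total = class_count.get(cls, 0)
        return pair_count.get((cls, curr), 0) / total if total else 0.0

    return prob
```

Since the history is a class rather than a word, the table has at most (number of classes) x (vocabulary size) entries for bigrams, while the predicted unit remains the word itself.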

Page 283

Lexicon Optimization for Chinese Language Modeling

Authors: Jun ZHAO, Jianfeng GAO, Eric CHANG, Mingjing LI
Affiliation: University of Science & Technology of China
Microsoft Research China


In this paper, we present an approach to lexicon optimization for Chinese language modeling. The method is an iterative procedure consisting of two phases, namely lexicon generation and lexicon pruning. In the first phase, we extract appropriate new words from a very large training corpus using statistical approaches. In the second phase, we prune the lexicon to a pre-set memory limitation using a perplexity minimization criterion. Experimental results show up to a 6% character perplexity reduction compared to the baseline lexicon.

Page 287

Rule-based Post-Processing of Pinyin To Chinese Characters Conversion System

Authors: Yan ZHANG, Bo XU, Chengqing ZONG
Affiliation: National Laboratory of Pattern Recognition,
Institute of Automation, Chinese Academy of Sciences, Beijing


Statistical methods work well for pinyin-to-Chinese-character conversion and achieve a good conversion rate. However, several percent of the words still cannot be converted correctly with such methods. This paper presents an error correction approach based on grammatical and semantic rules. Using the conversion results and neighboring information obtained from statistical pinyin-to-character conversion, we build a knowledge base consisting of phrase rules, syntactic rules and semantic rules. By analyzing the syntactic structure of sentences, we check semantic correctness at certain local part-of-speech nodes. In preliminary experiments, the method is used for error correction as a post-processing step under the assumption that error points are localized. The experiments show that the rule-based method improves the correct conversion rate.

Page 291

Chinese Pinyin Input Method For Mobile Phone

Authors: Feng ZHANG, Zheng CHEN, Guozhong DAI, Mingjing LI
Affiliation: IC, CAS, Beijing


Chinese input is one of the most difficult problems in Chinese language processing, and entering Chinese words effectively on a mobile phone is an even bigger challenge. In this paper, we propose a new Chinese pinyin input method for mobile phones. The method uses a compact statistical bigram-based language model. In addition, to meet the special requirements of Chinese pinyin input on mobile phones, we introduce some new features for the search engine and user interface of our system.

Page 295