Sixth European Conference on Speech Communication and Technology
(EUROSPEECH'99)

Budapest, Hungary
September 5-9, 1999

A Monolingual Semantic Decoder Based on Word Sense Disambiguation for Mixed Language Understanding

Xiaohu Liu, Pascale Fung, Chi Shun Cheung

Human Language Technology Center, Department of Electronic and Electrical Engineering, Hong Kong University of Science and Technology (HKUST), Clear Water Bay, Hong Kong

In this paper, a new method for spoken mixed language understanding is presented. By mixed language, we mean that the words in one sentence may come from different languages: a primary language and a secondary language. In conventional statistical semantic decoders, the conceptual structure is represented as a hidden Markov model, and the conceptual content of a sentence is decoded with the Viterbi algorithm. To handle mixed language, an unsupervised word sense disambiguation module is proposed to convert the secondary language words into the primary language. The approach is evaluated in the ATIS domain, where the primary language is English and the secondary language is assumed to be Chinese. The average accuracy of our extended semantic decoder is 26% higher than that of the baseline semantic decoder. The extended semantic decoder has two advantages: (1) it can handle mixed language input, and (2) it needs neither secondary language training data nor mixed language training data. The approach can be used for any primary-secondary language pair.
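
As an illustration of the conventional decoding step mentioned above, the following is a minimal sketch of Viterbi decoding over a hidden Markov model whose states are concept labels. It is not the authors' implementation. The concept labels (FILLER, ORIGIN, DEST), the toy probabilities, the unknown-word floor `UNK_LOGP`, and the helper names `logs` and `viterbi` are all assumptions made up for this example.

```python
import math

# Hypothetical floor log-probability for words a state never emits.
UNK_LOGP = math.log(1e-6)

def logs(probs):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in probs.items()}

def viterbi(words, states, log_start, log_trans, log_emit):
    """Return the most likely concept-label sequence for `words`."""
    # best[t][s]: best log-probability of any path ending in state s at word t
    best = [{s: log_start[s] + log_emit[s].get(words[0], UNK_LOGP) for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(words)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log_emit[s].get(words[t], UNK_LOGP)
            back[t][s] = prev
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(words) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

# Toy model (all numbers invented for illustration only).
states = ["FILLER", "ORIGIN", "DEST"]
log_start = logs({"FILLER": 0.8, "ORIGIN": 0.1, "DEST": 0.1})
log_trans = {
    "FILLER": logs({"FILLER": 0.4, "ORIGIN": 0.3, "DEST": 0.3}),
    "ORIGIN": logs({"FILLER": 0.6, "ORIGIN": 0.3, "DEST": 0.1}),
    "DEST": logs({"FILLER": 0.6, "ORIGIN": 0.1, "DEST": 0.3}),
}
log_emit = {
    "FILLER": logs({"flights": 0.5, "show": 0.3, "me": 0.2}),
    "ORIGIN": logs({"from": 0.5, "boston": 0.3, "denver": 0.2}),
    "DEST": logs({"to": 0.5, "boston": 0.2, "denver": 0.3}),
}

decoded = viterbi("flights from boston to denver".split(),
                  states, log_start, log_trans, log_emit)
print(decoded)
```

In a mixed language setting as described in the abstract, the secondary language words would first be converted into primary language words, so that a monolingual decoder of this kind could be applied to the converted sentence.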



Bibliographic reference.  Liu, Xiaohu / Fung, Pascale / Cheung, Chi Shun (1999): "A monolingual semantic decoder based on word sense disambiguation for mixed language understanding", In EUROSPEECH'99, 2011-2014.