8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Modeling Cross-Morpheme Pronunciation Variations for Korean Large Vocabulary Continuous Speech Recognition

Kyong-Nim Lee, Minhwa Chung

Sogang University, Korea

In this paper, we describe a cross-morpheme pronunciation variation model that is especially useful for constructing a morpheme-based pronunciation lexicon for Korean LVCSR. Many pronunciation variations occur at morpheme boundaries in continuous speech. Because phonemic context, together with morphological category and morpheme boundary information, affects Korean pronunciation variation, we distinguish pronunciation variation rules according to where they apply: within a morpheme, across a morpheme boundary in a compound noun, across a morpheme boundary in an eojeol, and across an eojeol boundary. In a 33K-morpheme Korean CSR experiment, modeling cross-morpheme pronunciation variations with a context-dependent multiple pronunciation lexicon yields an absolute improvement of 1.16% in WER over the baseline performance of 23.17% WER.
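As a quick check on the reported figures, the short sketch below works out what the absolute improvement implies. Only the two input numbers come from the abstract; the resulting WER and the relative reduction are derived here and are not stated by the authors, and the variable names are illustrative.

```python
# Figures quoted in the abstract (percent WER).
baseline_wer = 23.17   # baseline word error rate
absolute_gain = 1.16   # absolute improvement, in percentage points

# Derived values (not stated in the abstract).
improved_wer = baseline_wer - absolute_gain
relative_reduction = 100.0 * absolute_gain / baseline_wer

print(f"WER after modeling: {improved_wer:.2f}%")              # 22.01%
print(f"Relative WER reduction: {relative_reduction:.1f}%")    # 5.0%
```

Under these numbers, the 1.16-point absolute gain corresponds to a WER of about 22.01%, or roughly a 5% relative reduction from the baseline.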


Bibliographic reference. Lee, Kyong-Nim / Chung, Minhwa (2003): "Modeling cross-morpheme pronunciation variations for Korean large vocabulary continuous speech recognition", In EUROSPEECH-2003, 261-264.