ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition

April 13-16, 2003
Tokyo Institute of Technology, Tokyo, Japan

Variants Modeling in Korean Spontaneous Speech Recognition

Kyong-Nim Lee, Minhwa Chung

Dept. of Computer Science, Sogang University, Seoul, Korea

Pronunciations in spontaneous speech tend to be more variable than in planned speech. Spontaneous speech exhibits many sources of variation in addition to severe phonological variation, which makes recognition extremely difficult. In this paper, we analyzed auditory transcriptions of dialogues for spontaneous speech recognition and classified the characteristics of conversational speech. To deal with these characteristics, we first added a special garbage model, a silence model, and a filled pause model to improve the acoustic model; second, we optimized the set of multiple alternative pronunciations using a pruning method. Finally, to reflect phonological variation more flexibly, we enhanced the pronunciation lexicon by adding alternative pronunciations based on frequently used phonological variants. Experimental results showed that modeling garbage, silence, and filled pauses reduced the word error rate by a relative 4.9%, while pruning the lexicon and adding the alternative pronunciations reduced the word error rate by a relative 0.8%.
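
As a rough illustration of what lexicon pruning and a relative error reduction can look like, the following sketch may help. It is not the authors' method: the abstract does not state the pruning criterion, so the relative-frequency threshold, the cap on variants per word, the function names, and the placeholder words and phone strings are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_pruned_lexicon(observations, min_rel_freq=0.1, max_variants=3):
    """Build a multi-pronunciation lexicon and prune rare variants.

    observations: iterable of (word, pronunciation) pairs, e.g. taken from
    transcriptions. min_rel_freq and max_variants are hypothetical pruning
    parameters, not values reported in the paper.
    """
    counts = defaultdict(Counter)
    for word, pron in observations:
        counts[word][pron] += 1

    lexicon = {}
    for word, pron_counts in counts.items():
        total = sum(pron_counts.values())
        ranked = pron_counts.most_common()  # most frequent variants first
        kept = [p for p, c in ranked if c / total >= min_rel_freq]
        kept = kept[:max_variants]
        if not kept:
            # Always keep at least the most frequent pronunciation.
            kept = [ranked[0][0]]
        lexicon[word] = kept
    return lexicon


def relative_wer_reduction(baseline_wer, new_wer):
    """Relative reduction: (baseline - new) / baseline."""
    return (baseline_wer - new_wer) / baseline_wer


# Placeholder data: word "A" observed with three pronunciation variants.
observations = [("A", "x y")] * 8 + [("A", "x z")] * 3 + [("A", "x")]
lexicon = build_pruned_lexicon(observations)
print(lexicon)
```

Under these assumptions, the variant observed only once out of twelve times falls below the 10% threshold and is dropped, while the two more frequent variants remain in the lexicon. A relative reduction such as the 4.9% quoted above is measured against the baseline word error rate, not in absolute percentage points.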

Bibliographic reference.  Lee, Kyong-Nim / Chung, Minhwa (2003): "Variants modeling in Korean spontaneous speech recognition", in SSPR-2003, paper MAP5.