Second International Conference on Spoken Language Processing (ICSLP'92)

Banff, Alberta, Canada
October 13-16, 1992

Detection of Unknown Words and Automatic Estimation of Their Transcriptions in Continuous Speech Recognition

Itou Katunobu (1), Hayamizu Satoru (2), Tanaka Hozumi (1)

(1) Tokyo Institute of Technology, Meguro-ku, Tokyo, Japan
(2) Electrotechnical Laboratory, Tsukuba-shi, Ibaraki, Japan

Current continuous speech recognition systems are designed to recognize only words within their vocabulary. To make speech recognition systems more flexible, convenient, and robust, they should be able to process unknown words. This paper introduces a new technique for processing unknown words in a continuous speech recognition system. Two types of processing are dynamically controlled to detect and transcribe unknown words automatically: one uses stochastic language models without any other linguistic knowledge, and the other uses a dictionary and a grammar. We tested the method in speaker-independent continuous speech recognition experiments on a task with a 113-word vocabulary and a bunsetu perplexity of 8.2. Preliminary results showed a detection rate of 75% for the unknown words, with a false alarm rate of 11%, and a phone recognition rate of 51% for the unknown words that were detected.

