We propose a novel framework to detect and recognize out-of-vocabulary (OOV) words in automated speech recognition (ASR). In the proposed framework, a hybrid language model combining words and sub-word units is incorporated during ASR decoding, and three different OOV word recognition methods are then applied to generate OOV word hypotheses: dictionary lookup, morphological composition, and direct phoneme-to-grapheme conversion. The proposed approach reduced word error rate (WER) by 1.9% and 1.6% for ASR systems with recognition vocabularies of 30K and 219K words, respectively. Moreover, it correctly recognized 5% of OOV words.
Bibliographic reference. Bach, Nguyen / Noamany, Mohamed / Lane, Ian / Schultz, Tanja (2007): "Handling OOV words in Arabic ASR via flexible morphological constraints", In INTERSPEECH-2007, 2373-2376.