SAPA-SCALE Conference 2012

Portland, OR, USA
September 7-8, 2012

Hierarchical Hybrid Language Models for Open Vocabulary Continuous Speech Recognition using WFST

M. Ali Basha Shaik, David Rybach, Stefan Hahn, Ralf Schlüter, Hermann Ney

Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, 52056 Aachen, Germany

One of the main challenges in automatic speech recognition is recognizing an open, partly unseen vocabulary. To implicitly reduce the out-of-vocabulary (OOV) rate, hybrid vocabularies consisting of full words and sub-words are used. Nevertheless, when using sub-words, OOV rates are not necessarily zero. In this work, we propose the use of separate character-level graphones (pairs of an orthography and a phoneme sequence) as sub-words to effectively obtain a zero OOV rate. To minimize negative effects on the core vocabulary of the most frequent words, a hierarchical language modeling approach is proposed. We augment the first-level hybrid language model with an OOV word class, which is replaced during search by character-level graphone sequences using a second-level graphone-based character language model and acoustic model. This approach is realized on-the-fly using weighted finite state transducers. On the Wall Street Journal corpus, we recognize a significant fraction of OOVs compared to the full-word and former hybrid language model based approaches.
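The two-level idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' WFST implementation: it only shows how a word outside the first-level vocabulary could be scored as the OOV class probability times a character-level probability, so that every spellable word receives a non-zero score. The toy probabilities, the uniform character model, and the names `FIRST_LEVEL`, `char_level_logprob`, and `word_logprob` are assumptions made for illustration; real graphones would also carry phoneme sequences, which are omitted here.

```python
import math

# Hypothetical first-level unigram LM: in-vocabulary units plus an <OOV> class.
FIRST_LEVEL = {"the": 0.5, "market": 0.3, "<OOV>": 0.2}

# Hypothetical second-level model: uniform over letters plus an end-of-word symbol.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"
CHAR_PROB = 1.0 / (len(ALPHABET) + 1)


def char_level_logprob(word):
    """Log probability of spelling `word` character by character."""
    for ch in word:
        if ch not in ALPHABET:
            raise ValueError(f"character {ch!r} is outside the toy alphabet")
    # One factor per character plus one for the end-of-word symbol.
    return (len(word) + 1) * math.log(CHAR_PROB)


def word_logprob(word):
    """Score a word with the first level, falling back to the second level."""
    if word in FIRST_LEVEL and word != "<OOV>":
        return math.log(FIRST_LEVEL[word])
    # OOV word: class probability times the character-level probability.
    return math.log(FIRST_LEVEL["<OOV>"]) + char_level_logprob(word)
```

In this sketch, `word_logprob("the")` uses only the first level, while a word such as `"aachen"` is routed through the `<OOV>` class and the character model. In-vocabulary words are unaffected by the second level, which mirrors the stated goal of protecting the core vocabulary.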

Index Terms: open vocabulary, OOV, language model, filler models


Bibliographic reference.  Basha Shaik, M. Ali / Rybach, David / Hahn, Stefan / Schlüter, Ralf / Ney, Hermann (2012): "Hierarchical hybrid language models for open vocabulary continuous speech recognition using WFST", In SAPA-SCALE-2012, 46-51.