14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Written-Domain Language Modeling for Automatic Speech Recognition

Haşim Sak, Yun-hsuan Sung, Françoise Beaufays, Cyril Allauzen

Google, USA

Language modeling for automatic speech recognition (ASR) systems has traditionally been done in the verbal domain. In this paper, we present finite-state modeling techniques that we developed for language modeling in the written domain. The first technique verbalizes written-domain vocabulary items, which include lexical and non-lexical entities. The second is a decomposition-recomposition approach that addresses the out-of-vocabulary (OOV) and data sparsity problems posed by non-lexical entities such as URLs, email addresses, phone numbers, and dollar amounts. We evaluate the proposed written-domain language modeling approaches on a very large vocabulary speech recognition system for English. We show that written-domain language modeling improves both speech recognition accuracy and ASR transcript rendering accuracy in the written domain over a baseline system using a verbal-domain language model. In addition, the written-domain system is much simpler, since it does not require the complex and error-prone text normalization and denormalization rules that verbal-domain language modeling generally requires.


Bibliographic reference. Sak, Haşim / Sung, Yun-hsuan / Beaufays, Françoise / Allauzen, Cyril (2013): "Written-domain language modeling for automatic speech recognition", in INTERSPEECH-2013, 675-679.