This paper describes a language model combination method for automatic speech recognition (ASR) systems based on Weighted Finite-State Transducers (WFSTs). ASR performance in real applications often degrades when an input utterance falls outside the domain of the prepared language models. Combining multiple language models is one way to cover a wide range of domains. We propose a two-step combination method: it first applies a union operation to incorporate all component models into a single transducer, and then merges states of that transducer so that n-grams shared by multiple models are mixed while n-grams unique to each model are retained. The method was evaluated in speech recognition experiments on travel conversation tasks and improved recognition performance.
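The two-step idea in the abstract (union, then state merging) can be illustrated with a toy sketch. This is not the authors' implementation: it uses plain dictionaries rather than real WFSTs, ignores back-off arcs, and assumes that tied states mix shared n-grams by a linear weighted sum renormalized over the models that contain the history. The function names `union_models` and `tie_states`, the example models, and the mixture weights are all hypothetical.

```python
from collections import defaultdict

def union_models(models):
    """Step 1 (union): place every component's states side by side.

    Each state is keyed by (model name, n-gram history), so nothing is
    merged yet. Assumption: an LM is a dict {history: {word: prob}}.
    """
    return {(name, history): dict(arcs)
            for name, lm in models.items()
            for history, arcs in lm.items()}

def tie_states(union, weights):
    """Step 2 (tying): merge states that share the same history.

    Words predicted by several models get a weighted mix of their
    probabilities; words predicted by only one model are retained.
    The mixing rule here is an assumption made for illustration.
    """
    groups = defaultdict(list)
    for (name, history), arcs in union.items():
        groups[history].append((name, arcs))

    tied = {}
    for history, members in groups.items():
        # Renormalize the weights over the models that have this history.
        total = sum(weights[name] for name, _ in members)
        mixed = defaultdict(float)
        for name, arcs in members:
            for word, prob in arcs.items():
                mixed[word] += weights[name] / total * prob
        tied[history] = dict(mixed)
    return tied

# Hypothetical two-domain example with bigram histories.
travel = {("to",): {"tokyo": 0.5, "kyoto": 0.5}}
dining = {("to",): {"tokyo": 0.2, "eat": 0.8}, ("the",): {"menu": 1.0}}
weights = {"travel": 0.5, "dining": 0.5}

union = union_models({"travel": travel, "dining": dining})
tied = tie_states(union, weights)
```

In this sketch the history "to" appears in both models, so "tokyo" receives a mixed probability while "kyoto" and "eat" survive from their respective models; the history "the" appears only in the dining model and is kept unchanged.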
Index Terms: Language model combination, WFST
Bibliographic reference. Yamamoto, Hitoshi / Dixon, Paul R. / Matsuda, Shigeki / Hori, Chiori / Kashioka, Hideki (2012): "Tied-state mixture language model for WFST-based speech recognition", In INTERSPEECH-2012, 174-177.