11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Discriminative Language Modeling Using Simulated ASR Errors

Preethi Jyothi, Eric Fosler-Lussier

Department of Computer Science and Engineering, Ohio State University, USA

In this paper, we approach the problem of discriminatively training language models using a weighted finite-state transducer (WFST) framework that does not require acoustic training data. The phonetic confusions prevalent in the recognizer are modeled using a confusion matrix that combines information from the pronunciation model (word-based phone confusion log likelihoods) with information from the acoustic model (distances between the phonetic acoustic models). Within the WFST framework, this confusion matrix is used to generate confusable word graphs, which serve as inputs to the averaged perceptron algorithm that trains the parameters of the discriminative language model. Experiments on a large-vocabulary speech recognition task show significant word error rate reductions compared to a baseline trigram model trained with the maximum likelihood criterion.
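As an illustration of the training step mentioned above, the following is a minimal sketch of an averaged perceptron over n-gram features. It is not the authors' implementation. It assumes that each training example is a reference word sequence paired with a small list of confusable hypotheses (a stand-in for the confusable word graphs), that features are plain unigram and bigram counts, and that no baseline recognizer score is used. The function names `ngram_features`, `score`, and `train_averaged_perceptron` are hypothetical.

```python
from collections import Counter, defaultdict


def ngram_features(words, max_n=2):
    """Count unigrams and bigrams (assumed feature set) with sentence padding."""
    feats = Counter()
    padded = ["<s>"] + list(words) + ["</s>"]
    for n in range(1, max_n + 1):
        for i in range(len(padded) - n + 1):
            feats[tuple(padded[i:i + n])] += 1
    return feats


def score(weights, feats):
    """Linear model score: dot product of weights and feature counts."""
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())


def train_averaged_perceptron(examples, epochs=5):
    """examples: list of (reference, candidates); candidates is a list of word lists.

    Returns the average of the weight vector over all training steps.
    """
    weights = defaultdict(float)
    totals = defaultdict(float)
    steps = 0
    for _ in range(epochs):
        for reference, candidates in examples:
            # Pick the hypothesis the current model prefers.
            best = max(candidates, key=lambda c: score(weights, ngram_features(c)))
            if list(best) != list(reference):
                # Move weights toward the reference and away from the error.
                for f, v in ngram_features(reference).items():
                    weights[f] += v
                for f, v in ngram_features(best).items():
                    weights[f] -= v
            steps += 1
            # Accumulate the running sum used for averaging.
            for f, w in weights.items():
                totals[f] += w
    return {f: t / steps for f, t in totals.items()}


if __name__ == "__main__":
    reference = ["recognize", "speech"]
    confusable = ["wreck", "a", "nice", "beach"]
    model = train_averaged_perceptron([(reference, [confusable, reference])])
    print(score(model, ngram_features(reference)) > score(model, ngram_features(confusable)))
```

In this toy setting the simulated error is hand-written; in the approach described in the abstract, the competing hypotheses would instead come from the confusion matrix applied within the WFST framework.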


Bibliographic reference.  Jyothi, Preethi / Fosler-Lussier, Eric (2010): "Discriminative language modeling using simulated ASR errors", In INTERSPEECH-2010, 1049-1052.