INTERSPEECH 2004 - ICSLP
This paper describes experiments with a language model based on probabilistic latent semantic analysis (PLSA) for conversational telephone speech. The model uses a long-range history and exploits topic information in the test text to adjust the probabilities of test words. Relative to a traditional word+class-based 4-gram, the PLSA-based model lowered test-set perplexity by 13% (an optimistic estimate, using the reference transcript as history) or by 6% (a realistic estimate, using the recognised transcript as history). The paper also introduces three techniques: using confidence scores to weight words in the history, weighting the prior topic distribution, and calculating perplexity in a way that accounts for recognition errors in the model context.
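As an illustration of the general idea only, the following minimal sketch shows how a PLSA-style model might turn a long-range history into adapted word probabilities. It is not the authors' implementation. The function names, the EM procedure with fixed topic-conditional word distributions, the use of confidence scores as fractional counts, and the treatment of the prior weight as a pseudo-count are all assumptions made for this example; the abstract does not specify these details, nor how the PLSA component is combined with the 4-gram.

```python
import numpy as np

def topic_mixture(history_ids, confidences, p_w_given_z, prior,
                  prior_weight=1.0, n_iter=20):
    """Estimate P(z | history) by EM, keeping P(w | z) fixed.

    history_ids  : indices of the words seen so far (the long-range history)
    confidences  : per-word weights in [0, 1] (hypothetical confidence scores)
    p_w_given_z  : array of shape (V, Z); each column sums to 1
    prior        : prior topic distribution, shape (Z,)
    prior_weight : pseudo-count given to the prior (assumed form, must be > 0)
    """
    history_ids = np.asarray(history_ids, dtype=int)
    confidences = np.asarray(confidences, dtype=float)
    theta = np.array(prior, dtype=float)
    for _ in range(n_iter):
        # E-step: topic posterior for every history word, shape (N, Z)
        joint = p_w_given_z[history_ids] * theta
        post = joint / joint.sum(axis=1, keepdims=True)
        # M-step: confidence-weighted counts plus prior pseudo-counts
        counts = (confidences[:, None] * post).sum(axis=0) + prior_weight * prior
        theta = counts / counts.sum()
    return theta

def plsa_word_probs(theta, p_w_given_z):
    """P(w | history) = sum over z of P(w | z) * P(z | history)."""
    return p_w_given_z @ theta

# Toy model: 4 words, 2 topics (values invented for illustration).
p_w_given_z = np.array([[0.4, 0.1],
                        [0.4, 0.1],
                        [0.1, 0.4],
                        [0.1, 0.4]])
prior = np.array([0.5, 0.5])

theta = topic_mixture([0, 1, 0], [1.0, 1.0, 1.0], p_w_given_z, prior)
probs = plsa_word_probs(theta, p_w_given_z)
```

In this toy setup the history contains only words favoured by the first topic, so the estimated mixture shifts toward that topic and raises the probability of its words. Setting a word's confidence to zero removes its influence on the mixture, which is one plausible way to limit the effect of recognition errors in the history.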
Bibliographic reference. Mrva, David / Woodland, Philip C. (2004): "A PLSA-based language model for conversational telephone speech", In INTERSPEECH-2004, 2257-2260.