11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Learning a Language Model from Continuous Speech

Graham Neubig, Masato Mimura, Shinsuke Mori, Tatsuya Kawahara

Kyoto University, Japan

This paper presents a new approach to language model construction, learning a language model not from text, but directly from continuous speech. A phoneme lattice is created using acoustic model scores, and Bayesian techniques are used to robustly learn a language model from this noisy input. A novel sampling technique is devised that allows for the integrated learning of word boundaries and an n-gram language model with no prior linguistic knowledge. The proposed techniques were used to learn a language model directly from continuous, potentially large-vocabulary speech. This language model significantly reduced the automatic speech recognition (ASR) phoneme error rate on a separate set of test data, and the proposed lattice processing and lexical acquisition techniques were found to be important factors in this improvement.

Bibliographic reference. Neubig, Graham / Mimura, Masato / Mori, Shinsuke / Kawahara, Tatsuya (2010): "Learning a language model from continuous speech", In INTERSPEECH-2010, 1053-1056.