ISCA Archive Interspeech 2005

Improving out-of-coverage language modelling in a multimodal dialogue system using small training sets

Louis ten Bosch

For automatic speech recognition, constructing an adequate language model can be difficult when only a limited amount of training text is available. Previous work has shown that, with small training sets, statistical language models may outperform grammars on out-of-coverage utterances while performing comparably on in-coverage input. In this paper, we compare the performance of an automatic speech recognition system using a grammar and one using a statistical language model that includes garbage models, in the case of very limited in-domain training data. The results show that a bigram language model and a grammar perform similarly, and that including garbage models in statistical language models improves their performance on both in-coverage and out-of-coverage utterances.


doi: 10.21437/Interspeech.2005-404

Cite as: ten Bosch, L. (2005) Improving out-of-coverage language modelling in a multimodal dialogue system using small training sets. Proc. Interspeech 2005, 905-908, doi: 10.21437/Interspeech.2005-404

@inproceedings{bosch05_interspeech,
  author={Louis ten Bosch},
  title={{Improving out-of-coverage language modelling in a multimodal dialogue system using small training sets}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={905--908},
  doi={10.21437/Interspeech.2005-404}
}