Language Model Data Augmentation for Keyword Spotting in Low-Resourced Training Conditions

Arseniy Gorin, Rasa Lileikytė, Guangpu Huang, Lori Lamel, Jean-Luc Gauvain, Antoine Laurent


This research extends our earlier work on using machine translation (MT) and word-based recurrent neural networks to augment language model training data for keyword search in conversational Cantonese speech. MT-based data augmentation is applied to two language pairs: English-Lithuanian and English-Amharic. Using filtered N-best MT hypotheses for language modeling is found to perform better than using only the 1-best translation. Target-language texts collected from the Web and filtered to select conversational-like data are used in several ways. In addition to using the Web data to train the language model of the speech recognizer, we investigate using this data to improve the language model and phrase table of the MT system, thereby obtaining better translations of the English data. Finally, generating text data with a character-based recurrent neural network is investigated. This approach can produce new word forms, providing a way to reduce the out-of-vocabulary rate and thereby improve keyword spotting performance. We study how these different methods of language model data augmentation affect speech-to-text and keyword spotting performance for Lithuanian and Amharic. The best results are obtained by combining all of the explored methods.
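Although the paper does not include code, the N-best filtering step can be illustrated with a short sketch: score every MT hypothesis with a small in-domain language model and keep only the hypotheses whose perplexity falls below a threshold. The bigram model with add-one smoothing, the function names, and the max_ppl value below are illustrative assumptions, not the authors' actual setup.

import math
from collections import Counter

def train_bigram_lm(sentences):
    # Count bigram and history occurrences over whitespace-tokenized
    # sentences, adding explicit sentence-boundary markers.
    histories, bigrams = Counter(), Counter()
    for sent in sentences:
        toks = ["<s>"] + sent.split() + ["</s>"]
        histories.update(toks[:-1])
        bigrams.update(zip(toks, toks[1:]))
    vocab = len(histories) + 1  # +1 reserves probability mass for unseen words
    return histories, bigrams, vocab

def perplexity(sentence, lm):
    # Per-token perplexity under the add-one-smoothed bigram model.
    histories, bigrams, vocab = lm
    toks = ["<s>"] + sentence.split() + ["</s>"]
    logp = 0.0
    for prev, cur in zip(toks, toks[1:]):
        p = (bigrams[(prev, cur)] + 1.0) / (histories[prev] + vocab)
        logp += math.log(p)
    return math.exp(-logp / (len(toks) - 1))

def filter_nbest(nbest_lists, lm, max_ppl=500.0):
    # Keep each distinct hypothesis whose perplexity under the
    # in-domain LM does not exceed the threshold.
    kept, seen = [], set()
    for hyps in nbest_lists:
        for hyp in hyps:
            if hyp not in seen and perplexity(hyp, lm) <= max_ppl:
                seen.add(hyp)
                kept.append(hyp)
    return kept

# Hypothetical usage: in_domain holds conversational transcripts,
# nbest holds N-best translations of the English data.
in_domain = ["hello how are you", "i am fine thanks"]
nbest = [["hello how are you today", "hello the how you are"]]
lm = train_bigram_lm(in_domain)
augmented_text = filter_nbest(nbest, lm, max_ppl=500.0)

In practice one would presumably score hypotheses with the recognizer's own n-gram language model and tune the threshold on held-out data; the point of filtering is to let more of each N-best list contribute training text without flooding the language model with poor translations.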


DOI: 10.21437/Interspeech.2016-1200

Cite as

Gorin, A., Lileikytė, R., Huang, G., Lamel, L., Gauvain, J.-L., Laurent, A. (2016) Language Model Data Augmentation for Keyword Spotting in Low-Resourced Training Conditions. Proc. Interspeech 2016, 775-779.

BibTeX
@inproceedings{Gorin+2016,
  author={Arseniy Gorin and Rasa Lileikytė and Guangpu Huang and Lori Lamel and Jean-Luc Gauvain and Antoine Laurent},
  title={Language Model Data Augmentation for Keyword Spotting in Low-Resourced Training Conditions},
  year=2016,
  booktitle={Interspeech 2016},
  doi={10.21437/Interspeech.2016-1200},
  url={http://dx.doi.org/10.21437/Interspeech.2016-1200},
  pages={775--779}
}