The speech interfaces currently used in many military applications may be adequate for native speakers. However, the recognition rate drops substantially for non-native speakers (people with foreign accents), mainly because non-native speakers exhibit large temporal and intra-phoneme variations when pronouncing the same words. The problem is further complicated by loud environmental noise, such as tank and helicopter noise. In this paper, we propose a novel speech feature adaptation algorithm for continuous accent and environmental adaptation. This feature-based adaptation method is then integrated with the conventional model-based maximum likelihood linear regression (MLLR) algorithm. Extensive experiments were performed on the NATO non-native speech corpus, with a baseline acoustic model trained on native American English. The proposed feature-based adaptation algorithm improved the average recognition accuracy by 15%, while MLLR model-based adaptation achieved an 11% improvement. The combined adaptation achieved an overall recognition accuracy improvement of 29.5% and a word error rate reduction of 31.8%.
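The abstract reports results as recognition accuracy and word error rate (WER) without stating whether the percentages are absolute or relative. For readers unfamiliar with the metric, the following is a minimal sketch of how WER is commonly computed (word-level edit distance divided by the reference length) and how a relative reduction would be derived from two WER values. It is not the authors' scoring tool; the function names `word_error_rate` and `relative_reduction` and the example sentences are hypothetical and chosen only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Fraction of the baseline WER removed by adaptation (assumed relative)."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - adapted_wer) / baseline_wer
```

Under this convention, a hypothesis that replaces one word of a three-word reference has a WER of one third, and lowering a WER from 0.4 to 0.3 is a relative reduction of about 25%. These numbers are illustrative only and are unrelated to the results reported in the paper.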
Cite as: Deng, Y., Li, X., Kwan, C., Xu, R., Raj, B., Stern, R.M., Williamson, D. (2006) An integrated approach to improve speech recognition rate for non-native speakers. Proc. Interspeech 2006, paper 1472-Wed2A2O.5, doi: 10.21437/Interspeech.2006-481
@inproceedings{deng06_interspeech,
  author={Y. Deng and X. Li and C. Kwan and R. Xu and B. Raj and Richard M. Stern and D. Williamson},
  title={{An integrated approach to improve speech recognition rate for non-native speakers}},
  year=2006,
  booktitle={Proc. Interspeech 2006},
  pages={paper 1472-Wed2A2O.5},
  doi={10.21437/Interspeech.2006-481}
}