As non-native speakers become more frequent users of speech recognition applications, increasing the tolerance of the system with respect to non-native pronunciation and language use is important and is currently the focus of research in a variety of contexts. Dictionary modification, acoustic model adaptation, and acoustic model manipulation are a few of the techniques that have been reported to be successful in improving recognition of non-native speech. In this paper, we address the specific case of Japanese-accented English, describing the lexical and acoustic modeling techniques that give the best recognizer performance. We find that automatically generated pronunciation variants perform as well as hand-coded "golden" variants in reducing recognizer error, and that a significant improvement in system performance can be achieved with acoustic models retrained on a small amount of accented data.
Cite as: Tomokiyo, L.M. (2000) Lexical and acoustic modeling of non-native speech in LVCSR. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 4, 346-349, doi: 10.21437/ICSLP.2000-821
@inproceedings{tomokiyo00c_icslp, author={Laura Mayfield Tomokiyo}, title={{Lexical and acoustic modeling of non-native speech in LVCSR}}, year=2000, booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)}, volume={4}, pages={346--349}, doi={10.21437/ICSLP.2000-821} }