In speech recognition, when a system built for one application is used for another, or for a different population of users, large amounts of data and engineering effort are needed to "adapt" it to its new use. Much recent work has centered on reducing that effort. This paper concerns the change from an adult to a child population of users in a system that pinpoints pronunciation errors in English. It first discusses children's speech production. It then describes an adaptation approach that combines relatively small amounts of data with minimal recognizer changes, aiming for a system that pinpoints errors as well for children's speech as it does for adults' speech.
The precision of the adult system was first tested on children's speech. The open-source SPHINX recognizer was then tested on children's speech, and tests were run, using a variety of parameters, that compared the precision of automatic pinpointing of recognition errors to human tutor pinpointing of errors. The parameters tested, the test conditions, and the results are discussed.
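As a rough illustration of the comparison described above, the following sketch computes precision for automatically flagged errors against a human tutor's flags. The abstract does not say how precision was computed, so this is an assumption: the tutor's judgments are treated as the reference, each error is identified by a hypothetical segment index, and the function name `pinpointing_precision` is invented for this example.

```python
def pinpointing_precision(auto_flags, tutor_flags):
    """Fraction of automatically flagged segments that the tutor also flagged.

    Both arguments are sets of segment indices (a hypothetical representation;
    the paper's actual unit of comparison is not given in the abstract).
    """
    if not auto_flags:
        # No automatic flags: precision is undefined; report 0.0 by convention here.
        return 0.0
    agreed = auto_flags & tutor_flags  # segments flagged by both system and tutor
    return len(agreed) / len(auto_flags)


# Made-up example: the system flags segments 1, 3, 5; the tutor flags 3, 5, 8.
auto = {1, 3, 5}
tutor = {3, 5, 8}
print(pinpointing_precision(auto, tutor))
```

Under these assumptions, two of the three automatic flags agree with the tutor, so the value is 2/3. A different unit of comparison (words rather than segments, for instance) would change only how the index sets are built.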
Cite as: Eskenazi, M., Pelton, G. (2002) Pinpointing pronunciation errors in children's speech: examining the role of the speech recognizer. Proc. ITRW on Pronunciation Modeling and Lexicon Adaptation for Spoken Language Technology (PMLA 2002), 48-52
@inproceedings{eskenazi02_pmla,
  author={Maxine Eskenazi and Gary Pelton},
  title={{Pinpointing pronunciation errors in children's speech: examining the role of the speech recognizer}},
  year=2002,
  booktitle={Proc. ITRW on Pronunciation Modeling and Lexicon Adaptation for Spoken Language Technology (PMLA 2002)},
  pages={48--52}
}