Fourth Workshop on Child, Computer and Interaction (WOCCI 2014)
Automatically recognising children's speech is a very difficult task. This difficulty can be attributed to the high variability in children's speech, both within and across speakers. The variability is due to developmental changes in children's anatomy, speech production skills, et cetera, and manifests itself, for example, in fundamental and formant frequencies, the frequency of disfluencies, and pronunciation quality. In this paper, we report the results of acoustic and auditory analyses of 3-10-year-old European Portuguese children's speech. Furthermore, we are able to correlate some of the pronunciation error patterns revealed by our analyses, such as the truncation of consonant clusters, with the errors made by a children's speech recogniser trained on speech collected from the same age group. Other pronunciation error patterns seem to have little or no impact on speech recognition performance. In future work, we will attempt to use our findings to improve the performance of our recogniser.
Index Terms: automatic speech recognition, children's speech, acoustic analysis, auditory analysis, error analysis, European Portuguese, pronunciation quality
Bibliographic reference. Hämäläinen, Annika / Candeias, Sara / Cho, Hyongsil / Meinedo, Hugo / Abad, Alberto / Pellegrini, Thomas / Tjalve, Michael / Trancoso, Isabel / Dias, Miguel Sales (2014): "Correlating ASR errors with developmental changes in speech production: a study of 3-10-year-old European Portuguese children's speech", In WOCCI-2014, 7-13.