Fourth Workshop on Child, Computer and Interaction (WOCCI 2014)

September 19, 2014

Improving Speech Recognition for Children using Acoustic Adaptation and Pronunciation Modeling

Prashanth Gurunath Shivakumar (1), Alexandros Potamianos (2), Sungbok Lee (1), Shrikanth Narayanan (1)

(1) Signal Analysis and Interpretation Laboratory, University of Southern California, Los Angeles, USA
(2) School of ECE, National Technical University of Athens, Athens, Greece

Developing a robust Automatic Speech Recognition (ASR) system for children is a challenging task because of increased variability in acoustic and linguistic correlates as a function of young age. The acoustic variability is mainly due to the developmental changes associated with vocal tract growth. On the linguistic side, the variability is associated with limited knowledge of vocabulary, pronunciations and other linguistic constructs. This paper presents a preliminary study towards better acoustic modeling, pronunciation modeling and front-end processing for children’s speech. Results are presented as a function of age. Speaker adaptation significantly reduces mismatch and variability, improving recognition results across age groups. In addition, the introduction of pronunciation modeling shows promising performance improvements.

Index Terms: automatic speech recognition, acoustic modeling, pronunciation modeling, acoustic adaptation, front-end features


Bibliographic reference. Shivakumar, Prashanth Gurunath / Potamianos, Alexandros / Lee, Sungbok / Narayanan, Shrikanth (2014): "Improving speech recognition for children using acoustic adaptation and pronunciation modeling", In WOCCI-2014, 15-19.