ISCA Archive Interspeech 2009

On the development of matched and mismatched Italian children's speech recognition systems

Piero Cosi

While read-speech corpora, at least, are available for research on Italian children’s speech, many languages lack children’s speech corpora entirely. We propose that learning statistical mappings between the adult and child acoustic spaces from existing adult and children’s corpora may offer a future direction for generating children’s models for such data-deficient languages. This work describes recent advances in the development of the SONIC Italian children’s speech recognition system. Completing an earlier study, it was conducted with the specific goal of integrating the newly trained children’s speech recognition models into the Italian version of the Colorado Literacy Tutor platform. Specifically, children’s speech recognition research for Italian was conducted using the complete training and test sets of the FBK (formerly ITC-irst) Italian Children’s Speech Corpus (ChildIt). Using the University of Colorado SONIC LVSR system, we demonstrate a phonetic recognition error rate of 12.0% for a system that incorporates Vocal Tract Length Normalization (VTLN), Speaker-Adaptive Trained phonetic models, and unsupervised Structural MAP Linear Regression (SMAPLR).


doi: 10.21437/Interspeech.2009-195

Cite as: Cosi, P. (2009) On the development of matched and mismatched Italian children's speech recognition systems. Proc. Interspeech 2009, 540-543, doi: 10.21437/Interspeech.2009-195

@inproceedings{cosi09_interspeech,
  author={Piero Cosi},
  title={{On the development of matched and mismatched Italian children's speech recognition systems}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={540--543},
  doi={10.21437/Interspeech.2009-195}
}