Improving Children’s Speech Recognition Through Out-of-Domain Data Augmentation

Joachim Fainberg, Peter Bell, Mike Lincoln, Steve Renals


Children’s speech poses challenges to speech recognition due to strong age-dependent anatomical variations and a lack of large, publicly available corpora. In this paper we explore data augmentation for children’s speech recognition using stochastic feature mapping (SFM) to transform out-of-domain adult data for both GMM-based and DNN-based acoustic models. We performed experiments on the English PF-STAR corpus, augmenting with WSJCAM0 and ABI. Our experimental results indicate that a DNN acoustic model for children’s speech can make use of adult data, and that out-of-domain SFM is more accurate than in-domain SFM.


DOI: 10.21437/Interspeech.2016-1348

Cite as

Fainberg, J., Bell, P., Lincoln, M., Renals, S. (2016) Improving Children’s Speech Recognition Through Out-of-Domain Data Augmentation. Proc. Interspeech 2016, 1598-1602.

BibTeX
@inproceedings{Fainberg+2016,
author={Joachim Fainberg and Peter Bell and Mike Lincoln and Steve Renals},
title={Improving Children’s Speech Recognition Through Out-of-Domain Data Augmentation},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-1348},
url={http://dx.doi.org/10.21437/Interspeech.2016-1348},
pages={1598--1602}
}