ISCA Archive Interspeech 2005

Data sampling for improved speech recognizer training

Takahiro Shinozaki, Mari Ostendorf, Les Atlas

Proper data selection for training a speech recognizer can be important for reducing the cost of developing systems for new tasks and of running exploratory experiments, and it is also useful for efficiently leveraging the increasingly large speech resources available for training large vocabulary systems. In this work, we investigate various sampling methods, comparing the likelihood criterion to new acoustic measures motivated by work in child language acquisition. The acoustic criteria can be used with or without preexisting transcriptions or models. When applied to the problem of selecting a small training set, the best results are obtained using modulation spectrum features and a discriminant function trained on child-directed vs. adult-directed speech. For large corpora, none of the methods outperforms random sampling, but reduced training costs are obtained by using multistage training and initializing with the small corpus.

doi: 10.21437/Interspeech.2005-551

Cite as: Shinozaki, T., Ostendorf, M., Atlas, L. (2005) Data sampling for improved speech recognizer training. Proc. Interspeech 2005, 1693-1696, doi: 10.21437/Interspeech.2005-551

  author={Takahiro Shinozaki and Mari Ostendorf and Les Atlas},
  title={{Data sampling for improved speech recognizer training}},
  booktitle={Proc. Interspeech 2005},