ISCA Archive ICSLP 1998

Statistical modeling of pronunciation and production variations for speech recognition

Filipp Korkmazskiy, Biing-Hwang Juang

In this paper, we propose a procedure for training a pronunciation network with criteria consistent with the optimality objectives for speech recognition systems. In particular, we describe a framework for using maximum likelihood (ML) and minimum classification error (MCE) criteria for pronunciation network optimization. The ML criterion is used to obtain an optimal structure for the pronunciation network based on statistically derived phonological rules. Discrimination among different pronunciation networks is achieved by weighting the networks, with the weights optimized under the MCE criterion. Experimental results demonstrate improvements in speech recognition accuracy after applying statistically derived phonological rules. It is shown that the impact of pronunciation network weighting on recognition performance depends on the size of the recognition vocabulary.


doi: 10.21437/ICSLP.1998-217

Cite as: Korkmazskiy, F., Juang, B.-H. (1998) Statistical modeling of pronunciation and production variations for speech recognition. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0345, doi: 10.21437/ICSLP.1998-217

@inproceedings{korkmazskiy98_icslp,
  author={Filipp Korkmazskiy and Biing-Hwang Juang},
  title={{Statistical modeling of pronunciation and production variations for speech recognition}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0345},
  doi={10.21437/ICSLP.1998-217}
}